General General

Machine-learning of long-range sound propagation through simulated atmospheric turbulence.

In The Journal of the Acoustical Society of America

Conventional numerical methods can capture the inherent variability of long-range outdoor sound propagation. However, their computational memory and time requirements are high. In contrast, machine-learning models provide very fast predictions by learning from experimental observations or surrogate data. Yet, it is unknown what type of surrogate data is most suitable for machine learning. This study used a Crank-Nicolson parabolic equation (CNPE) to generate the surrogate data. The CNPE input data were sampled by the Latin hypercube technique. Two separate datasets comprised 5000 samples of model input. The first dataset consisted of transmission loss (TL) fields for single realizations of turbulence. The second dataset consisted of average TL fields for 64 realizations of turbulence. Three machine-learning algorithms were applied to each dataset, namely, ensemble decision trees, neural networks, and cluster-weighted models. Observational data come from a long-range (out to 8 km) sound propagation experiment. In comparison to the experimental observations, regression predictions have a median absolute error of 5-7 dB. Surrogate data quality depends on an accurate characterization of refractive and scattering conditions. Predictions obtained through a single realization of turbulence agree better with the experimental observations.

Hart Carl R, Wilson D Keith, Pettit Chris L, Nykaza Edward T

2021-Jun
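
The abstract above says the CNPE inputs were drawn by Latin hypercube sampling. As a minimal sketch of that sampling idea only (not the authors' code), the snippet below splits each input range into as many strata as there are samples and places exactly one sample in every stratum of every dimension. The parameter names and ranges in `BOUNDS` are hypothetical and are not taken from the paper.

```python
import random

def latin_hypercube(n_samples, n_dims, rng):
    """Return n_samples points in [0, 1)^n_dims, one per stratum in every dimension."""
    columns = []
    for _ in range(n_dims):
        # Split [0, 1) into n_samples equal strata and visit them in random order.
        strata = list(range(n_samples))
        rng.shuffle(strata)
        # Draw one uniform point inside each stratum.
        columns.append([(s + rng.random()) / n_samples for s in strata])
    # Transpose the per-dimension columns into a list of sample points.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]

def scale(point, bounds):
    """Map a unit-cube point onto physical parameter ranges."""
    return tuple(lo + u * (hi - lo) for u, (lo, hi) in zip(point, bounds))

# Hypothetical input ranges (assumptions): frequency in Hz, source height in m.
BOUNDS = [(50.0, 500.0), (0.5, 5.0)]

rng = random.Random(0)
unit_samples = latin_hypercube(8, len(BOUNDS), rng)
samples = [scale(p, BOUNDS) for p in unit_samples]
```

Compared with plain uniform random sampling, this stratification guarantees that every part of each input range is visited, which is why it is a common choice when each model run is expensive.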

General General

Robust North Atlantic right whale detection using deep learning models for denoising.

In The Journal of the Acoustical Society of America

This paper proposes a robust system for detecting North Atlantic right whales by using deep learning methods to denoise noisy recordings. Passive acoustic recordings of right whale vocalisations are subject to noise contamination from many sources, such as shipping and offshore activities. When such data are applied to uncompensated classifiers, accuracy falls substantially. To build robustness into the detection process, two separate approaches that have proved successful for image denoising are considered. Specifically, a denoising convolutional neural network and a denoising autoencoder, each of which is applied to spectrogram representations of the noisy audio signal, are developed. Performance is improved further by matching the classifier training to include the vestigial signal that remains in clean estimates after the denoising process. Evaluations are performed first by adding white, tanker, trawler, and shot noises at signal-to-noise ratios from -10 to +5 dB to clean recordings to simulate noisy conditions. Experiments show that denoising gives substantial improvements to accuracy, particularly when using the vestigial-trained classifier. A final test applies the proposed methods to previously unseen noisy right whale recordings and finds that denoising is able to improve performance over the baseline clean-trained model in this new noise environment.

Vickers William, Milner Ben, Risch Denise, Lee Robert

2021-Jun
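
The evaluation described above adds noise to clean recordings at signal-to-noise ratios from -10 to +5 dB. A minimal sketch of how such a mixture can be built follows, assuming SNR is defined as the ratio of mean signal power to mean noise power over the whole clip. The sine tone and Gaussian noise are stand-ins for the whale calls and the tanker, trawler, or shot noise, which are not reproduced here.

```python
import math
import random

def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(clean, noise, target_db):
    """Scale `noise` so the clean-to-noise power ratio equals target_db, then add it."""
    target_noise_power = power(clean) / (10.0 ** (target_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * v for v in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled

def measure_snr_db(clean, noise):
    """SNR in dB from the separate clean and noise components."""
    return 10.0 * math.log10(power(clean) / power(noise))

rng = random.Random(0)
fs = 2000  # sample rate in Hz (assumed for this toy example)
clean = [math.sin(2 * math.pi * 100 * t / fs) for t in range(fs)]
white = [rng.gauss(0.0, 1.0) for _ in range(fs)]

# Build one noisy clip per target SNR and record the SNR actually achieved.
achieved = {}
for target in (-10, -5, 0, 5):
    noisy, scaled = mix_at_snr(clean, white, target)
    achieved[target] = measure_snr_db(clean, scaled)
```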

General General

BeamLearning: An end-to-end deep learning approach for the angular localization of sound sources using raw multichannel acoustic pressure data.

In The Journal of the Acoustical Society of America

Sound source localization using multichannel signal processing has been a subject of active research for decades. In recent years, the use of deep learning in audio signal processing has significantly improved performance in machine hearing. This has motivated the scientific community to also develop machine learning strategies for source localization applications. This paper presents BeamLearning, a multiresolution deep learning approach that allows the encoding of relevant information contained in unprocessed time-domain acoustic signals captured by microphone arrays. The use of raw data aims at avoiding the simplifying hypotheses on which most traditional model-based localization methods rely. Its benefits are shown for real-time two-dimensional sound source localization tasks in reverberant and noisy environments. Since supervised machine learning approaches require large, physically realistic, precisely labelled datasets, a fast graphics processing unit-based computation of room impulse responses was developed using fractional delays for image source models. A thorough analysis of the network representation and extensive performance tests are carried out using the BeamLearning network with synthetic and experimental datasets. The results demonstrate that the BeamLearning approach significantly outperforms the wideband MUSIC and steered response power-phase transform methods in terms of localization accuracy and computational efficiency in the presence of heavy measurement noise and reverberation.

Pujol Hadrien, Bavu Éric, Garcia Alexandre

2021-Jun
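
The BeamLearning abstract mentions room impulse responses computed with fractional delays for image source models. The sketch below is a much-reduced illustration of why the delays are fractional, not the GPU implementation described in the paper: it mirrors a source across the four walls of a 2-D rectangular room and converts each path length into a delay in samples, which is generally not an integer. The room size, positions, sample rate, and speed of sound are all assumed values.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C

def first_order_images(src, room):
    """Mirror a 2-D source across the four walls of a rectangular room.

    `room` is (Lx, Ly), with walls at x=0, x=Lx, y=0 and y=Ly.
    """
    x, y = src
    lx, ly = room
    return [(-x, y), (2 * lx - x, y), (x, -y), (x, 2 * ly - y)]

def delay_in_samples(src, mic, fs):
    """Propagation delay from src to mic in samples; generally not an integer."""
    return math.dist(src, mic) / SPEED_OF_SOUND * fs

room = (5.0, 4.0)    # metres (assumed)
source = (1.0, 1.0)  # metres (assumed)
mic = (4.0, 1.0)     # metres (assumed)
fs = 16000           # Hz (assumed)

# Direct path first, then the four first-order reflections.
paths = [source] + first_order_images(source, room)
delays = [delay_in_samples(p, mic, fs) for p in paths]
```

Rounding these delays to whole samples would shift each reflection by up to half a sample, which is the error a fractional-delay filter is meant to avoid.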

General General

Unsupervised analysis of background noise sources in active offices.

In The Journal of the Acoustical Society of America

Inside open-plan offices, background noise affects the workers' comfort, influencing their productivity. Recent approaches identify three main source categories: mechanical sources (air conditioning equipment, office devices, etc.), outdoor traffic noise, and human sources (speech). Whereas the first two groups are taken into account by technical specifications, human noise is still often neglected. The present paper proposes two procedures, based on machine-learning techniques, to identify the human and mechanical noise sources during working hours. Two unsupervised clustering methods, specifically the Gaussian mixture model and K-means clustering, were used to separate the recorded sound pressure levels while finding the candidate models. Clustering validation was then used to find the number of sound sources within the office, and statistical and metrical features were used to label the sources. The results were compared with the common parameters used in noise monitoring in offices, i.e., the equivalent continuous and 90th percentile levels. The spectra obtained by the two algorithms match the expected shapes of human speech and mechanical noise. The outcomes validate the robustness and reliability of these procedures.

De Salvio Domenico, D’Orazio Dario, Garai Massimo

2021-Jun
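
One of the two clustering methods named in the office-noise abstract is K-means. A minimal sketch of Lloyd's algorithm on scalar sound pressure levels is given below, assuming two clusters and a simple deterministic initialization; the level values are made up for illustration, and the paper's actual features, validation step, and labelling are not reproduced.

```python
def kmeans_1d(levels, k=2, n_iter=50):
    """Lloyd's algorithm on scalar values (requires k >= 2).

    Centroids start evenly spread between the smallest and largest value.
    """
    lo, hi = min(levels), max(levels)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(n_iter):
        # Assignment step: each level joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in levels:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = updated
    return centroids, clusters

# Made-up short-term levels in dB: a quieter group and a louder group.
spl = [41.0, 42.5, 40.2, 43.1, 58.3, 60.1, 61.7, 59.4]
centroids, clusters = kmeans_1d(spl, k=2)
```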

General General

Meta-learning-aided orthogonal frequency division multiplexing for underwater acoustic communications.

In The Journal of the Acoustical Society of America

This paper proposes a meta-learning-based underwater acoustic (UWA) orthogonal frequency division multiplexing (OFDM) system to deal with environment mismatch in real-world UWA applications. The system can effectively adapt the model from a given UWA environment to a new UWA environment with a relatively small amount of data. With meta-learning, multiple UWA environments are treated as multiple UWA tasks: a meta-training strategy learns a robust model from previously observed tasks, and that model can be adapted quickly to an unknown UWA environment with only a small number of updates. Experiments with the at-sea-measured WATERMARK dataset and a lake trial indicate that, compared with the traditional UWA-OFDM system and the conventional machine learning-based framework, the proposed method shows better bit error rate performance and stronger learning ability under various UWA scenarios.

Zhang Yonglin, Wang Haibin, Li Chao, Chen Desheng, Meriaudeau Fabrice

2021-Jun

General General

Matrix analysis for fast learning of neural networks with application to the classification of acoustic spectra.

In The Journal of the Acoustical Society of America

Neural networks are increasingly being applied to problems in acoustics and audio signal processing. Large audio datasets are being generated for use in training machine learning algorithms, and the reduction of training times is of increasing relevance. The work presented here begins by reformulating the analysis of the classical multilayer perceptron to show the explicit dependence of network parameters on the properties of the weight matrices in the network. This analysis then allows the application of the singular value decomposition (SVD) to the weight matrices. An algorithm is presented that makes use of regular applications of the SVD to progressively reduce the dimensionality of the network. This results in significant reductions in network training times of up to 50% with very little or no loss in accuracy. The use of the algorithm is demonstrated by applying it to a number of acoustical classification problems that help quantify the extent to which closely related spectra can be distinguished by machine learning.

Paul Vlad S, Nelson Philip A

2021-Jun
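
As a rough illustration of why applying the SVD to a weight matrix can shrink a network, the standard low-rank argument is sketched below. This is the generic truncated-SVD result, not necessarily the exact algorithm of the paper. The symbols are assumptions introduced here: $W \in \mathbb{R}^{m \times n}$ is one weight matrix, $\sigma_1 \ge \sigma_2 \ge \dots$ are its singular values, and $k$ is the number of singular values kept.

```latex
W = U \Sigma V^{\mathsf{T}}, \qquad
W \approx W_k = U_k \Sigma_k V_k^{\mathsf{T}}, \qquad
\lVert W - W_k \rVert_2 = \sigma_{k+1}

\text{parameters: } \underbrace{mn}_{\text{full } W}
\quad \text{versus} \quad
\underbrace{k(m+n)}_{\text{factors } U_k \Sigma_k \text{ and } V_k^{\mathsf{T}}},
\qquad \text{a saving whenever } k < \frac{mn}{m+n}
```

If the discarded singular values are small, the truncated layer behaves almost identically while needing fewer multiplications per training step, which is consistent with the reduced training times reported in the abstract.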