Comparison of ideal mask-based speech enhancement algorithms for speech mixed with white noise at low mixture signal-to-noise ratios



Graetzer, Simone and Hopkins, Carl ORCID: 0000-0002-9716-0793
(2022) Comparison of ideal mask-based speech enhancement algorithms for speech mixed with white noise at low mixture signal-to-noise ratios. The Journal of the Acoustical Society of America, 152 (6). pp. 3458-3470.

Abstract

The literature shows that the intelligibility of noisy speech can be improved by applying an ideal binary or soft gain mask in the time-frequency domain for signal-to-noise ratios (SNRs) between −10 and +10 dB. In this study, two mask-based algorithms are compared when applied to speech mixed with white Gaussian noise (WGN) at lower SNRs, that is, SNRs from −29 to −5 dB. These comprise an Ideal Binary Mask (IBM) with a Local Criterion (LC) set to 0 dB and an Ideal Ratio Mask (IRM). The performance of three intrusive Short-Time Objective Intelligibility (STOI) variants, namely STOI, STOI+, and Extended Short-Time Objective Intelligibility (ESTOI), is compared with that of other monaural intelligibility metrics that can be used before and after mask-based processing. The results show that IRMs can be used to obtain near maximal speech intelligibility (>90% for sentence material) even at very low mixture SNRs, while IBMs with LC = 0 provide limited intelligibility gains for SNR < −14 dB. It is also shown that, unlike STOI, STOI+ and ESTOI are suitable metrics for speech mixed with WGN at low SNRs and processed by IBMs with LC = 0 even when speech is high-pass filtered to flatten the spectral tilt before masking.
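
To make the two mask types concrete, the following is a minimal sketch of how an IBM and an IRM are commonly computed from the separate speech and noise power in each time-frequency unit. It is not taken from the paper. The function names, the strict "greater than LC" comparison, the IRM exponent `beta = 0.5`, and the small constant `eps` are all assumptions made for illustration, and the exact definitions used in the article may differ.

```python
import numpy as np


def local_snr_db(speech_power, noise_power, eps=1e-12):
    """Local SNR in dB for each time-frequency unit (eps avoids log of zero)."""
    return 10.0 * np.log10((speech_power + eps) / (noise_power + eps))


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Assumed IBM: 1 where the local SNR exceeds the Local Criterion, else 0."""
    return (local_snr_db(speech_power, noise_power) > lc_db).astype(float)


def ideal_ratio_mask(speech_power, noise_power, beta=0.5, eps=1e-12):
    """Assumed IRM: soft gain (S / (S + N)) ** beta, bounded between 0 and 1."""
    return (speech_power / (speech_power + noise_power + eps)) ** beta


# Toy 2 x 2 "spectrogram" of speech power against unit noise power.
# These values are invented for illustration only.
speech_power = np.array([[4.0, 0.25], [1.0, 0.0]])
noise_power = np.ones((2, 2))

ibm = ideal_binary_mask(speech_power, noise_power)  # hard 0/1 decisions
irm = ideal_ratio_mask(speech_power, noise_power)   # graded gains
```

In a typical pipeline, either mask would be multiplied element-wise with the short-time spectrum of the noisy mixture before resynthesis. The binary mask discards every unit at or below the criterion, whereas the ratio mask only attenuates it, which is one way to picture why a soft mask can retain more speech information at very low SNRs.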

Item Type: Article
Uncontrolled Keywords: Speech Intelligibility, Perceptual Masking, Speech Perception, Algorithms, Signal-To-Noise Ratio
Depositing User: Symplectic Admin
Date Deposited: 20 Dec 2022 08:39
Last Modified: 21 Aug 2023 02:25
DOI: 10.1121/10.0016494
Open Access URL: https://asa.scitation.org/doi/10.1121/10.0016494
URI: https://livrepository.liverpool.ac.uk/id/eprint/3166721