TY - JOUR
T1 - An intrusive method for estimating speech intelligibility from noisy and distorted signals
AU - Mamun, Nursadul
AU - Zilany, Muhammad S.A.
AU - Hansen, John H.L.
AU - Davies-Venn, Evelyn E.
N1 - Publisher Copyright: © 2021 Acoustical Society of America.
PY - 2021/9/1
Y1 - 2021/9/1
AB - An objective metric that predicts speech intelligibility under different types of noise and distortion would be desirable in voice communication. To date, the majority of studies concerning speech intelligibility metrics have focused on predicting the effects of individual noise or distortion mechanisms. This study proposes an objective metric, the spectrogram orthogonal polynomial measure (SOPM), that attempts to predict speech intelligibility for people with normal hearing under adverse conditions. The SOPM metric is developed by extracting features from the spectrogram using Krawtchouk moments. The metric's performance is evaluated for several types of noise (steady-state and fluctuating noise), distortions (peak clipping, center clipping, and phase jitters), ideal time-frequency segregation, and reverberation conditions both in quiet and noisy environments. High correlation (0.97-0.996) is achieved with the proposed metric when evaluated with subjective scores by normal-hearing subjects under various conditions.
UR - http://www.scopus.com/inward/record.url?scp=85114964320&partnerID=8YFLogxK
U2 - 10.1121/10.0005899
DO - 10.1121/10.0005899
M3 - Article
C2 - 34598625
AN - SCOPUS:85114964320
SN - 0001-4966
VL - 150
SP - 1762
EP - 1778
JO - Journal of the Acoustical Society of America
JF - Journal of the Acoustical Society of America
IS - 3
ER -