TY - JOUR
T1 - Non-negative factor analysis of Gaussian mixture model weight adaptation for language and dialect recognition
AU - Bahari, Mohamad Hasan
AU - Dehak, Najim
AU - Van Hamme, Hugo
AU - Burget, Lukas
AU - Ali, Ahmed M.
AU - Glass, Jim
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014/7/1
Y1 - 2014/7/1
AB - Recent studies show that Gaussian mixture model (GMM) weights carry less, yet complementary, information to GMM means for language and dialect recognition. However, state-of-the-art language recognition systems usually do not use this information. In this research, a non-negative factor analysis (NFA) approach is developed for GMM weight decomposition and adaptation. This modeling, which is conceptually simple and computationally inexpensive, suggests a new low-dimensional utterance representation method using a factor analysis similar to that of the i-vector framework. The obtained subspace vectors are then applied in conjunction with i-vectors to the language/dialect recognition problem. The suggested approach is evaluated on the NIST 2011 and RATS language recognition evaluation (LRE) corpora and on the QCRI Arabic dialect recognition evaluation (DRE) corpus. The assessment results show that the proposed adaptation method yields more accurate recognition results than three conventional weight adaptation approaches, namely maximum likelihood re-estimation, non-negative matrix factorization, and a subspace multinomial model. Experimental results also show that the intermediate-level fusion of i-vectors and NFA subspace vectors improves the performance of the state-of-the-art i-vector framework, especially for short utterances.
KW - Dialect recognition
KW - Gaussian mixture model weight
KW - Language recognition
KW - Model adaptation
KW - Non-negative factor analysis
UR - http://www.scopus.com/inward/record.url?scp=84904156635&partnerID=8YFLogxK
U2 - 10.1109/TASLP.2014.2319159
DO - 10.1109/TASLP.2014.2319159
M3 - Article
AN - SCOPUS:84904156635
SN - 1558-7916
VL - 22
SP - 1117
EP - 1129
JO - IEEE Transactions on Audio, Speech, and Language Processing
JF - IEEE Transactions on Audio, Speech, and Language Processing
IS - 7
M1 - 2319159
ER -