TY - JOUR
T1 - Estimating Uniqueness of I-Vector-Based Representation of Human Voice
AU - Tandogan, Sinan E.
AU - Sencar, Husrev Taha
N1 - Publisher Copyright: © 2005-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - We study the individuality of the human voice with respect to a widely used feature representation of speech utterances, namely, the i-vector model. As a first step toward this goal, we compare and contrast uniqueness measures proposed for different biometric modalities. Then, we introduce a new uniqueness measure that evaluates the entropy of i-vectors while taking into account speaker-level variations. Our measure operates in the discrete feature space and relies on accurate estimation of the distribution of i-vectors. Therefore, i-vectors are quantized while ensuring that both the quantized and original representations yield similar speaker verification performance. Uniqueness estimates are obtained from two newly generated datasets and the public VoxCeleb dataset. The first custom dataset contains more than one and a half million speech samples of 20,741 speakers obtained from TEDx Talks videos. The second one includes over twenty-one thousand speech samples from 1,595 actors, extracted from movie dialogues. Using this data, we analyze how several factors, such as the number of speakers, number of samples per speaker, sample durations, and diversity of utterances, affect uniqueness estimates. Most notably, we determine that the discretization of i-vectors does not cause a reduction in speaker recognition performance. Our results show that the degree of distinctiveness offered by i-vector-based representation may reach 43-70 bits considering 5-second-long speech samples; however, under less constrained variations in speech, uniqueness estimates are found to decrease by around 30 bits. We also find that doubling the sample duration increases the distinctiveness of the i-vector representation by around 20 bits.
KW - Biometrics
KW - distinctiveness of a modality
KW - i-vector
KW - speaker recognition
KW - uniqueness estimation
UR - http://www.scopus.com/inward/record.url?scp=85104271002&partnerID=8YFLogxK
U2 - 10.1109/TIFS.2021.3071574
DO - 10.1109/TIFS.2021.3071574
M3 - Article
AN - SCOPUS:85104271002
SN - 1556-6013
VL - 16
SP - 3054
EP - 3067
JO - IEEE Transactions on Information Forensics and Security
JF - IEEE Transactions on Information Forensics and Security
M1 - 9404191
ER -