Abstract
Natural human-computer interaction requires, in addition to understanding what the speaker is saying, the recognition of behavioral descriptors such as the speaker's personality traits (SPTs). The complexity of this problem stems from the high variability and dimensionality of the acoustic, lexical and situational context manifestations of the SPTs. In this paper, we present a comparative study of automatic speaker personality trait recognition from speech corpora that differ in source speaking style (broadcast news vs. conversational) and experimental context. To address the high dimensionality, we evaluated feature selection algorithms, such as information gain and Relief, together with ensemble classification methods. We trained and evaluated ensemble methods to leverage base learners, using three different algorithms: SMO (Sequential Minimal Optimization for Support Vector Machines), RF (Random Forest) and AdaBoost. We then combined them using majority voting and stacking. Our study shows that the performance of the system benefits greatly from feature selection and ensemble methods across corpora.
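As a rough illustration of two of the techniques named in the abstract, the sketch below ranks discrete features by information gain and combines classifier outputs by majority voting. It is not the authors' pipeline: the feature names (`pitch_bin`, `energy_bin`), the toy labels, and the hard-coded classifier predictions are hypothetical, and it assumes features have already been discretized. Relief and stacking are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after
    splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    ranked = sorted(columns,
                    key=lambda name: information_gain(columns[name], labels),
                    reverse=True)
    return ranked[:k]


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample."""
    return [Counter(sample).most_common(1)[0][0]
            for sample in zip(*predictions)]


# Hypothetical toy data: four utterances, a binary trait label,
# and two already-discretized features.
labels = ["high", "high", "low", "low"]
columns = {
    "pitch_bin": ["a", "a", "b", "b"],   # separates the two classes
    "energy_bin": ["x", "y", "x", "y"],  # carries no class information
}
top = select_top_k(columns, labels, k=1)

# Hypothetical predictions from three base classifiers on the same samples.
votes = majority_vote([
    ["high", "high", "low", "low"],
    ["high", "low", "low", "low"],
    ["low", "high", "low", "high"],
])
```

With three voters and two classes a tie cannot occur; for an even number of voters, `Counter.most_common` would break ties by first occurrence, so a real system would need an explicit tie-breaking rule.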
Original language | English |
---|---|
Pages (from-to) | 2851-2855 |
Number of pages | 5 |
Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Publication status | Published - 2013 |
Externally published | Yes |
Event | 14th Annual Conference of the International Speech Communication Association, INTERSPEECH 2013 - Lyon, France; Duration: 25 Aug 2013 → 29 Aug 2013 |
Keywords
- Ensemble methods
- Information gain
- Relief
- Speaker personality trait recognition