TY - GEN
T1 - Transfer of corpus-specific dialogue act annotation to ISO standard
T2 - 10th International Conference on Language Resources and Evaluation, LREC 2016
AU - Chowdhury, Shammur Absar
AU - Stepanov, Evgeny A.
AU - Riccardi, Giuseppe
PY - 2016
Y1 - 2016
N2 - Spoken conversation corpora often adapt existing Dialogue Act (DA) annotation specifications, such as DAMSL, DIT++, etc., to task specific needs, yielding incompatible annotations; thus, limiting corpora re-usability. Recently accepted ISO standard for DA annotation - Dialogue Act Markup Language (DiAML) - is designed as domain and application independent. Moreover, the clear separation of dialogue dimensions and communicative functions, coupled with the hierarchical organization of the latter, allows for classification at different levels of granularity. However, re-annotating existing corpora with the new scheme might require significant effort. In this paper we test the utility of the ISO standard through comparative evaluation of the corpus-specific legacy and the semi-automatically transferred DiAML DA annotations on supervised dialogue act classification task. To test the domain independence of the resulting annotations, we perform cross-domain and data aggregation evaluation. Compared to the legacy annotation scheme, on the Italian LUNA Human-Human corpus, the DiAML annotation scheme exhibits better cross-domain and data aggregation classification performance, while maintaining comparable in-domain performance.
AB - Spoken conversation corpora often adapt existing Dialogue Act (DA) annotation specifications, such as DAMSL, DIT++, etc., to task specific needs, yielding incompatible annotations; thus, limiting corpora re-usability. Recently accepted ISO standard for DA annotation - Dialogue Act Markup Language (DiAML) - is designed as domain and application independent. Moreover, the clear separation of dialogue dimensions and communicative functions, coupled with the hierarchical organization of the latter, allows for classification at different levels of granularity. However, re-annotating existing corpora with the new scheme might require significant effort. In this paper we test the utility of the ISO standard through comparative evaluation of the corpus-specific legacy and the semi-automatically transferred DiAML DA annotations on supervised dialogue act classification task. To test the domain independence of the resulting annotations, we perform cross-domain and data aggregation evaluation. Compared to the legacy annotation scheme, on the Italian LUNA Human-Human corpus, the DiAML annotation scheme exhibits better cross-domain and data aggregation classification performance, while maintaining comparable in-domain performance.
KW - DiAML
KW - Dialogue act
KW - Dialogue corpora
KW - ISO standard
KW - Speech
UR - http://www.scopus.com/inward/record.url?scp=84994380801&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84994380801
T3 - Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
SP - 132
EP - 135
BT - Proceedings of the 10th International Conference on Language Resources and Evaluation, LREC 2016
A2 - Calzolari, Nicoletta
A2 - Choukri, Khalid
A2 - Mazo, Helene
A2 - Moreno, Asuncion
A2 - Declerck, Thierry
A2 - Goggi, Sara
A2 - Grobelnik, Marko
A2 - Odijk, Jan
A2 - Piperidis, Stelios
A2 - Maegaard, Bente
A2 - Mariani, Joseph
PB - European Language Resources Association (ELRA)
Y2 - 23 May 2016 through 28 May 2016
ER -