Cross-language transfer of semantic annotation via targeted crowdsourcing

Shammur Absar Chowdhury*, Arindam Ghosh, Evgeny A. Stepanov, Ali Orkan Bayer, Giuseppe Riccardi, Ioannis Klasinas

*Corresponding author for this work

Research output: Contribution to journalConference articlepeer-review

10 Citations (Scopus)

Abstract

The development of a natural language speech application requires the process of semantic annotation. Moreover multilingual porting of speech applications increases the cost and complexity of the annotation task. In this paper we address the problem of transferring the semantic annotation of the source language corpus to a low-resource target language via crowdsourcing. The current crowdsourcing approach faces several problems. First, the available crowdsourcing platforms have skewed distribution of language speakers. Second, speech applications require domain-specific knowledge. Third, the lack of reference target language annotation, makes crowdsourcing worker control very difficult. In this paper we address these issues on the task of cross-language transfer of domain-specific semantic annotation from an Italian spoken language corpus to Greek, via targeted crowdsourcing. The issue of domain knowledge transfer is addressed by priming the workers with the source language concepts. The lack of reference annotation is coped with a consensus-based annotation algorithm. The quality of annotation transfer is assessed using source language references and inter-annotator agreement. We demonstrate that the proposed computational methodology is viable and achieves acceptable annotation quality.

Original languageEnglish
Pages (from-to)2108-2112
Number of pages5
JournalProceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication statusPublished - 2014
Externally publishedYes
Event15th Annual Conference of the International Speech Communication Association: Celebrating the Diversity of Spoken Languages, INTERSPEECH 2014 - Singapore, Singapore
Duration: 14 Sept 201418 Sept 2014

Keywords

  • Annotation
  • Crowdsourcing
  • Porting
  • Spoken language understanding

Fingerprint

Dive into the research topics of 'Cross-language transfer of semantic annotation via targeted crowdsourcing'. Together they form a unique fingerprint.

Cite this