QCRI advanced transcription system (QATS) for the Arabic Multi-Dialect Broadcast media recognition: MGB-2 challenge

Sameer Khurana, Ahmed Ali

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

32 Citations (Scopus)

Abstract

In this paper, we describe the Qatar Computing Research Institute's (QCRI) speech transcription system for the 2016 Dialectal Arabic Multi-Genre Broadcast (MGB-2) challenge. MGB-2 is a controlled evaluation using 1,200 hours of audio with lightly supervised transcription. Our system, a combination of three purely sequence-trained recognition systems, achieved the lowest word error rate (WER), 14.2%, among the nine participating teams. Key features of our transcription system are: purely sequence-trained acoustic models using the recently introduced Lattice-Free Maximum Mutual Information (LF-MMI) modeling framework; language model rescoring using four-gram and Recurrent Neural Network with Max-Ent connections (RNNME) language models; and system combination using the Minimum Bayes Risk (MBR) decoding criterion. The whole system is built using the Kaldi speech recognition toolkit.
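The abstract reports its result as a word error rate. As a rough illustration of how that metric is conventionally defined (word-level edit distance divided by the number of reference words), here is a minimal sketch. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example; this is not the scoring procedure used in the MGB-2 evaluation, which may apply its own text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words.

    Illustrative only: tokens are split on whitespace, with no text
    normalization of the kind an official scoring tool might apply.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring the hypothesis `"a x c"` against the reference `"a b c d"` counts one substitution and one deletion over four reference words, giving a WER of 0.5.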

Original language: English
Title of host publication: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 292-298
Number of pages: 7
ISBN (Electronic): 9781509049035
Publication status: Published - 7 Feb 2017
Event: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - San Diego, United States
Duration: 13 Dec 2016 to 16 Dec 2016

Publication series

Name: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016 - Proceedings

Conference

Conference: 2016 IEEE Workshop on Spoken Language Technology, SLT 2016
Country/Territory: United States
City: San Diego
Period: 13/12/16 to 16/12/16

Keywords

  • Arabic Speech Recognition
  • Bi-directional LSTM
  • Kaldi
  • Purely sequence trained acoustic models
  • QATS
  • RNN LM
