FarSpeech: Arabic natural language processing for live Arabic speech

Mohamed Eldesouki, Naassih Gopee, Ahmed Ali, Kareem Darwish

Research output: Contribution to journal › Conference article › peer-review

4 Citations (Scopus)

Abstract

This paper presents FarSpeech, QCRI's combined Arabic speech recognition, natural language processing (NLP), and dialect identification pipeline. It uses modern web technologies to capture live audio, transcribes the Arabic audio, applies NLP to the transcripts, and identifies the dialect of the speaker. For transcription, we use QATS, a Kaldi-based ASR system that uses Time Delay Neural Networks (TDNN). For NLP, we use a state-of-the-art Arabic NLP toolkit that employs various deep neural network and SVM-based models. Finally, our dialect identification system is multimodal, drawing on both acoustic and linguistic input. FarSpeech presents different screens to display the transcripts, text segmentation, part-of-speech tags, recognized named entities, diacritized text, and the identified dialect of the speech.
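The stage ordering described in the abstract can be pictured with a minimal Python sketch. This is an illustration only, not the authors' implementation: `run_pipeline`, `PipelineResult`, and the stub functions are hypothetical names, and the stubs return placeholder values instead of calling QATS or the Arabic NLP toolkit. The one detail taken from the abstract is that dialect identification receives both the audio and the transcript.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PipelineResult:
    """Container for the outputs the abstract lists (hypothetical structure)."""
    transcript: str
    annotations: Dict[str, List[str]] = field(default_factory=dict)
    dialect: str = ""


def run_pipeline(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    nlp_stages: Dict[str, Callable[[str], List[str]]],
    identify_dialect: Callable[[bytes, str], str],
) -> PipelineResult:
    # 1. Speech recognition: audio -> transcript.
    transcript = transcribe(audio)
    # 2. NLP: each stage (segmentation, POS tagging, ...) reads the transcript.
    annotations = {name: stage(transcript) for name, stage in nlp_stages.items()}
    # 3. Dialect identification uses acoustic (audio) and linguistic (text) input.
    dialect = identify_dialect(audio, transcript)
    return PipelineResult(transcript, annotations, dialect)


# Placeholder stages standing in for the real systems (assumptions, not real APIs).
def fake_transcribe(audio: bytes) -> str:
    return "placeholder transcript"


def fake_segment(text: str) -> List[str]:
    return text.split()


def fake_dialect(audio: bytes, text: str) -> str:
    return "unknown"


result = run_pipeline(
    b"\x00\x01",
    fake_transcribe,
    {"segmentation": fake_segment},
    fake_dialect,
)
```

Passing the stages in as callables keeps the sketch independent of any particular ASR or NLP backend; a real deployment would replace each stub with a call to the corresponding service.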

Original language: English
Pages (from-to): 2372-2373
Number of pages: 2
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2019-September
Publication status: Published - 2019
Event: 20th Annual Conference of the International Speech Communication Association: Crossroads of Speech and Language, INTERSPEECH 2019 - Graz, Austria
Duration: 15 Sept 2019 to 19 Sept 2019

Keywords

  • Live speech recognition
  • Natural Language Processing
  • Speech Transcription
