QCMUQ@QALB-2015 Shared Task: Combining Character level MT and Error-tolerant Finite-State Recognition for Arabic Spelling Correction

Houda Bouamor, Hassan Sajjad, Nadir Durrani, Kemal Oflazer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Citations (Scopus)

Abstract

We describe the joint efforts of CMU-Q and QCRI in building a spelling correction system for Arabic in the QALB 2015 Shared Task. Our system is based on a hybrid pipeline that combines rule-based linguistic techniques with statistical methods using language modeling and machine translation, as well as an error-tolerant finite-state automata method. We trained and tested our spelling corrector using the dataset provided by the shared task organizers. Our system outperforms the baseline system and yields better correction quality, with an F-score of 68.12 on the L1-test-2015 test set and 38.90 on the L2-test-2015 test set. This ranks us 2nd in the L2 subtask and 5th in the L1 subtask.

Original language: English
Title of host publication: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015 - held at 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015 - Proceedings
Editors: Nizar Habash, Stephan Vogel, Kareem Darwish
Publisher: Association for Computational Linguistics (ACL)
Pages: 144-149
Number of pages: 6
ISBN (Electronic): 9781941643587
Publication status: Published - 2015
Event: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015 - Beijing, China
Duration: 30 Jul 2015 → …

Publication series

Name: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015 - held at 53rd Annual Meeting of the Association for Computational Linguistics, ACL 2015 - Proceedings

Conference

Conference: 2nd Workshop on Arabic Natural Language Processing, ANLP 2015
Country/Territory: China
City: Beijing
Period: 30/07/15 → …
