Arabic retrieval revisited: Morphological hole filling

Kareem Darwish, Ahmed M. Ali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Citations (Scopus)

Abstract

Due to Arabic's morphological complexity, Arabic retrieval benefits greatly from morphological analysis, particularly stemming. However, the best known stemming does not handle linguistic phenomena such as broken plurals and malformed stems. In this paper, we propose a model of character-level morphological transformation that is trained using Wikipedia hypertext-to-page-title links. The use of our model yields statistically significant improvements in Arabic retrieval over the use of the best statistical stemming technique. The technique can potentially be applied to other languages.

Original language: English
Title of host publication: 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference
Pages: 218-222
Number of pages: 5
Publication status: Published - 2012
Event: 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Jeju Island, Korea, Republic of
Duration: 8 Jul 2012 - 14 Jul 2012

Publication series

Name: 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference
Volume: 2

Conference

Conference: 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 8/07/12 - 14/07/12
