Model with minimal translation units, but decode with phrases

Nadir Durrani, Alexander Fraser, Helmut Schmid

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

17 Citations (Scopus)

Abstract

N-gram-based models co-exist with their phrase-based counterparts as an alternative SMT framework. Both techniques have pros and cons. While the N-gram-based framework provides a better model that captures both source and target contexts and avoids spurious phrasal segmentation, the ability to memorize and produce larger translation units gives an edge to the phrase-based systems during decoding, in terms of better search performance and superior selection of translation units. In this paper we combine N-gram-based modeling with phrase-based decoding, and obtain the benefits of both approaches. Our experiments show that using this combination not only improves the search accuracy of the N-gram model but that it also improves the BLEU scores. Our system outperforms state-of-the-art phrase-based systems (Moses and Phrasal) and N-gram-based systems by a significant margin on German, French and Spanish to English translation tasks.

Original language: English
Title of host publication: Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics
Subtitle of host publication: Human Language Technologies
Publisher: Association for Computational Linguistics (ACL)
Pages: 1-11
Number of pages: 11
ISBN (Electronic): 9781937284473
Publication status: Published - 2013
Externally published: Yes
Event: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013 - Atlanta, United States
Duration: 9 Jun 2013 to 14 Jun 2013

Publication series

Name: NAACL HLT 2013 - 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Main Conference

Conference

Conference: 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL HLT 2013
Country/Territory: United States
City: Atlanta
Period: 9/06/13 to 14/06/13
