QCRI-MES Submission at WMT13: Using Transliteration Mining to Improve Statistical Machine Translation

Hassan Sajjad, Svetlana Smekalova, Nadir Durrani, Alexander Fraser, Helmut Schmid

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

4 Citations (Scopus)

Abstract

This paper describes QCRI-MES's submission on the English-Russian dataset to the Eighth Workshop on Statistical Machine Translation. We generate improved word alignments of the training data by incorporating an unsupervised transliteration mining module into GIZA++, and we build a phrase-based machine translation system on top of them. For tuning, we use a variation of PRO that provides better weights by optimizing BLEU+1 at the corpus level. We transliterate out-of-vocabulary words in a post-processing step, using a transliteration system built on the transliteration pairs extracted by an unsupervised transliteration mining system. For the Russian-to-English translation direction, we apply linguistically motivated pre-processing to the Russian side of the data.
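As an illustration of the tuning metric named in the abstract, the following is a minimal sketch of plain sentence-level BLEU+1, assuming the common formulation in which one is added to the numerator and denominator of each n-gram precision for n > 1. It is not the corpus-level variant the paper describes, and the function names `ngrams` and `bleu_plus_one` are hypothetical, chosen for this example only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu_plus_one(hypothesis, reference, max_n=4):
    """Sentence-level BLEU with add-one smoothing for n > 1 (an assumed formulation)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        if n > 1:
            # Add-one smoothing keeps higher-order precisions non-zero.
            matches += 1
            total += 1
        if matches == 0:
            # No unigram overlap at all: the score is zero.
            return 0.0
        log_precision_sum += math.log(matches / total)
    # Brevity penalty for hypotheses no longer than the reference.
    if len(hyp) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(hyp))
    return brevity_penalty * math.exp(log_precision_sum / max_n)
```

Without the smoothing, a short hypothesis with no matching 4-gram would score zero, which makes the sentence-level comparisons that PRO relies on uninformative.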

Original language: English
Title of host publication: WMT 2013 - 8th Workshop on Statistical Machine Translation, Proceedings
Editors: Ondrej Bojar, Christian Buck, Chris Callison-Burch, Barry Haddow, Philipp Koehn, Christof Monz, Matt Post, Herve Saint-Amand, Radu Soricut, Lucia Specia
Publisher: Association for Computational Linguistics (ACL)
Pages: 219-224
Number of pages: 6
ISBN (Electronic): 9781937284572
Publication status: Published - 2013
Externally published: Yes
Event: 8th Workshop on Statistical Machine Translation, WMT 2013 - Sofia, Bulgaria
Duration: 8 Aug 2013 to 9 Aug 2013

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 8th Workshop on Statistical Machine Translation, WMT 2013
Country/Territory: Bulgaria
City: Sofia
Period: 8/08/13 to 9/08/13
