Edinburgh's phrase-based machine translation systems for WMT-14

Nadir Durrani, Barry Haddow, Philipp Koehn, Kenneth Heafield

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

27 Citations (Scopus)

Abstract

This paper describes the University of Edinburgh's (UEDIN) phrase-based submissions to the translation and medical translation shared tasks of the 2014 Workshop on Statistical Machine Translation (WMT). We participated in all language pairs. We have improved upon our 2013 system by i) using generalized representations, specifically automatic word clusters for translations out of English, ii) using unsupervised character-based models to translate unknown words in the Russian-English and Hindi-English pairs, iii) synthesizing Hindi data from closely related Urdu data, and iv) building huge language models on the Common Crawl corpus.

Original language: English
Title of host publication: ACL 2014 - 9th Workshop on Statistical Machine Translation, WMT 2014, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 97-104
Number of pages: 8
ISBN (Electronic): 9781941643174
Publication status: Published - 2014
Externally published: Yes
Event: 9th Workshop on Statistical Machine Translation, WMT 2014 at the 52nd Conference of the Association for Computational Linguistics, ACL 2014 - Baltimore, United States
Duration: 26 Jun 2014 to 27 Jun 2014

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 9th Workshop on Statistical Machine Translation, WMT 2014 at the 52nd Conference of the Association for Computational Linguistics, ACL 2014
Country/Territory: United States
City: Baltimore
Period: 26/06/14 to 27/06/14
