TY - GEN
T1 - Investigating the usefulness of generalized word representations in SMT
AU - Durrani, Nadir
AU - Koehn, Philipp
AU - Schmid, Helmut
AU - Fraser, Alexander
PY - 2014
Y1 - 2014
AB - We investigate the use of generalized representations (POS, morphological analysis and word clusters) in phrase-based models and the N-gram-based Operation Sequence Model (OSM). Our integration enables these models to learn richer lexical and reordering patterns, consider wider contextual information and generalize better in sparse data conditions. When interpolating generalized OSM models on the standard IWSLT and WMT tasks we observed improvements of up to +1.35 on the English-to-German task and +0.63 for the German-to-English task. Using automatically generated word classes in standard phrase-based models and the OSM models yields an average improvement of +0.80 across 8 language pairs on the IWSLT shared task.
UR - http://www.scopus.com/inward/record.url?scp=84931089394&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84931089394
T3 - COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014: Technical Papers
SP - 421
EP - 432
BT - COLING 2014 - 25th International Conference on Computational Linguistics, Proceedings of COLING 2014
PB - Association for Computational Linguistics, ACL Anthology
T2 - 25th International Conference on Computational Linguistics, COLING 2014
Y2 - 23 August 2014 through 29 August 2014
ER -