Measuring Machine Translation Errors in New Domains

Ann Irvine; John Morgan; Marine Carpuat; Hal Daumé III; Dragos Munteanu

Vol. 1 (2013)

TACL approved

Published 2013-10-31

Ann Irvine

John Morgan

Marine Carpuat

Hal Daumé III
University of Maryland

Dragos Munteanu

Abstract

We develop two techniques for analyzing the effect of porting a machine translation system to a new domain. One is a macro-level analysis that measures how domain shift affects corpus-level evaluation; the second is a micro-level analysis for word-level errors. We apply these methods to understand what happens when a Parliament-trained phrase-based machine translation system is applied in four very different domains: news, medical texts, scientific articles and movie subtitles. We present quantitative and qualitative experiments that highlight opportunities for future research in domain adaptation for machine translation.

PDF (Presented at EMNLP 2013)