
Dynamically Shaping the Reordering Search Space of Phrase-Based Statistical Machine Translation

Abstract

Defining the reordering search space is a crucial issue in phrase-based SMT between distant languages. In fact, the optimal trade-off between accuracy and complexity of decoding is nowadays reached by harshly limiting the input permutation space. We propose a method to dynamically shape this space and thus capture long-range word movements without hurting translation quality or decoding time. The space defined by loose reordering constraints is dynamically pruned through a binary classifier that predicts whether a given input word should be translated right after another. The integration of this model into a phrase-based decoder improves a strong Arabic-English baseline already including state-of-the-art early distortion cost (Moore and Quirk, 2007) and hierarchical phrase orientation models (Galley and Manning, 2008). Significant improvements in the reordering of verbs are achieved by a system that is notably faster than the baseline, while BLEU and METEOR remain stable, or even increase, at a very high distortion limit.
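
The pruning idea described in the abstract can be illustrated with a minimal word-level sketch. It is not the authors' implementation: the function `allowed_next_positions`, the `threshold` parameter, and the `toy_score` scorer are hypothetical names introduced here, and a real phrase-based decoder would operate on phrases and hypotheses rather than single positions. The sketch assumes a classifier that returns a score for "translate position j right after position i" and keeps only the candidates that pass a threshold within a loose distortion limit.

```python
from typing import Callable, List, Set


def allowed_next_positions(
    last: int,
    covered: Set[int],
    n: int,
    distortion_limit: int,
    score: Callable[[int, int], float],
    threshold: float,
) -> List[int]:
    """Return the input positions that may be translated right after `last`.

    Hypothetical helper: `score(i, j)` stands in for the binary classifier,
    and `threshold` is an assumed pruning cut-off.
    """
    # Loose constraint: every uncovered position within the distortion limit,
    # measured from the next monotone position (last + 1).
    candidates = [
        j
        for j in range(n)
        if j not in covered and abs(j - (last + 1)) <= distortion_limit
    ]
    # Dynamic pruning: drop candidates that the classifier scores too low.
    return [j for j in candidates if score(last, j) >= threshold]


def toy_score(i: int, j: int) -> float:
    """Toy stand-in for a trained classifier (assumption, for illustration)."""
    if j == i + 1:
        return 1.0  # monotone continuation is always plausible
    if (i, j) == (0, 4):
        return 0.8  # one plausible long jump, e.g. towards a distant verb
    return 0.1
```

With a high distortion limit, `allowed_next_positions(0, {0}, 6, 5, toy_score, 0.5)` keeps the monotone continuation and the single long jump that the toy scorer favours, whereas a distortion limit of 2 excludes that jump before the classifier is ever consulted.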

PDF (Presented at EMNLP 2014)

Author Biography

Arianna Bisazza

PhD student at Fondazione Bruno Kessler (HLT unit) and University of Trento

Marcello Federico

Head of the Human Language Technologies Unit at Fondazione Bruno Kessler