Unsupervised Quality Estimation for Neural Machine Translation

Marina Fomicheva; Shuo Sun; Lisa Yankovskaya; Frédéric Blain; Francisco Guzmán; Mark Fishel; Nikolaos Aletras; Vishrav Chaudhary; Lucia Specia

Vol. 8 (2020)

TACL approved

Published 2020-09-10

Marina Fomicheva
University of Sheffield

Shuo Sun
Facebook Applied Machine Learning

Lisa Yankovskaya
University of Tartu

Frédéric Blain
University of Sheffield

Francisco Guzmán
Facebook Applied Machine Learning

Mark Fishel
University of Tartu

Nikolaos Aletras
University of Sheffield

Vishrav Chaudhary
Facebook Applied Machine Learning

Lucia Specia
University of Sheffield; Imperial College London

Abstract

Quality Estimation (QE) is an important component in making Machine Translation (MT) useful in real-world applications, as it aims to inform the user about the quality of the MT output at test time. Existing approaches require large amounts of expert-annotated data, computation, and time for training. As an alternative, we devise an unsupervised approach to QE where no training or access to additional resources besides the MT system itself is required. Unlike most current work, which treats the MT system as a black box, we explore useful information that can be extracted from the MT system as a by-product of translation. By employing methods for uncertainty quantification, we achieve very good correlation with human judgments of quality, rivalling state-of-the-art supervised QE models. To evaluate our approach, we collect the first dataset that enables work on both black-box and glass-box approaches to QE.
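
The abstract does not say which by-products of translation are used, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes the MT system exposes the probability it assigned to each output token, uses the length-normalized log-probability as a confidence score, and measures agreement with human judgments using Pearson correlation. The function names (`mean_log_prob`, `pearson`) are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def mean_log_prob(token_probs: Sequence[float]) -> float:
    """Length-normalized log-probability of one translation.

    `token_probs` are the probabilities the decoder assigned to the
    tokens it actually produced (an assumed by-product of decoding).
    Higher (closer to 0) means the model was more confident.
    """
    if not token_probs:
        raise ValueError("token_probs must be non-empty")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("each probability must lie in (0, 1]")
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, one would compute `mean_log_prob` for each translated segment and pass the resulting scores, together with the human quality ratings for the same segments, to `pearson`. No labeled data is needed to produce the scores themselves; the human ratings are used only for evaluation.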

Article at MIT Press
Presented at EMNLP 2020