Learning to Remember Translation History with a Continuous Cache

Zhaopeng Tu; Yang Liu; Shuming Shi; Tong Zhang

Vol. 6 (2018)

TACL approved

Learning to Remember Translation History with a Continuous Cache

Published 2018-07-02

Zhaopeng Tu
Yang Liu
Shuming Shi
Tong Zhang

Zhaopeng Tu
Tencent AI Lab

Yang Liu
Department of Computer Science and Technology, Tsinghua University

Shuming Shi
Tencent AI Lab

Tong Zhang
Tencent AI Lab

Abstract

Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information. In this work, we propose to augment NMT models with a very light-weight cache-like memory network, which stores recent hidden representations as translation history. The probability distribution over generated words is updated online depending on the translation history retrieved from the memory, endowing NMT models with the capability to dynamically adapt over time. Experiments on multiple domains with different topics and styles show the effectiveness of the proposed approach with negligible impact on the computational cost.

Article at MIT Press PDF (presented at ACL 2018)