Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models

Jianhui Pang; Fanghua Ye; Derek Fai Wong; Dian Yu; Shuming Shi; Zhaopeng Tu; Longyue Wang

Vol. 13 (2025)

TACL approved

Published 2025-12-25

Jianhui Pang
University of Macau

Fanghua Ye
University College London

Derek Fai Wong
University of Macau

Dian Yu

Shuming Shi

Zhaopeng Tu
Tencent AI Lab

Longyue Wang
Tencent AI Lab

Abstract

The evolution of Neural Machine Translation (NMT) has been significantly influenced by six core challenges, which have acted as benchmarks for progress in this field: domain mismatch, amount of parallel data, rare word prediction, translation of long sentences, attention model as word alignment, and sub-optimal beam search. This study revisits these challenges, offering insights into their ongoing relevance in the context of advanced Large Language Models (LLMs). Our empirical findings show that LLMs effectively reduce reliance on parallel data for major languages during pretraining and significantly improve the translation of long sentences containing approximately 80 words; they can even translate documents of up to 512 words. Despite these improvements, the challenges of domain mismatch and rare word prediction persist. While the word alignment and beam search challenges, which are specific to NMT, may not apply to LLMs, we identify three new challenges for LLM translation tasks: inference efficiency, translation of low-resource languages during pretraining, and human-aligned evaluation.

Presented at ACL 2025. Article at MIT Press.