Learning to Paraphrase Sentences to Different Complexity Levels

Alison Hanyi Chi; Li-Kuang Chen; Yi-Chen Chang; Shu-Hui Lee; Jason S. Chang

Vol. 11 (2023)

TACL approved

Published 2023-11-26

Alison Hanyi Chi
National Tsing Hua University

Li-Kuang Chen
National Tsing Hua University

Yi-Chen Chang
National Tsing Hua University

Shu-Hui Lee
National Tsing Hua University

Jason S. Chang
National Tsing Hua University

Abstract

While sentence simplification is an active research topic in NLP, its adjacent tasks of sentence complexification and same-level paraphrasing are not. To train models on all three tasks, we present two new unsupervised datasets. We compare these datasets, one labeled by a weak classifier and the other by a rule-based approach, with a single supervised dataset. Using these three datasets for training, we perform extensive experiments on both multitasking and prompting strategies. Compared to other systems trained on unsupervised parallel data, models trained on our weak-classifier-labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark.

Presented at EMNLP 2023. Article at MIT Press.