
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation

Abstract

Natural language generation has witnessed significant advancements due to the training of large language models on vast internet-scale datasets. Despite these advances, a critical challenge remains: these models can inadvertently generate content that is toxic, inaccurate, or unhelpful, and existing automatic evaluation metrics often fall short of identifying these shortcomings.
As models become more capable, human feedback is an invaluable signal for evaluating and improving models. This survey aims to provide an overview of recent research that has leveraged human feedback to improve natural language generation.
First, we introduce a taxonomy distilled from existing research to categorize and organize the varied forms of feedback. Next, we discuss how feedback can be described by its format and objective, and cover the two approaches proposed to use feedback (either for training or decoding): directly using feedback or training feedback models. We also discuss existing datasets for human-feedback data collection, and concerns surrounding feedback collection. Finally, we provide an overview of the nascent field of AI feedback, which uses large language models to make judgments based on a set of principles and minimizes the need for human intervention. We also release a companion website for this survey at feedback-gap-survey.info.
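To make the training-versus-decoding distinction concrete, below is a minimal Python sketch, not taken from the survey, of one common decoding-time use of a feedback signal: sample several candidates and keep the one a scorer prefers (best-of-n reranking). All names here (`generate`, `score`, `llm_judge`, `llm`) are hypothetical placeholders; the scorer could be a feedback model trained on human judgments or, in the AI-feedback setting, an LLM applying written principles.

```python
# Hypothetical sketch (not from the survey): decoding-time use of a
# feedback signal via best-of-n reranking. All names below are
# illustrative placeholders, not an API from any particular library.
from typing import Callable, List

def best_of_n(
    prompt: str,
    generate: Callable[[str], str],      # samples one output from the generator
    score: Callable[[str, str], float],  # scores a (prompt, output) pair
    n: int = 8,
) -> str:
    """Return the candidate that the feedback signal scores highest."""
    candidates: List[str] = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda out: score(prompt, out))

# The scorer can be a feedback model trained on human preference data
# (e.g., a reward model fine-tuned on pairwise comparisons), or, in the
# AI-feedback setting, an LLM judging outputs against written principles:
def llm_judge(prompt: str, output: str, llm: Callable[[str], str]) -> float:
    """Ask an LLM to rate an output from 1 to 10 against fixed principles."""
    principles = "Be helpful, harmless, and factually accurate."
    reply = llm(
        f"Principles: {principles}\n"
        f"Prompt: {prompt}\nResponse: {output}\n"
        "Rate the response from 1 to 10. Answer with the number only."
    )
    return float(reply.strip())
```

The same scorer can instead supply a training signal, for example as the reward in reinforcement learning from human (or AI) feedback, which is the other usage pattern the survey covers.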

Article at MIT Press · Presented at EMNLP 2023