Less is More: Mitigate Spurious Correlations for Open-Domain Dialogue Response Generation Models by Causal Discovery

Tao Feng; Lizhen Qu; Gholamreza Haffari

Vol. 11 (2023)

TACL approved

Published 2023-05-31

Tao Feng
Monash University

Lizhen Qu
Monash University

Gholamreza Haffari
Monash University

Abstract

In this paper, we conduct the first study on spurious correlations for open-domain response generation models, based on CGDIALOG, a corpus curated in our work. Current models indeed suffer from spurious correlations and tend to generate irrelevant and generic responses. Inspired by causal discovery algorithms, we propose a novel model-agnostic method for training and inference of response generation models using a conditional independence classifier. The classifier is trained by a constrained self-training method, coined CONSTRAIN, to overcome data scarcity. Experimental results from both human and automatic evaluation show that our method significantly outperforms competitive baselines in terms of relevance, informativeness, and fluency.

Presented at ACL 2023

Author Biography

Tao Feng

Faculty of Information Technology