Investigating Reasons for Disagreement in Natural Language Inference

Nanjiang Jiang; Marie-Catherine de Marneffe

Vol. 10 (2022)

TACL approved

Published 2023-12-23

Nanjiang Jiang
The Ohio State University

Marie-Catherine de Marneffe
Department of Linguistics, The Ohio State University / FNRS, UCLouvain

Abstract

We investigate how disagreement in natural language inference (NLI) annotation arises. We develop a taxonomy of disagreement sources with 10 categories spanning 3 high-level classes. We find that some disagreements are due to uncertainty in the sentence meaning and others to annotator biases and task artifacts, which leads to different interpretations of the label distribution. We explore two modeling approaches for detecting items with potential disagreement: a 4-way classification with a “Complicated” label in addition to the three standard NLI labels, and a multilabel classification approach. We find that multilabel classification is more expressive and gives better recall of the possible interpretations in the data.
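
To make the contrast between the two modeling approaches concrete, the following minimal sketch shows how the training targets for one item might differ under each formulation. It is an illustration only, not the authors' procedure: the function names, the `min_votes` threshold, and the rule for collapsing annotations into a “Complicated” label are all assumptions introduced here.

```python
from collections import Counter

# The three standard NLI labels, in a fixed order for the multilabel vector.
NLI_LABELS = ("entailment", "neutral", "contradiction")


def multilabel_target(annotations, min_votes=2):
    """Return a 0/1 vector over NLI_LABELS.

    A label is marked 1 when at least `min_votes` annotators chose it.
    The threshold is a hypothetical choice made for this sketch.
    """
    counts = Counter(annotations)
    return [1 if counts[label] >= min_votes else 0 for label in NLI_LABELS]


def four_way_target(annotations, min_votes=2):
    """Collapse the annotations into one of four classes.

    If exactly one label passes the threshold, that label is returned.
    Otherwise (several labels pass, or none does) the item is "complicated".
    This collapsing rule is an assumption, not taken from the paper.
    """
    vector = multilabel_target(annotations, min_votes)
    if sum(vector) == 1:
        return NLI_LABELS[vector.index(1)]
    return "complicated"


# One item where annotators split between two readings.
split_item = ["entailment", "entailment", "neutral", "neutral", "entailment"]
# One item where annotators agree.
agreed_item = ["contradiction"] * 5

split_multilabel = multilabel_target(split_item)
split_four_way = four_way_target(split_item)
agreed_multilabel = multilabel_target(agreed_item)
agreed_four_way = four_way_target(agreed_item)
```

Under these assumptions, the multilabel target for the split item records which labels are plausible (entailment and neutral), whereas the 4-way target only records that the item is “complicated”. This is one way to read the claim that the multilabel formulation is more expressive.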

Presented at EMNLP 2022. Article at MIT Press.