Coreference Resolution through a seq2seq Transition-Based System

Bernd Bohnet; Chris Alberti; Michael Collins

Vol. 11 (2023)

TACL approved

Published 2023-03-14

Bernd Bohnet
Google Inc

Chris Alberti
Google Inc

Michael Collins
Google Inc

Abstract

Most recent coreference resolution systems use search algorithms over possible spans to identify mentions and resolve coreference. We instead present a coreference resolution system that uses a text-to-text (seq2seq) paradigm to predict mentions and links jointly. We implement the coreference system as a transition system and use multilingual T5 as the underlying language model. We obtain state-of-the-art accuracy on the CoNLL-2012 datasets: 83.3 F1-score for English (2.3 points higher than previous work) using only CoNLL data for training, 68.5 F1-score for Arabic (+4.1), and 74.3 F1-score for Chinese (+5.3). In addition, we use the SemEval-2010 datasets for experiments in a zero-shot setting, a few-shot setting, and a supervised setting using all available training data. We obtain substantially higher zero-shot F1-scores than previous approaches for 3 out of 4 languages, and significantly exceed previous supervised state-of-the-art results for all five tested languages.

Presented at ACL 2023. Article at MIT Press.

Author Biography

Bernd Bohnet

Senior Research Scientist, Google Research