FeTaQA: Free-form Table Question Answering

Linyong Nan; Chiachun Hsieh; Ziming Mao; Xi Lin; Neha Verma; Rui Zhang; Wojciech Kryściński; Hailey Schoelkopf; Riley Kong; Xiangru Tang; Mutethia Mutuma; Benjamin Rosand; Isabel Trindade; Renusree Bandaru; Jacob Cunningham; Caiming Xiong; Dragomir Radev

Vol. 10 (2022)

TACL approved

Published 2022-01-28

Linyong Nan
Yale University

Chiachun Hsieh
The University of Hong Kong

Ziming Mao
Yale University

Xi Lin
Facebook AI

Neha Verma
Yale University

Rui Zhang
Penn State University

Wojciech Kryściński
Salesforce Research

Hailey Schoelkopf
Yale University

Riley Kong
Archbishop Mitty High School

Xiangru Tang
Yale University

Mutethia Mutuma
Yale University

Benjamin Rosand
Yale University

Isabel Trindade
Yale University

Renusree Bandaru
Penn State University

Jacob Cunningham
Penn State University

Caiming Xiong
Salesforce Research

Dragomir Radev
Yale University

Abstract

Existing table question answering datasets contain abundant factual questions that primarily evaluate a QA system’s comprehension of the query and the tabular data. However, restricted by their short-form answers, these datasets fail to include question-answer interactions that represent more advanced and naturally occurring information needs: questions that ask for reasoning and integration of information pieces retrieved from a structured knowledge source. To complement the existing datasets and to reveal the challenging nature of the table-based question answering task, we introduce FeTaQA, a new dataset with 10K Wikipedia-based {table, question, free-form answer, supporting table cells} pairs. FeTaQA is collected from noteworthy descriptions of Wikipedia tables which contain information people tend to seek; generation of these descriptions requires advanced processing that humans perform on a daily basis: understand the question and table, retrieve, integrate, infer, and conduct text planning and surface realization to generate an answer. We provide two benchmark methods for the proposed task: a pipeline method based on semantic parsing-based QA systems and an end-to-end method based on large pretrained text generation models, and show that FeTaQA poses a challenge for both methods.
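To make the {table, question, free-form answer, supporting table cells} structure concrete, a single instance might be represented roughly as follows. This is only a sketch: the class name, field names, and the (row, column) cell indexing are assumptions made for illustration and are not the released dataset schema, and the toy table is invented rather than drawn from FeTaQA.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TableQAExample:
    """Hypothetical container for one {table, question, answer, cells} instance."""
    table: List[List[str]]                    # row 0 is the header row
    question: str
    answer: str                               # free-form sentence, not a short span
    supporting_cells: List[Tuple[int, int]]   # assumed (row, column) indices into `table`

    def supporting_values(self) -> List[str]:
        """Return the text of each supporting cell, in the listed order."""
        return [self.table[r][c] for r, c in self.supporting_cells]


# Toy instance invented for illustration; it is not taken from the dataset.
example = TableQAExample(
    table=[
        ["Year", "Team", "Goals"],
        ["2019", "Red", "12"],
        ["2020", "Blue", "15"],
    ],
    question="How did the player's goal tally change between 2019 and 2020?",
    answer="The player scored 12 goals for Red in 2019 and 15 goals for Blue in 2020.",
    supporting_cells=[(1, 1), (1, 2), (2, 1), (2, 2)],
)

print(example.supporting_values())
```

In this toy case the answer has to combine values from several cells into one sentence, which is the kind of integration that a short-form answer cannot express.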

Article at MIT Press. Presented at ACL 2022.