
Trick Me If You Can: Human-in-the-loop Generation of Adversarial Question Answering Examples

Abstract

Adversarial evaluation is a promising paradigm to stress test a model's ability to understand natural language. While past approaches expose superficial patterns learned by models, the resulting adversarial examples are limited in complexity and diversity. We propose a human-in-the-loop adversarial generation process, where humans are guided by model interpretations through an interactive interface. We apply this generation framework to a question answering task called Quizbowl, and ask trivia enthusiasts to craft questions that trick computer systems. We validate the resulting adversarial questions via live human–computer tournaments, showing that although they appear ordinary for human players, the questions systematically stump both neural and information retrieval models. The adversarial questions cover diverse phenomena, spanning multi-hop reasoning to entity type distractors, exposing remaining challenges in robust question answering.
Article at MIT Press (presented at ACL 2019)