Compositional Zero-Shot Domain Transfer with Text-to-Text Models

Fangyu Liu; Qianchu Liu; Shruthi Bannur; Fernando Pérez-García; Naoto Usuyama; Sheng Zhang; Tristan Naumann; Aditya Nori; Hoifung Poon; Javier Alvarez-Valle; Ozan Oktay; Stephanie L Hyland

Vol. 11 (2023)

TACL approved

Published 2023-09-07

Fangyu Liu
Cambridge University

Qianchu Liu
Microsoft Health Futures

Shruthi Bannur
Microsoft Health Futures

Fernando Pérez-García
Microsoft Health Futures

Naoto Usuyama
Microsoft Health Futures

Sheng Zhang
Microsoft Health Futures

Tristan Naumann
Microsoft Health Futures

Aditya Nori
Microsoft Health Futures

Hoifung Poon
Microsoft Health Futures

Javier Alvarez-Valle
Microsoft Health Futures

Ozan Oktay
Microsoft Health Futures

Stephanie L Hyland
Microsoft Health Futures

Abstract

Label scarcity is a bottleneck for improving task performance in specialised domains. We propose a novel compositional transfer learning framework (DoT5) for zero-shot domain transfer. Without access to in-domain labels, DoT5 jointly learns domain knowledge (from MLM of unlabelled in-domain free text) and task knowledge (from task training on more readily available general-domain data) in a multi-task manner. To improve the transferability of task training, we design a strategy named NLGU: we simultaneously train NLG for in-domain label-to-data generation, which enables data augmentation for self-finetuning, and NLU for label prediction. We evaluate DoT5 on the biomedical domain and the resource-lean subdomain of radiology, focusing on NLI, text summarisation and embedding learning. DoT5 demonstrates the effectiveness of compositional transfer learning through multi-task learning. In particular, DoT5 outperforms the current SOTA in zero-shot transfer by over 7 absolute points in accuracy on RadNLI. We validate DoT5 with ablations and a case study demonstrating its ability to solve challenging NLI examples requiring in-domain expertise.
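
As a rough illustration of the multi-task setup the abstract describes, the sketch below builds one text-to-text training mixture from three sources: masked-token examples from unlabelled in-domain text, NLU examples (text pair to label) and NLG examples (premise and label to hypothesis) from general-domain NLI data. The prompt strings, function names, the per-token masking with T5-style sentinel tokens and the example sentences are all assumptions made for illustration; they are not taken from the paper, and the self-finetuning step on generated in-domain data is not shown.

```python
import random


def make_nlu_example(premise, hypothesis, label):
    # NLU direction: predict the label from the text pair.
    # The prompt wording is hypothetical, not the paper's.
    return {
        "task": "nlu",
        "input": f"nli premise: {premise} hypothesis: {hypothesis}",
        "target": label,
    }


def make_nlg_example(premise, hypothesis, label):
    # NLG direction (label-to-data): generate a hypothesis
    # conditioned on the premise and the desired label.
    return {
        "task": "nlg",
        "input": f"generate {label} hypothesis for premise: {premise}",
        "target": hypothesis,
    }


def make_mlm_example(text, rng, mask_prob=0.15):
    # Simplified masking: each selected whitespace token gets its own
    # sentinel. Real span corruption merges adjacent masked tokens;
    # this per-token variant is an assumption made for brevity.
    tokens = text.split()
    masked = [i for i in range(len(tokens)) if rng.random() < mask_prob]
    if not masked:
        # Guarantee at least one masked token per example.
        masked = [rng.randrange(len(tokens))]
    sentinel_of = {i: f"<extra_id_{k}>" for k, i in enumerate(masked)}
    inp = [sentinel_of.get(i, tok) for i, tok in enumerate(tokens)]
    tgt = [f"{sentinel_of[i]} {tokens[i]}" for i in masked]
    return {"task": "mlm", "input": " ".join(inp), "target": " ".join(tgt)}


def build_mixture(general_nli, in_domain_text, seed=0):
    # Combine all three objectives into one shuffled training list,
    # so a single text-to-text model sees them in a multi-task manner.
    rng = random.Random(seed)
    examples = []
    for premise, hypothesis, label in general_nli:
        examples.append(make_nlu_example(premise, hypothesis, label))
        examples.append(make_nlg_example(premise, hypothesis, label))
    for text in in_domain_text:
        examples.append(make_mlm_example(text, rng))
    rng.shuffle(examples)
    return examples
```

Calling `build_mixture` with a list of `(premise, hypothesis, label)` triples and a list of in-domain sentences returns dictionaries with `task`, `input` and `target` fields, which could then be fed to any sequence-to-sequence trainer. The mixing ratio between objectives is left at one example per source item here; in practice it would be a tunable choice.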

Article at MIT Press. Presented at EMNLP 2023.