Template-based Abstractive Microblog Opinion Summarisation

Iman Munire Bilal; Bo Wang; Adam Tsakalidis; Dong Nguyen; Rob Procter; Maria Liakata

Vol. 10 (2022)

TACL approved

Published 2022-11-22

Iman Munire Bilal
Department of Computer Science, University of Warwick, Coventry, UK; The Alan Turing Institute, London, UK

Bo Wang
Center for Precision Psychiatry, Massachusetts General Hospital, Boston, USA; The Alan Turing Institute, London, UK

Adam Tsakalidis
School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK; The Alan Turing Institute, London, UK

Dong Nguyen
Department of Information and Computing Sciences, Utrecht University, Utrecht, the Netherlands

Rob Procter
Department of Computer Science, University of Warwick, Coventry, UK; The Alan Turing Institute, London, UK

Maria Liakata
School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK; Department of Computer Science, University of Warwick, Coventry, UK; The Alan Turing Institute, London, UK

Abstract

We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset. Summaries are abstractive in nature and have been created by journalists skilled in summarising news articles following a template separating factual information (main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favours extractive summarisation models. To showcase the dataset's utility and challenges, we benchmark a range of abstractive and extractive state-of-the-art summarisation models and achieve good performance, with the former outperforming the latter. We also show fine-tuning is necessary to improve performance and investigate the benefits of using different sample sizes.

Article at MIT Press. Presented at EMNLP 2022.

Author Biography

Iman Munire Bilal

Computer Science Department, PhD Candidate