Template-based Abstractive Microblog Opinion Summarisation

Abstract

We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset. Summaries are abstractive in nature and have been created by journalists skilled in summarising news articles, following a template that separates factual information (main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favours extractive summarisation models. To showcase the dataset's utility and challenges, we benchmark a range of abstractive and extractive state-of-the-art summarisation models and achieve good performance, with the former outperforming the latter. We also show that fine-tuning is necessary to improve performance and investigate the benefits of using different sample sizes.
Article at MIT Press. Presented at EMNLP 2022.

Author Biography

Iman Munire Bilal

Computer Science Department, PhD Candidate