
Dynamic Language Models for Streaming Text

Abstract

We present a probabilistic language model that captures temporal dynamics and conditions on arbitrary non-linguistic context features. These context features serve as important indicators of language changes that are otherwise difficult to capture from text data alone. We learn our model in an efficient online fashion that scales to large, streaming data. Using five streaming datasets from two different genres (economics news articles and social media), we evaluate our model on the task of sequential language modeling. Our model consistently outperforms competing models.

PDF (Presented at EMNLP 2014)