Unsupervised Lexicon Discovery from Acoustic Input

Chia-ying Lee; Timothy J. O'Donnell; James Glass

Vol. 3 (2015)

TACL approved

Published 2015-07-17

Abstract

We present a model of unsupervised phonological lexicon discovery, the problem of simultaneously learning phoneme-like and word-like units from acoustic input. Our model builds on earlier models of unsupervised phone-like unit discovery from acoustic data (Lee and Glass, 2012) and unsupervised symbolic lexicon discovery using the Adaptor Grammar framework (Johnson et al., 2006), integrating these earlier approaches using a probabilistic model of phonological variation. We show that the model is competitive with state-of-the-art spoken term discovery systems, and present analyses exploring the model's behavior and the kinds of linguistic structures it learns.

PDF (presented at ACL 2016)