Minimally Supervised Number Normalization

Kyle Gorman; Richard Sproat

Vol. 4 (2016)

TACL approved

Published 2016-11-21

Kyle Gorman
Google

Richard Sproat
Google

Abstract

We propose two models for verbalizing numbers, a key component in speech recognition and synthesis systems. The first model uses an end-to-end recurrent neural network. The second model, drawing inspiration from the linguistics literature, uses finite-state transducers constructed with a minimal amount of training data. While both models achieve near-perfect performance, the latter model can be trained using several orders of magnitude less data than the former, making it particularly useful for low-resource languages.
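To make the task concrete: number verbalization maps a digit string such as 512 to the words a speaker would say. The sketch below is neither of the two models described in the abstract. It is a hand-written, English-only rule set covering 0 to 999, and the function name `verbalize` and the word tables are illustrative assumptions. It shows the kind of input/output mapping that both models are meant to learn from data.

```python
# Hand-written English rules for 0..999, for illustration only.
# This is NOT the RNN or the finite-state model from the paper.
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
    "fifteen", "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = [
    "", "", "twenty", "thirty", "forty", "fifty",
    "sixty", "seventy", "eighty", "ninety",
]


def verbalize(n: int) -> str:
    """Return an English verbalization of an integer in 0..999."""
    if not 0 <= n < 1000:
        raise ValueError("this sketch only covers 0..999")
    if n < 20:
        # Irregular forms are looked up directly.
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    head = f"{ONES[hundreds]} hundred"
    return head if rest == 0 else f"{head} {verbalize(rest)}"


if __name__ == "__main__":
    for number in (7, 40, 97, 512):
        print(number, "->", verbalize(number))
```

Rules like these have to be rewritten by hand for every language, which is the cost that a model trained from a small amount of data aims to avoid.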

PDF (presented at EMNLP 2016)

Author Biography

Kyle Gorman

Software engineer in the Speech & Language Algorithms group at Google, Inc.