The Return of Lexical Dependencies: Neural Lexicalized PCFGs

Hao Zhu; Yonatan Bisk; Graham Neubig

Vol. 8 (2020)

TACL approved

Published 2020-11-13

Hao Zhu
Carnegie Mellon University

Yonatan Bisk
Allen Institute for Artificial Intelligence; Microsoft Research AI; Carnegie Mellon University; Paul G. Allen School for Computer Science and Engineering, University of Washington

Graham Neubig
Carnegie Mellon University

Abstract

In this paper, we demonstrate that context-free grammar (CFG) based methods for grammar induction benefit from modeling lexical dependencies. This contrasts with the most popular current methods for grammar induction, which focus on discovering either constituents or dependencies. Previous approaches to marrying these two disparate syntactic formalisms (e.g., lexicalized PCFGs) have been plagued by sparsity, making them unsuitable for unsupervised grammar induction. In this work, we present novel neural models of lexicalized PCFGs that overcome these sparsity problems and effectively induce both constituents and dependencies within a single model. Experiments demonstrate that this unified framework yields stronger results on both representations than modeling either formalism alone. Code is available at https://github.com/neulab/neural-lpcfg.

Article at MIT Press. Presented at EMNLP 2020.