TREETALK: Composition and Compression of Trees for Image Descriptions

Polina Kuznetsova; Vicente Ordonez; Tamara Berg; Yejin Choi

Vol. 2 (2014)

TACL approved

Published 2014-10-07

Polina Kuznetsova
Stony Brook University

Vicente Ordonez
University of North Carolina at Chapel Hill

Tamara Berg
University of North Carolina at Chapel Hill

Yejin Choi
Stony Brook University

Abstract

We present a new tree-based approach to composing expressive image descriptions that makes use of naturally occurring web images with captions. We investigate two related tasks: image caption generalization and generation, where the former is an optional sub-task of the latter. The high-level idea of our approach is to harvest expressive phrases (as tree fragments) from existing image descriptions, then to compose a new description by selectively combining the extracted (and optionally pruned) tree fragments. Key algorithmic components are tree composition and compression, both integrating tree structure with sequence structure. Our proposed system attains significantly better performance than previous approaches for both image caption generalization and generation. In addition, our work is the first to show the empirical benefit of automatically generalized captions for composing natural image descriptions.

PDF (Presented at EMNLP 2014)