Representation Learning for Grounded Spatial Reasoning

Michael Janner; Karthik Narasimhan; Regina Barzilay

Vol. 6 (2018)

TACL approved

Published 2018-01-27

Michael Janner
Massachusetts Institute of Technology

Karthik Narasimhan
Massachusetts Institute of Technology

Regina Barzilay
Massachusetts Institute of Technology

Abstract

The interpretation of spatial references is highly contextual, requiring joint inference over both language and the environment. We consider the task of spatial reasoning in a simulated environment, where an agent can act and receive rewards. The proposed model learns a representation of the world steered by instruction text. This design allows for precise alignment of local neighborhoods with corresponding verbalizations, while also handling global references in the instructions. We train our model with reinforcement learning using a variant of generalized value iteration. The model outperforms state-of-the-art approaches on several metrics, yielding a 45% reduction in goal localization error.

Article at MIT Press (PDF). Presented at ACL 2018.