Learning Distributed Representations of Texts and Entities from Knowledge Base

Ikuya Yamada; Hiroyuki Shindo; Hideaki Takeda; Yoshiyasu Takefuji

Vol. 5 (2017)

TACL approved

Published 2017-11-06

Ikuya Yamada
Studio Ousia

Hiroyuki Shindo
Nara Institute of Science and Technology

Hideaki Takeda
National Institute of Informatics

Yoshiyasu Takefuji
Keio University

Abstract

We describe a neural network model that jointly learns distributed representations of texts and knowledge base (KB) entities. Given a text in the KB, we train the model to predict the entities that are relevant to the text. The model is designed to be generic, so that it can be applied to a variety of NLP tasks with ease. We train it on a large corpus of texts and their entity annotations extracted from Wikipedia. We evaluate the model on three important NLP tasks (sentence textual similarity, entity linking, and factoid question answering) involving both unsupervised and supervised settings, and achieve state-of-the-art results on all three. Our code and trained models are publicly available for further academic research.
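The training objective summarized in the abstract (predict the entities relevant to a text) can be illustrated with a toy sketch. The abstract does not specify the architecture, so everything below is an assumption made for illustration: the text vector is taken as the average of its word vectors, entities are scored by a dot product followed by a softmax, and the function names and the tiny hand-written vectors are hypothetical rather than taken from the authors' released code.

```python
import math

def text_vector(tokens, word_vecs):
    """Average the vectors of known tokens (an assumed composition function)."""
    known = [word_vecs[t] for t in tokens if t in word_vecs]
    if not known:
        raise ValueError("no known tokens in text")
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]

def entity_probabilities(tokens, word_vecs, entity_vecs):
    """Softmax over dot products between the text vector and each entity vector."""
    t = text_vector(tokens, word_vecs)
    scores = {e: sum(a * b for a, b in zip(t, v)) for e, v in entity_vecs.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {e: math.exp(s - top) for e, s in scores.items()}
    total = sum(exps.values())
    return {e: x / total for e, x in exps.items()}

def entity_prediction_loss(tokens, gold_entities, word_vecs, entity_vecs):
    """Negative log-likelihood of the entities annotated in the text."""
    probs = entity_probabilities(tokens, word_vecs, entity_vecs)
    return -sum(math.log(probs[e]) for e in gold_entities)

# Hypothetical toy embeddings; a real model would learn these from Wikipedia.
WORD_VECS = {"tokyo": [1.0, 0.0], "capital": [0.8, 0.2], "guitar": [0.0, 1.0]}
ENTITY_VECS = {"Japan": [2.0, 0.0], "Music": [0.0, 2.0]}

probs = entity_probabilities(["tokyo", "capital"], WORD_VECS, ENTITY_VECS)
loss = entity_prediction_loss(["tokyo", "capital"], ["Japan"], WORD_VECS, ENTITY_VECS)
```

In this sketch, training would lower `entity_prediction_loss` by adjusting both the word and the entity vectors, which is one way texts and entities can end up in a shared vector space.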

PDF (presented at ACL 2018)