J-NERD: Joint Named Entity Recognition and Disambiguation with Rich Linguistic Features

Dat Ba Nguyen; Martin Theobald; Gerhard Weikum

Vol. 4 (2016)

TACL approved

Published 2016-06-02

Dat Ba Nguyen
Max-Planck Institute for Informatics

Martin Theobald
University of Ulm

Gerhard Weikum
Max-Planck Institute for Informatics

Abstract

Methods for Named Entity Recognition and Disambiguation (NERD) perform NER and NED in two separate stages. Therefore, NED may be penalized in precision by NER false positives and may suffer in recall from NER false negatives. Conversely, NED does not fully exploit information computed by NER, such as the types of mentions. This paper presents J-NERD, a new approach that performs NER and NED jointly by means of a probabilistic graphical model that captures mention spans, mention types, and the mapping of mentions to entities in a knowledge base. We present experiments with different kinds of texts from the CoNLL’03, ACE’05, and ClueWeb’09-FACC1 corpora. J-NERD consistently outperforms state-of-the-art competitors in end-to-end NERD precision, recall, and F1.

PDF (presented at ACL 2016)