Fine-grained Entity Type Classification is the task of assigning a fine-grained type (e.g., athlete or artist instead of just person) to an entity mentioned in text. Fine-grained types have been shown to be superior to the standard generic types for most NLP tasks, including information extraction, which is our main concern. Other tasks where fine-grained types have improved existing methods include question answering, named entity disambiguation (a.k.a. entity linking), and knowledge base completion, to mention just a few.

Given their large number, fine-grained type systems are organized into hierarchies (e.g., athlete and artist are sibling types descending from person, and musician is a subtype of artist). However, the hierarchical nature of types creates a problem for obtaining large training datasets through distant supervision (Mintz et al. 2009): namely, relying on a corpus annotated with identifiers of entities from a knowledge graph. The image below illustrates the problem:

Problems of distant supervision for NFETC

Note that the entity for which we seek training data has three types: person, athlete and coach, but the sentences annotated with that entity are not applicable to all types. Sentence S2, for example, is useful for the type athlete only and should not be used for training an entity classifier for type coach. Moreover, S2 should "contribute" less towards the type person than the type athlete.
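The labeling problem can be made concrete with a toy example. In the sketch below, the entity ID, sentences, and type set are all made up for illustration; the point is that distant supervision attaches every knowledge-graph type of the entity to every sentence mentioning it:

```python
# Toy illustration of distant supervision (entity ID, sentences, and
# types below are hypothetical, not from the paper's data).
kb_types = {"E1": {"person", "athlete", "coach"}}  # all KB types of entity E1

sentences = {
    "S1": "E1 coached the national team in 2004.",
    "S2": "E1 scored twice in the 1998 final.",
}

# Every sentence mentioning E1 receives ALL of E1's types as labels,
# even though S2 only supports the type "athlete".
training_data = [(sid, kb_types["E1"]) for sid in sentences]
```

Here `S2` ends up labeled with `coach` despite saying nothing about coaching, which is exactly the noise a hierarchy-aware training objective needs to tolerate.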


To address this limitation of distant supervision, we introduce an "end-to-end" neural network based on a standard bi-LSTM for context encoding, equipped with a hierarchy-aware loss function that penalizes training sentences that seem out of context (with respect to other training examples for the same type). In other words, the network learns which phrases are more appropriate for which levels of the hierarchy as it trains.
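To give a feel for the idea (this is a rough, simplified sketch, not the paper's exact formulation), one way to make a loss hierarchy-aware is to spread the training signal over the target type and its ancestors, with weights that decay by a factor β per level:

```python
import math

# Hypothetical toy hierarchy: child -> parent (type names are illustrative).
PARENT = {"athlete": "person", "coach": "person", "artist": "person", "person": None}

def ancestors(t):
    """Return t followed by all its ancestors, most specific first."""
    chain = []
    while t is not None:
        chain.append(t)
        t = PARENT[t]
    return chain

def hierarchy_weights(target, beta=0.4):
    """Weight 1 for the target type and beta**depth for each ancestor,
    so the training signal decays as we move up the hierarchy."""
    return {t: beta ** i for i, t in enumerate(ancestors(target))}

def hierarchy_aware_loss(probs, target, beta=0.4):
    """Weighted negative log-likelihood over the target and its ancestors."""
    w = hierarchy_weights(target, beta)
    z = sum(w.values())  # normalize so the weights form a distribution
    return -sum(wt / z * math.log(probs[t]) for t, wt in w.items())

# Example: a sentence labeled athlete; the model spreads some mass to person.
probs = {"person": 0.3, "athlete": 0.6, "coach": 0.05, "artist": 0.05}
loss = hierarchy_aware_loss(probs, "athlete")
```

With this weighting, a sentence like S2 still contributes to the type person, but less than it contributes to athlete, which matches the intuition described above.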


We tested our network on two standard datasets in the literature: the FIGER dataset (Ling and Weld 2012) and a version of the OntoNotes dataset modified for the fine-grained entity typing task (Gillick et al. 2014).

Main results

In the table, Attentive, AFET, LNR+FIGER and AAA are previous work, and the numbers reported are from the respective papers.

As one can see, our NFETC method, which makes use of the hierarchy-aware loss function, outperforms the previous work across all metrics and on both datasets.


For more details, code and/or data, please check:

  • Code and data used in the paper
  • P. Xu and D. Barbosa. Neural fine-grained entity type classification with hierarchy-aware loss. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 16–25, New Orleans, Louisiana, June 2018. Association for Computational Linguistics. doi:10.18653/v1/N18-1002. [Bibtex]  [PDF]