THINKER - Entity Linking System for Turkish Language.

Authors :
Kalender, Murat
Korkmaz, Emin Erkan
Source :
IEEE Transactions on Knowledge & Data Engineering. Feb 2018, Vol. 30, Issue 2, p367-380. 14p.
Publication Year :
2018

Abstract

Entity linking is one of the problems to be handled in order to process natural language and to enrich existing unstructured text with metadata. The generation of assignments between knowledge base entities and lexical units is called entity linking. Although a number of systems have been proposed for linking entity mentions in various languages, there is currently no publicly available entity linking system specific to the Turkish language. This paper presents a novel entity linking system, THINKER, for linking Turkish content with entities defined in the Turkish dictionary (tdk.gov.tr) or Turkish Wikipedia (tr.wikipedia.org). Specifically, we first propose a novel machine-learning-based entity detection algorithm for the Turkish language. Then, we propose a collective disambiguation algorithm that utilizes a set of metrics for the linking task and is optimized using a genetic algorithm. The effectiveness of THINKER is validated empirically over generated data sets. The experimental results show that THINKER outperformed the state-of-the-art cross-lingual and multilingual entity linking systems in the literature. High entity linking performance (74.81 percent F1 score) is achieved by extending previous methods with some features specific to the Turkish language and by developing a novel method that can learn better representations of entity embeddings. [ABSTRACT FROM PUBLISHER]

Details

Language :
English
ISSN :
1041-4347
Volume :
30
Issue :
2
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
127252570
Full Text :
https://doi.org/10.1109/TKDE.2017.2761743