1. Introduction
The large volume of unstructured text that makes up web content increases the need to transform these texts into well-structured forms (Jia et al., 2020). To address this need, entity linking (EL) aims to map ambiguous mentions in documents to their referent entities in a given knowledge base. EL is also a preliminary task for many research areas such as relation extraction (Zhang et al., 2018), link prediction (Wei et al., 2017) and knowledge base completion (Shang et al., 2019). The main challenge of entity linking is to disambiguate the candidate entities of detected mentions, and ranking candidate entities by their relatedness to those mentions is the key step in this disambiguation. Recent state-of-the-art EL methods compute relatedness using global coherence, i.e., the relatedness among all candidate entities in the same document (Ratinov et al., 2011). However, the computation of global coherence usually depends on the well-defined link structure of Wikipedia. Although Wikipedia is a valuable source for EL research, knowledge-agnostic approaches are needed so that entities can be disambiguated after exchanging the underlying knowledge base for alternatives such as DBpedia (Mendes et al., 2011) and YAGO2 (Hoffart et al., 2011a), or for domain-oriented knowledge bases such as DBLP (Ley, 2002) and LinkedMDB (Hassanzadeh & Consens, 2009).
General approaches to the EL task target broad domains and tend to assume that an external knowledge base is readily available. However, many fine-grained, domain-oriented entities do not exist in such general-purpose knowledge bases (Chen et al., 2018), so these methods fail to link any entity in many situations. Moreover, in most cases it is not feasible to transform an existing system into a knowledge-agnostic form that can be adjusted to suit a particular domain.
In this study, we present an EL system that transfers the sequence-to-sequence learning approach to the EL problem in a domain-oriented way. The domain orientation aims to detect the domains or topics of the input texts and then filter the relevant candidate entities before disambiguation. Rather than using a conditional random field (CRF) (Lafferty et al., 2001) with bidirectional long short-term memory (Bi-LSTM) (Graves & Schmidhuber, 2005) models for EL (Inan & Dikenelli, 2018), we employ an attention mechanism (Bahdanau et al., 2014) that attends to different mentions in the input sequence as the model emits the corresponding entity in the output sequence, while injecting the detected domain information of these entities. We also train the Bi-LSTM model on top of generated semantic embeddings (Ristoski & Paulheim, 2016), which yields a knowledge-agnostic way to adapt the model to a particular domain: the input vectors are fed to the encoder and the corresponding entity vector representations serve as the output of the decoder. The contributions of our study can be summarized as follows:
- Our method introduces a novel algorithm that leverages semantic and document embeddings to provide a knowledge-base-independent EL method for both domain detection and global entity disambiguation.
- Our method introduces a three-layer vertical architecture that implements a sequence-to-sequence learning model in the manner of neural machine translation, in which the detected mention sequence of a document is translated into a sequence of referent entities using the domain-oriented knowledge base.
- Our method places a domain information layer before the neural models, producing a scalable architecture for the EL task over domain-oriented knowledge resources. Thus, the architecture can be extended by adding the necessary knowledge for a new domain.
- Finally, the attention mechanism leverages the domain information of mentions to increase the importance of individual mentions while assigning candidate entities.
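To make the disambiguation step concrete, the following is a minimal sketch of how Bahdanau-style additive attention can weight the encoded mentions of a document when the decoder scores candidate entities. It is not the paper's implementation: the weight matrices, dimensions, and embeddings below are random placeholders standing in for trained parameters and for the semantic/document embeddings described above.

```python
import numpy as np

def attention_weights(decoder_state, encoder_states, W_d, W_e, v):
    """Additive attention: score_i = v^T tanh(W_d s + W_e h_i), softmax-normalized."""
    scores = np.array([v @ np.tanh(W_d @ decoder_state + W_e @ h)
                       for h in encoder_states])
    exp = np.exp(scores - scores.max())  # stabilized softmax
    return exp / exp.sum()

def disambiguate(decoder_state, encoder_states, candidate_vecs, W_d, W_e, v):
    """Rank candidate entity embeddings against the attention-weighted context."""
    w = attention_weights(decoder_state, encoder_states, W_d, W_e, v)
    context = (w[:, None] * encoder_states).sum(axis=0)  # weighted mention summary
    sims = candidate_vecs @ context                      # dot-product relatedness
    return int(np.argmax(sims)), w

# Toy run with random stand-ins: 4 encoded mentions, 5 candidate entities.
rng = np.random.default_rng(0)
d_enc, d_dec = 8, 6
encoder_states = rng.standard_normal((4, d_enc))
decoder_state = rng.standard_normal(d_dec)
candidates = rng.standard_normal((5, d_enc))
W_d = rng.standard_normal((d_enc, d_dec))
W_e = rng.standard_normal((d_enc, d_enc))
v = rng.standard_normal(d_enc)

best, weights = disambiguate(decoder_state, encoder_states, candidates, W_d, W_e, v)
```

In the full model the domain information layer would already have filtered `candidates` down to the detected domain before this scoring step, which is what keeps the candidate set small and the architecture extensible to new domains.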