Improvement of Web Semantic and Transformer-Based Knowledge Graph Completion in Low-Dimensional Spaces

Xiai Yan, Yao Yi, Weiqi Shi, Hua Tian, Xin Su
Copyright: © 2024 | Pages: 18
DOI: 10.4018/IJSWIS.336919

Abstract

In recent years, knowledge graph completion (KGC) has garnered significant attention. However, noise in the graph poses numerous challenges to the completion task, including error propagation, missing information, and misleading relations. Many existing KGC methods utilize the multi-head self-attention mechanism (MHA) in transformers, which yields favorable results in low-dimensional space. Nevertheless, MHA introduces a large number of additional parameters and with them a risk of overfitting. Moreover, commonly used loss functions are not comprehensive enough to capture the semantic distinctions between entities and relations, and because RDF datasets contain only positive (training) examples, erroneous facts are never encoded, which tends to cause overgeneralization.

Introduction

The landmark development of the Semantic Web can be traced back to the idea put forward by Berners-Lee et al. (2001) that the original World Wide Web should be augmented with machine-understandable information. In this view, machine understandability is achieved by attaching expressive metadata to data; this metadata usually takes an ontological form, carries logical semantics, and is amenable to inference. The basic data model of the Semantic Web is the Resource Description Framework (RDF), a standardized format for representing and exchanging data that describes resources and the relationships and properties between them (d’Amato, 2020).

Knowledge graphs (KGs) are structured collections of facts represented as triples (head entity, relation, tail entity). They are widely used in applications such as intelligent question answering, comprehensive search, and recommender systems, all of which rely on substantial data support. Given the historical limitations of semantic networks, and of ontologies in particular, a knowledge graph can be viewed as a richer and more complex semantic network (e.g., RDF) that contains multiple types of entities and relations and provides richer semantic information (Breit et al., 2023). Despite containing millions of factual statements, KGs remain far from complete. For instance, around 71% of the individuals in the Freebase knowledge graph lack information about their place of birth, and 75% lack information about their nationality (Dong et al., 2014). This highlights the incompleteness of knowledge coverage within KGs, and the task of knowledge graph completion (KGC) was introduced to address it.
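
As a concrete illustration, a knowledge graph can be stored as a plain list of (head, relation, tail) triples. The sketch below uses hypothetical entity and relation names chosen only for the example; it is not drawn from any benchmark dataset.

```python
# A toy knowledge graph as (head, relation, tail) triples.
# Entity and relation names are illustrative, not taken from any benchmark.
triples = [
    ("Barack_Obama", "born_in", "Honolulu"),
    ("Honolulu", "located_in", "Hawaii"),
    ("Barack_Obama", "nationality", "USA"),
]

# Index entities and relations so they can later be mapped to embedding vectors.
entities = sorted({e for h, _, t in triples for e in (h, t)})
relations = sorted({r for _, r, _ in triples})
ent2id = {e: i for i, e in enumerate(entities)}
rel2id = {r: i for i, r in enumerate(relations)}

print(ent2id)
print(rel2id)
```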

Knowledge graph completion is the task of predicting missing connections, or edges, between nodes in graph-structured data. In graph theory and network analysis, a graph comprises a set of nodes and the edges that connect them. The objective of KGC is to infer the existence of unknown edges by leveraging the available partial graph information, as illustrated in the sketch following the next paragraph.

The concept of low-dimensional embedding involves mapping high-dimensional representations of entities and relations into a lower-dimensional vector space. This mapping enables the distances and relationships between entities and relations in the vector space to reflect their semantic similarities and associations within the knowledge graph. Numerous approaches exist for achieving knowledge graph completion through low-dimensional embedding. The challenge lies in maintaining a low embedding dimension while still achieving satisfactory model performance.
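
To make both ideas concrete, the sketch below scores triples with a TransE-style distance in a deliberately low-dimensional space and ranks candidate tails for a query (h, r, ?). This is a generic illustration of embedding-based link prediction, not the model proposed in this article; the embeddings are random placeholders rather than trained vectors.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 32                                    # deliberately low embedding dimension
num_entities, num_relations = 100, 10       # toy sizes; vectors are untrained
ent_emb = rng.normal(scale=0.1, size=(num_entities, dim))
rel_emb = rng.normal(scale=0.1, size=(num_relations, dim))

def score(h: int, r: int, t: int) -> float:
    """TransE-style plausibility: the smaller ||h + r - t||, the more plausible."""
    return -float(np.linalg.norm(ent_emb[h] + rel_emb[r] - ent_emb[t]))

def rank_tails(h: int, r: int) -> list[int]:
    """Link prediction for (h, r, ?): rank every entity as a candidate tail."""
    scores = [score(h, r, t) for t in range(num_entities)]
    return sorted(range(num_entities), key=lambda t: scores[t], reverse=True)

print(rank_tails(0, 3)[:5])   # top-5 predicted tail entities for the query
```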

In recent years, notable progress has been made in knowledge graph completion using enhanced generic transformer encoders (Baghershahi et al., 2022; Liu et al., 2022; X. Zhang et al., 2022). The attention mechanism, particularly self-attention, has played a crucial role in these advances. Self-attention effectively captures the dependencies among the linear projections within the model and maps them to the output, so that important information is learned and attended to automatically. Despite the impressive results obtained with transformer models, they often face challenges related to high-dimensional embeddings, complexity, and scalability. These issues arise from stacking multiple encoder layers and the resulting growth in the number of encoder blocks (Baghershahi et al., 2022).
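
For reference, the following is a minimal single-head scaled dot-product self-attention in NumPy. It makes explicit the n x n interaction matrix that standard MHA computes, which is the cost that lighter attention variants try to avoid. Shapes and weights are illustrative only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over a sequence X of shape (n, d)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # (n, n) pairwise token interactions
    weights = softmax(scores, axis=-1)        # attention distribution per token
    return weights @ V                        # each output is a weighted mix of values

rng = np.random.default_rng(0)
n, d = 4, 8                                   # e.g., tokens encoding one (head, relation) input
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # (4, 8)
```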

To address some of these problems, we introduce TFttOM, an improved transformer-based model. Because each encoder already contains multi-layer self-attention and feed-forward networks, we use a reduced transformer encoder block instead of stacking and mixing multiple encoders, which lowers the model's overall parameter count and computational complexity and thereby improves its computational efficiency. We also introduce a separable self-attention method with linear complexity to further reduce the number of free parameters. Finally, we optimize the binary cross-entropy loss function to improve the model's discriminative ability. The contributions of this paper can be summarized in three areas: (1) TFttOM is proposed to reduce the overall parameter count and improve computational efficiency by simplifying the encoding and decoding modules; (2) the introduction of a separable self-attention module with linear complexity reduces the time complexity of the model; (3) a new loss function is proposed to strengthen semantic discrimination between entities and relations while improving generalization ability.
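
As a rough illustration of how a separable self-attention can reach linear complexity, the sketch below replaces the n x n attention matrix with a single learned context vector, in the spirit of separable attention designs such as MobileViT-v2. The exact formulation used in TFttOM may differ, and all weights here are random placeholders.

```python
import numpy as np

def separable_self_attention(X, w_i, Wk, Wv, Wo):
    """Linear-complexity separable self-attention over X of shape (n, d).

    Instead of the (n, n) score matrix of standard MHA, a single context
    vector summarizes the sequence, so cost grows linearly with n.
    """
    scores = X @ w_i                              # (n, 1): one scalar score per token
    scores = np.exp(scores - scores.max())
    scores = scores / scores.sum()                # softmax over the n tokens
    context = (scores * (X @ Wk)).sum(axis=0)     # (d,): weighted summary of keys
    gated = np.maximum(X @ Wv, 0.0) * context     # broadcast the context onto the values
    return gated @ Wo                             # (n, d) output

rng = np.random.default_rng(1)
n, d = 4, 8
X = rng.normal(size=(n, d))
out = separable_self_attention(X,
                               rng.normal(size=(d, 1)),
                               rng.normal(size=(d, d)),
                               rng.normal(size=(d, d)),
                               rng.normal(size=(d, d)))
print(out.shape)  # (4, 8)
```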
