Is Prompt the Future?: A Survey of Evolution of Relation Extraction Approach Using Deep Learning and Big Data

Zhen Zhu, Liting Wang, Dongmei Gu, Hong Wu, Behrooz Janfada, Behrouz Minaei-Bidgoli
DOI: 10.4018/IJITSA.328681

Abstract

A vast amount of unstructured data is being generated in the age of big data. Relation extraction (RE), which turns such data into structured facts, is a critical way to improve its utility and has evolved considerably in recent years. This paper first introduces five paradigms of RE: the rule-based paradigm, the machine learning paradigm, the deep learning model-based paradigm, and the two currently mainstream paradigms built on pretrained language models. Based on the RE scenario, it then gives a comprehensive introduction to the currently popular prompt-learning paradigm, which is investigated from four aspects. The main contributions of this paper are as follows. Because large models are too large to be trained easily, prompt learning has become a promising research direction for RE; this work therefore provides a systematic introduction to the prompt-learning paradigm for RE and compares it with traditional paradigms. In addition, this paper summarizes the problems currently faced by RE tasks and proposes valuable research directions based on prompt learning.

Introduction

With the Internet's explosive growth in the age of big data, a vast amount of data is generated every second. These data appear in many forms, such as social networks, video websites, news, and advertisements, and are published on the network for specific purposes. Extracting effective information from such heterogeneous data (Niklaus et al., 2018) is of great value to human society, supporting applications such as domain knowledge graph construction, public opinion monitoring, and problem analysis and diagnosis (Wang et al., 2014). Converting this unstructured data into structured data and extracting key information automatically (Simoes et al., 2009) is the information extraction (IE) task discussed here. Information extraction comprises three main subtasks: word segmentation, named entity recognition (NER), and relation extraction (RE; Muslea et al., 1999). This paper focuses on relation extraction.

Relation extraction aims to extract entity relation facts in a specific form (e.g., as a triple). For example, from the sentence "OpenAI was founded in late 2015 and is headquartered in San Francisco, California," one can extract that OpenAI is located in San Francisco, California, forming the triple (OpenAI, located in, San Francisco, California). Relation extraction is a fundamental component of many downstream tasks, such as machine translation (MT; Bordes et al., 2013), question-answering systems (Bordes et al., 2014; Li et al., 2015), and search engines (Xiong et al., 2017; Schlichtkrull et al., 2018). Various methods have been proposed for entity relation extraction, including rule-based template approaches (Califf & Mooney, 1997), machine learning approaches (Kambhatla, 2004), deep learning model-based approaches (Zeng et al., 2014; Zhang et al., 2015), deep learning approaches that fine-tune pretrained models, and approaches built on large-scale pretrained models (Zeng et al., 2017), which have brought great changes in recent years.
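To make the triple form concrete, the following is a minimal sketch (in Python) of the structured output an RE system is expected to produce from the example sentence above. The field names and the relation label "headquartered_in" are illustrative choices for this example, not the schema of any particular method surveyed here.

```python
# Minimal illustration of the triple form targeted by relation extraction.
# The relation label below is a hypothetical choice for the example sentence.
from dataclasses import dataclass

@dataclass
class RelationTriple:
    head: str      # head entity
    relation: str  # relation type
    tail: str      # tail entity

sentence = ("OpenAI was founded in late 2015 and is headquartered "
            "in San Francisco, California.")

# The structured fact an RE system should produce from the sentence:
triple = RelationTriple(head="OpenAI",
                        relation="headquartered_in",
                        tail="San Francisco, California")
print(triple)
```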

Although the relation extraction task has evolved for many years and made significant progress, it is not yet fully solved and continues to call for research and breakthroughs. Existing problems include insufficient labeled data, the intractable noise introduced by distantly supervised approaches, and poor extraction performance when facing massive relation categories in open domains. These problems limit the practical application of relation extraction in industry. However, given the recent successful application of large models to several natural language tasks, this paper argues that a breakthrough in relation extraction is forthcoming within a few years. The strong generalization ability of large models, together with new paradigms such as prompt learning, can hopefully overcome limitations such as insufficient labeled training data and the very large number of open-domain relation classes, ultimately solving practical problems in production and encouraging the wide industrial use of relation extraction. Since prompt learning with large models has become a promising research direction for RE, our work provides a systematic introduction to this paradigm for RE and compares it with traditional paradigms. In addition, our paper summarizes the current problems faced by RE tasks and proposes valuable research directions based on prompt learning.
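As a rough illustration of the prompt-learning paradigm discussed in this paper, the sketch below recasts relation classification as a cloze task for a masked language model, using the Hugging Face transformers fill-mask pipeline. The prompt template, the label words (verbalizer), and the bert-base-uncased model are illustrative assumptions, not the specific design of any method surveyed here.

```python
# A minimal sketch of cloze-style prompt learning for relation classification,
# assuming a masked language model from the Hugging Face transformers library.
# Template, verbalizer, and model choice are illustrative assumptions.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

sentence = ("OpenAI was founded in late 2015 and is headquartered "
            "in San Francisco, California.")
head, tail = "OpenAI", "San Francisco"

# Wrap the sentence in a template so that relation classification becomes
# predicting the masked token; label words are kept to single-token words.
prompt = f"{sentence} {head} is [MASK] in {tail}."
verbalizer = {"located": "located_in", "born": "born_in", "founded": "founded_in"}

# Score only the verbalizer words and map the best one back to a relation type.
predictions = fill_mask(prompt, targets=list(verbalizer.keys()))
best = max(predictions, key=lambda p: p["score"])
print(verbalizer[best["token_str"]], best["score"])
```

The point of the sketch is that no task-specific classification head is trained: the pretrained model's masked-token prediction, steered by the template and verbalizer, does the relation classification directly.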
