Deep Learning in Chinese Text Information Extraction Model for Coastal Biodiversity

Xiujuan Wang, Xuerong Li
Copyright: © 2023 | Pages: 15
DOI: 10.4018/IJSWIS.331756

Abstract

In the coastal areas of China, scientists have collected nearly 500 species of coastal plants and seaweeds. The collected information covers the species description, morphological characteristics, habitat distribution, and resource value of these plants in China. To extract this Chinese text information effectively, this article establishes a Chinese text information extraction model based on deep learning (DL). The model uses long short-term memory (LSTM) artificial neural networks for short text classification. It also integrates the L-MFCNN model of MFCNN for short text classification. The two methods are compared with traditional text recognition algorithms and with information extraction based on syntax analysis and deep learning. The results show that the recognition accuracy of this neural network model for Chinese text information can reach 96.69%. Through model training and parameter adjustment, Chinese text information on coastal biodiversity can be extracted quickly, and species categories or names can be identified.

1. Introduction

Language is the most important tool for human communication. Text, as one of the carriers of language, is, together with images and videos, among the most important forms of data storage. At present, climate change, natural disasters, and other factors are accelerating species extinction, and biodiversity conservation and sustainable use have increasingly become the focus of biodiversity research (Muluneh, 2021). Biological research matters all the more in the face of global problems such as environmental degradation and endangered species. Extracting information from Chinese texts about biodiversity and its environment is therefore especially meaningful (Anne, 2012). Coastal species are an important topic within biodiversity, and their diversity has attracted many researchers (Litjens, 2017). Extracting textual information on coastal biodiversity is the starting point for coastal biological and ecological research. Because the species are so varied, it is very difficult for researchers to identify all of them quickly. Search engines likewise cannot return accurate species information from a description of a species (Wu Ying, 2019).

At present, most research on biodiversity in Chinese texts focuses on dictionary-based modeling and on shallow machine learning algorithms (Chun, 2018). Related work on text processing has developed in several steps. In 2015, the application of a genetic algorithm to information retrieval in the vector space model was proposed (Abuligah, 2015). In 2017, an unsupervised text feature selection technique based on a hybrid particle swarm and genetic algorithm was proposed (Abuligah, 2017). In 2018, hybrid clustering analysis based on an improved krill-herd algorithm was studied (Abuligah, 2018). In 2019, feature selection for text clustering with an improved krill-herd algorithm was proposed (Abuligah, 2019). Traditional multi-class text representation and classification methods built on bag-of-words features mainly extract low-level features of the text, and the resulting feature vectors are inherently high-dimensional and highly sparse. Such methods therefore struggle to reach the expected performance. In view of this, it is of great practical value to study multi-class text representation and classification methods that extract low-dimensional, dense, high-level features of texts. This paper presents its results in the following parts: the DL process, the text characteristics and Chinese text information extraction for coastal biodiversity, the construction of the Chinese text information extraction model, and the application of DL to species identification.
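The high dimensionality and sparseness of bag-of-words vectors can be seen even on a toy corpus. The following is a minimal sketch, not the article's method: the three short descriptions are invented for illustration, each character is treated as a token (a simplifying assumption, since a real pipeline would normally use a Chinese word segmenter), and the names `char_tokens`, `bow_vector`, and `sparsity` are hypothetical.

```python
from collections import Counter

# Hypothetical short descriptions, invented for illustration only.
docs = [
    "红树生长于海岸潮间带",
    "海带是大型褐藻",
    "芦苇分布于滨海湿地",
]

def char_tokens(text):
    """Treat each character as one token (simplifying assumption)."""
    return list(text)

# Vocabulary: one vector dimension per distinct character in the corpus.
vocab = sorted({ch for d in docs for ch in char_tokens(d)})
index = {ch: i for i, ch in enumerate(vocab)}

def bow_vector(text):
    """Return a count vector of length len(vocab) for one document."""
    vec = [0] * len(vocab)
    for ch, n in Counter(char_tokens(text)).items():
        vec[index[ch]] = n
    return vec

vectors = [bow_vector(d) for d in docs]

# Fraction of entries that are zero across all document vectors.
total = len(vectors) * len(vocab)
zeros = sum(v.count(0) for v in vectors)
sparsity = zeros / total
print(f"dimensions: {len(vocab)}, zero entries: {sparsity:.0%}")
```

Every new distinct token adds a dimension, while each short text fills only a few of them, so the zero fraction grows with the corpus. This is the property that motivates learning low-dimensional, dense representations instead.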

This paper first introduces the research background and significance of the work, sets out the research value of deep-learning-based Chinese sentiment analysis for marine organism recognition, and reviews the development and current state of sentiment analysis and deep learning research in China and abroad. In response to these issues, it compares three basic approaches, CNN, LSTM, and RM, identifies their advantages and disadvantages, and chooses CNN for the experiments. Deep learning is used to automate the extraction of Chinese text features, vector machines provide the classification step, and the proposed algorithm is used to implement a system for identifying and analyzing marine organisms. The article structure is shown in Figure 1.
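As a rough illustration of what automated feature extraction with a CNN can look like for short text, the sketch below applies a one-dimensional convolution with ReLU and max-over-time pooling to character embeddings. It is a generic textbook construction, not the architecture used in the article: the sizes `EMBED_DIM`, `KERNEL`, and `N_FILTERS`, the function names, the random untrained weights, and the sample text are all assumptions made for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable
EMBED_DIM, KERNEL, N_FILTERS = 4, 2, 3  # illustrative sizes, not from the article

def embed(text, table):
    """Look up (or lazily create) a random embedding for each character."""
    for ch in text:
        if ch not in table:
            table[ch] = [rng.uniform(-1, 1) for _ in range(EMBED_DIM)]
    return [table[ch] for ch in text]

# Each filter spans KERNEL consecutive characters; weights are random and untrained.
filters = [
    [[rng.uniform(-1, 1) for _ in range(EMBED_DIM)] for _ in range(KERNEL)]
    for _ in range(N_FILTERS)
]

def conv_features(text, table):
    """1D convolution + ReLU + max-over-time pooling: one value per filter.

    Assumes len(text) >= KERNEL so that at least one window exists.
    """
    x = embed(text, table)
    feats = []
    for f in filters:
        acts = []
        for start in range(len(x) - KERNEL + 1):
            s = sum(f[k][d] * x[start + k][d]
                    for k in range(KERNEL) for d in range(EMBED_DIM))
            acts.append(max(0.0, s))  # ReLU
        feats.append(max(acts))  # keep the strongest response over all positions
    return feats

table = {}
features = conv_features("海带是大型褐藻", table)
print(len(features), features)
```

The pooled vector has a fixed length regardless of the text length, which is what allows it to be passed to a downstream classifier. In a trained system the embeddings and filter weights would be learned from labeled data rather than drawn at random.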

Figure 1. Article structure
