Web Service Clustering Approach Based on Network and Fused Document-Based and Tag-Based Topics Similarity

Deng Li Ping, Guo Bing, Zheng Wen
Copyright: © 2021 |Pages: 19
DOI: 10.4018/IJWSR.2021070104

Abstract

Producing a web service clustering that satisfies many requirements at once is challenging. In this article, the authors propose a new approach built on two models that address the service clustering problem. First, a document-tag LDA model (DTag-LDA) is proposed that takes the tag information of web services into account, since tags can accurately describe the useful information in documents. Building on the first model, the article then proposes an efficient document-weight and tag-weight LDA model (DTw-LDA), which fuses multi-modal data networks. To further improve clustering accuracy, the model constructs one network describing the text and one describing the tags, then merges the two networks to generate the web service network that is clustered. In addition, the article designs service classification experiments to verify that the auxiliary information helps extract more accurate semantics. The proposed method shows clear advantages in precision, recall, purity, and other performance measures.

Introduction

A web service is a kind of application system that depends on the Internet. It provides various data computing and resource sharing services for Internet users. With the rapid development of Web 2.0, the mobile Internet, the Internet of Things, cloud computing, and other technologies, a large number of Internet applications based on SOA (Service-Oriented Architecture) have been created, and web services have gradually become the mainstream technology for realizing SOA (Shi et al., 2018). The number of web services on the Internet is growing rapidly. According to statistics, dozens of new web services, called APIs (Application Programming Interfaces), are published on Programmable Web, which is the largest and most active web service publishing and sharing platform. From June 2011 to March 2018, the number of services on the website increased from 3261 to more than 19000, a growth rate of close to 500% (Almarini et al., 2019). As a result, managing web service resources efficiently has become a critical challenge. Web service discovery is a significant and nontrivial task in the domain of web service computing.

At present, web service clustering has attracted wide attention as a method for solving the service discovery problem. Many studies show that web service clustering greatly improves the ability of web service search engines to retrieve related services. An important limitation of traditional web service clustering research is that researchers focused only on the WSDL (Web Services Description Language) document information of web services (such as service name, content, type, message, and port), and this single data source limits clustering accuracy. To reduce the shortcomings of traditional web service clustering methods, auxiliary features based on web service information can be exploited to improve service clustering, such as multiple fused information (Shi et al., 2019), description text (Elgazzar et al., 2010), tags (Lei et al., 2020; Si & Sun, 2009), and tag sharing information (Belém et al., 2017). In the past few years, tagging technologies have been widely used to enhance web service management. Tagging systems such as Programmable Web allow users to associate relevant tags with the APIs to be registered (an action called tagging). Tags (Si & Sun, 2009) not only help platforms like Programmable Web classify and index APIs, but also help users retrieve resources. As an effective means of resource management and retrieval, tags have become a popular research topic in recent years. For example, Chen L et al. (2013) proposed using tag information and WSDL document information to improve service clustering performance based on the LDA model (Latent Dirichlet Allocation, a document-topic generation model).
Although Chen's method clusters services better than traditional clustering methods, it considers only the semantic information of tags and ignores the network structure information of tags and documents, so it cannot comprehensively improve the effectiveness of service clustering. To reduce this negative effect, this paper first uses tag information to improve the effectiveness of description documents and mines latent topics and semantics through a topic model, mapping service content from the high-dimensional word vector space to a low-dimensional topic vector space and thereby reducing the dimensionality of service documents. Second, the web service network used for clustering is constructed from the topic distribution vectors, which avoids the problem of a very large service scale degrading the clustering result.

Therefore, this paper proposes a novel web service clustering approach based on a network and on fused document-based and tag-based topic similarity. It not only uses tag information and mines latent topics to enrich the semantics of short documents by exploiting probabilistic topic distributions, but also proposes two models. These models construct a reasonable service network that considers both the semantics and the structure of tag information, and they provide a feasible solution to the problems above.
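
The general idea of clustering services over a network built from fused document-based and tag-based topic similarity can be illustrated with a minimal Python sketch. It is not the authors' DTag-LDA or DTw-LDA model: the topic vectors below are made-up values standing in for the output of a topic model, and the cosine similarity, the linear fusion weight `alpha`, the edge `threshold`, and the use of connected components as clusters are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic distribution vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def fused_similarity(doc_topics, tag_topics, i, j, alpha=0.5):
    """Weighted mix of document-based and tag-based topic similarity.

    alpha is a hypothetical fusion weight, not a parameter from the paper.
    """
    doc_sim = cosine(doc_topics[i], doc_topics[j])
    tag_sim = cosine(tag_topics[i], tag_topics[j])
    return alpha * doc_sim + (1 - alpha) * tag_sim


def build_network(doc_topics, tag_topics, alpha=0.5, threshold=0.8):
    """Link two services when their fused similarity reaches the threshold."""
    n = len(doc_topics)
    edges = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if fused_similarity(doc_topics, tag_topics, i, j, alpha) >= threshold:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def connected_components(edges):
    """Treat each connected component of the network as one cluster."""
    seen = set()
    clusters = []
    for start in edges:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in edges[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Made-up two-topic distributions for four services (illustrative only).
doc_topics = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
tag_topics = [[0.85, 0.15], [0.9, 0.1], [0.05, 0.95], [0.1, 0.9]]

clusters = connected_components(build_network(doc_topics, tag_topics))
print(clusters)
```

In this toy input, services 0 and 1 share the first topic in both their document and tag vectors, and services 2 and 3 share the second, so only those pairs are linked. A real pipeline would obtain the vectors from a trained topic model and would likely use a dedicated network clustering algorithm instead of connected components.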
