Introduction
Web services are loosely coupled software components and a popular implementation of service-oriented architecture. They can be published, located, and invoked across the Web through messages encoded according to XML-based standards, including WSDL, UDDI, and SOAP (Curbera, Duftler et al., 2002). Through service composition, Web services have been extended to give users value-added customized services such as travel planners (Paik, Chen et al., 2014). Web service discovery is a key stage of service composition. It is a challenging and time-consuming task because the matchmaking process performs many unnecessary similarity calculations within repositories such as UDDI registries and Web portals.
Clustering Web services into similar groups is an efficient approach to improving discovery performance because it greatly reduces the search space. Clustering enables the user to identify appropriate and interesting services according to his or her requirements: candidate services outside the relevant cluster are excluded, so the search is limited to that cluster alone. Further, it enables efficient browsing for similar services within the same cluster. A clustering approach requires similarity metrics to compute the similarity of services. First, the method computes the similarity of features (SoF) of the services. Then, the similarity of services (SoS) is computed as an aggregate of the individual SoF values. Several methods have been used to compute the SoFs in current functionally based clustering approaches, including string-based cosine similarity (Platzer, Rosenberg et al., 2009), the corpus-based normalized Google distance (NGD) (Liu & Wong, 2009; Elgazzar, Hassan et al., 2010), knowledge-based ontology methods (Wagner, Ishikawa et al., 2011; Wen, Sheng et al., 2011; Xie, Chen et al., 2011), and hybrid term similarity (HTS) (Kumara, Paik et al., 2014). The similarity method adopted affects the service clustering performance. However, existing approaches compute the SoFs as global values without considering specific domains. Therefore, one main issue in current functionally based clustering approaches is that they fail to identify how feature similarity changes across domains. Semantic SoFs can change according to the domain. For example, the AmbulanceLocationInformation service has a strong semantic similarity with Medical domain services when considered within the Medical domain, and with Vehicle domain services when considered within the Vehicle domain. However, it is semantically remote from other domains such as Food or Film.
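The two-step computation described above (per-feature SoF values aggregated into one SoS value) can be illustrated with a minimal sketch. It uses string-based cosine similarity over the terms of each feature, one of the SoF methods cited above. The feature names (`name`, `input`, `output`), the weights, the second service, and the weighted sum used as the aggregate are all assumptions made for illustration; the text above says only that SoS is an aggregate of the SoF values.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a camel-case or space-separated feature value into lowercase terms."""
    # "AmbulanceLocationInformation" -> ["ambulance", "location", "information"]
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", text)]


def cosine_similarity(a, b):
    """String-based SoF: cosine of the term-frequency vectors of two feature values."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def similarity_of_services(s1, s2, weights):
    """SoS as a weighted sum of per-feature SoF values (the weighting is an assumption)."""
    return sum(w * cosine_similarity(s1[f], s2[f]) for f, w in weights.items())


# Hypothetical service descriptions; only the first service name appears in the text.
ambulance = {
    "name": "AmbulanceLocationInformation",
    "input": "HospitalName",
    "output": "AmbulanceLocation",
}
hospital = {
    "name": "HospitalLocationInformation",
    "input": "HospitalName",
    "output": "HospitalLocation",
}

# Hypothetical feature weights that sum to 1.
weights = {"name": 0.4, "input": 0.3, "output": 0.3}

if __name__ == "__main__":
    print(similarity_of_services(ambulance, hospital, weights))
```

Note that this sketch yields a single global SoS value for any pair of services, whatever domain the comparison is made in, which is exactly the limitation discussed above.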
Current clustering approaches fail to identify the semantic relationships between services that exist within a particular domain. As a result, some services may be placed in clusters where the user would not expect them. To capture this semantic relationship, we need to analyze domain knowledge. Although ontology-based clustering approaches do use domain knowledge through ontologies (Wen, Sheng et al., 2011), the ontologies involve a shared model for domains.