Introduction
With the continuous development of Web technology, the Internet is moving from the Web 2.0 era, characterized by the interconnection of people, to the Web 3.0 era of knowledge interconnection (Sheth, A., & Thirunarayan, K. 2013). Its goal is an Internet that both humans and machines can understand, making the network more intelligent. In this context, how to structure and efficiently manage the massive data on the Web so that it can provide users with higher-quality information services has become a hot issue in both academia and industry. In 2012, Google took the lead in launching its Knowledge Graph, using it as an auxiliary knowledge base to enhance its search function and build a next-generation intelligent search engine. Subsequently, various knowledge graphs have been released, such as the Wikipedia-based YAGO (Suchanek, F. M., et al. 2008; Hoffart, J., et al. 2013; Mahdisoltani, F., Biega, J., & Suchanek, F. M. 2015), DBpedia (Auer, S., et al. 2007), and Freebase (Bollacker, K. D., et al. 2008).
At the same time, commodity-related data on the Internet is growing rapidly, yet the demand of upper-level applications and users for accurate commodity information remains difficult to meet. The contradiction between the two has not been alleviated and is in fact worsening. The main reasons are twofold: on the one hand, most of the data carrying product information exists in an unstructured form, which severely limits its automated and intelligent application; on the other hand, this large-scale information lacks an efficient data management mechanism, so users directly face fragmented and highly redundant information, which further exacerbates the problem of information overload. Structuring the massive commodity-related data on the Web into knowledge and managing it in a unified and efficient way can not only effectively resolve this contradiction, but also provide users with more comprehensive and accurate information services (Kim, H. 2017; Bartalesi, V., & Meghini, C. 2017). How to efficiently retrieve large-scale commodity knowledge has therefore become an important issue. Many knowledge retrieval systems support large-scale knowledge graphs, such as SW-Store (Abadi, D. J., et al. 2009; Abadi, D. J., Marcus, A., Madden, S., et al. 2007), RDF-3X (Neumann, T., & Weikum, G. 2008; Neumann, T., & Weikum, G. 2010; Neumann, T., et al. 2010), Hexastore (Weiss, C., Karras, P., & Bernstein, A. 2008), and gStore (Zou, L., et al. 2014; Zou, L., Mo, J., et al. 2011; Zeng, L., & Zou, L. 2018). When storing data, these systems convert each URI (Uniform Resource Identifier) string in the knowledge graph into an ID value through a mapping dictionary, thereby reducing the cost of data storage and query.
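The URI-to-ID mapping dictionary mentioned above can be sketched as a simple bidirectional structure: URIs are interned to compact integer IDs at load time, and IDs are translated back to URIs when results are returned. The following is an illustrative sketch only, not the actual implementation of gStore or any of the systems cited; all names are assumptions.

```python
class MappingDictionary:
    """Illustrative two-way URI <-> ID dictionary (assumed design)."""

    def __init__(self):
        self._uri_to_id = {}   # URI string -> integer ID
        self._id_to_uri = []   # integer ID -> URI string

    def encode(self, uri):
        """Return the ID for a URI, assigning a new one on first sight."""
        if uri not in self._uri_to_id:
            self._uri_to_id[uri] = len(self._id_to_uri)
            self._id_to_uri.append(uri)
        return self._uri_to_id[uri]

    def decode(self, id_):
        """Translate an ID back to its URI string."""
        return self._id_to_uri[id_]


# A knowledge-graph triple is stored as three small integers instead of
# three long URI strings (hypothetical example URIs):
d = MappingDictionary()
triple = ("<http://example.org/product/42>",
          "<http://example.org/hasBrand>",
          "<http://example.org/brand/Acme>")
encoded = tuple(d.encode(t) for t in triple)   # compact integer form
decoded = tuple(d.decode(i) for i in encoded)  # round-trips to the URIs
```

Replacing long URI strings with fixed-width integers shrinks storage and makes join and comparison operations during query processing cheaper, which is the motivation stated above.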
Knowledge retrieval systems based on graph models, such as gStore, can exploit the graph structure of the knowledge graph when processing knowledge retrieval, achieving high query efficiency.
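To make the graph-based retrieval idea concrete, here is a toy sketch in the spirit of such systems: a SPARQL-like triple pattern with variables is matched directly against an ID-encoded graph, rather than translated into relational joins. This is a minimal illustration under assumed data; it does not reflect gStore's actual index structures.

```python
# ID-encoded triples (subject, predicate, object); the comments give the
# hypothetical URIs these IDs stand for.
triples = {
    (0, 10, 1),  # product42 hasBrand    Acme
    (0, 11, 2),  # product42 hasCategory Phone
    (3, 10, 1),  # product7  hasBrand    Acme
}

def match(pattern, graph):
    """Yield triples matching a pattern; None marks a variable position."""
    for t in graph:
        if all(p is None or p == v for p, v in zip(pattern, t)):
            yield t

# "Which subjects have brand 1 (Acme)?"  -> pattern (?s, hasBrand, Acme)
subjects = sorted(s for s, _, _ in match((None, 10, 1), triples))
# → [0, 3]
```

Because every comparison is over small integers produced by the mapping dictionary, pattern matching over the graph stays cheap; the cost that remains, as discussed next, is the text-to-ID translation itself.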
When processing large-scale queries, however, a large amount of text must be converted into corresponding ID values, which leads to frequent accesses to the mapping dictionary; at this scale, the time cost of the mapping dictionary can no longer be ignored. In addition, retrieval systems based on graph models cannot make full use of the structural features of a product knowledge graph when processing product knowledge queries, resulting in low performance when querying product viewpoint knowledge and failing to meet the performance requirements of product knowledge retrieval. This paper focuses on the above problems, and its main contributions are as follows: