Introduction
Business-to-business (B2B) systems have so far been based on a variety of centralized or client-server models. In recent years, decentralized Peer-to-Peer (P2P) architectures have evolved to provide the infrastructure and non-functional characteristics required for implementing much more demanding and complex tasks. Traditional B2B techniques, including modern standards such as the ebXML framework, address only the needs of relatively large businesses, even when they claim to enable B2B infrastructure for Small and Medium Enterprises (SMEs). Larger companies have more long-term collaborative relationships, which also provide more stable collaboration patterns. SMEs, on the other hand, often tend to do business in a more ad hoc manner and constantly look for the best trading opportunities in order to survive and remain competitive. Interactions between SMEs are highly dynamic and are P2P by nature. SMEs seek partnerships irrespectively. As such, a P2P architectural framework naturally satisfies the needs of SMEs.
Several considerations motivate the use of P2P architectures to support business collaborations. First, P2P architectures can reduce the cost of maintaining a centralized server and the relevant business data, reduce the risk of a centralized server becoming a single point of failure, and diminish the risk that shutting down the centralized server interrupts unfinished business transactions. Second, P2P architectures provide scalable environments and are able to deal with transient users.
The P2P computing paradigm is viewed as a novel approach for people to share resources such as files and computing cycles, or to support collaborative tasks. During the past few years, the Internet has been gradually shifting toward a distributed system that supports more than client-server applications alone. P2P systems are distributed systems in which nodes of equal roles and capabilities exchange information and services directly with each other, which has contributed to their popularity. The design of P2P systems, including efficient techniques for search, query routing, and data retrieval, allows users to share huge volumes of data. However, the major problem in such networks is query routing, i.e., deciding to which other (Super-) Peers a query has to be sent for high efficiency and effectiveness. Traditional P2P systems offer support for richer queries: beyond search by identifier, they provide options such as keyword search with regular expressions. Search techniques for these systems must therefore operate under a different set of constraints than those techniques developed for persistent storage utilities.
However, the technique of broadcasting all queries to all Peers suffers from limited efficiency and scalability. In hybrid P2P systems (Ioannidis et al., 2008; Annapureddy et al., 2007) composed of (Super-) Peers, a Peer that submits a query becomes the source of this query. The query is then transmitted to its Super-Peer (SP). The routing policy uses semantic mappings between the schemas of (Super-) Peers to quickly determine the relevant neighboring SPs to which the query is to be sent. A query received by an SP is processed over its local collection of data sources belonging to different Peers. Once results are found, the SP sends a single response message back to the query source. The time the user must wait for the results to arrive is an important factor, and it is affected by the mediation process, which remains difficult to realize in such a context when the number of (Super-) Peers increases. Several factors affect response times, such as the time it takes for the query to travel through several SPs in the network and the time an SP spends looking for connections (i.e., mappings) in order to route the query. For these reasons, response times tend to be slow in hybrid P2P networks. Satisfaction time is simply the time that elapses between the submission of the query by the user and the receipt of the overall results.
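The routing behavior described above can be illustrated with a minimal sketch. It is not the method proposed in this paper: the `SuperPeer` class, the `route_query` function, and the topic-keyed lookup tables that stand in for semantic schema mappings are all hypothetical names and simplifications chosen for illustration. The sketch only shows that a query is forwarded to the neighboring SPs for which a mapping exists, rather than being broadcast to every SP.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SuperPeer:
    """Hypothetical super-peer; a topic string stands in for a schema element."""
    name: str
    # topic -> records held by the Peers attached to this SP
    local_data: Dict[str, List[str]] = field(default_factory=dict)
    # topic -> names of neighboring SPs that have a mapping for that topic
    mappings: Dict[str, List[str]] = field(default_factory=dict)


def route_query(network: Dict[str, SuperPeer], start: str,
                topic: str) -> Tuple[List[str], List[str]]:
    """Forward a query only along existing mappings.

    Returns the collected results and the SPs that were contacted, in order.
    """
    results: List[str] = []
    visited: List[str] = []
    seen = {start}
    frontier = deque([start])
    while frontier:
        sp = network[frontier.popleft()]
        visited.append(sp.name)
        # Process the query over the SP's local collection of data sources.
        results.extend(sp.local_data.get(topic, []))
        # Forward only to neighbors that have a mapping for this topic.
        for neighbor in sp.mappings.get(topic, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append(neighbor)
    return results, visited


# Assumed toy network: SP3 has no mapping for "steel" and is never contacted.
network = {
    "SP1": SuperPeer("SP1", {"steel": ["offer-1"]}, {"steel": ["SP2"]}),
    "SP2": SuperPeer("SP2", {"steel": ["offer-2"]}, {"steel": ["SP1"]}),
    "SP3": SuperPeer("SP3", {"timber": ["offer-3"]}, {}),
}
results, visited = route_query(network, "SP1", "steel")
```

In this toy network the query reaches only SP1 and SP2. The number of SPs contacted, and the mapping lookups performed at each one, are the quantities that drive the response time discussed above.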
Recently, data mining has gained in popularity due to the emergence of vast quantities of data. In this paper, a practical issue concerning data mining in P2P networks is discussed. The motivations behind P2P data mining include the optimal usage of available computational resources, privacy, and dependability through the elimination of critical points of service.
In this paper, the effect of data mining on P2P query routing is presented. The proposed method focuses on how a query is routed to the relevant Peers with minimum query processing at the SP level, using data mining techniques to improve the answering time of queries. An important advantage of the suggested approach is scalability.