Introduction
The Deep Web is a treasury of products and services. This treasury is hidden behind Web forms that give access to data stored in remote databases. The Deep Web is very rich in terms of both quantity (more than 90% of the Web) and quality of service.
Users usually search for new and complementary services at the best price and quality of service. However, they are often disappointed because there is no way to compare the products of two competing deep Web services. This drawback is due to two factors: 1) deep Web services offer competing commercial products, and 2) these services offer query interfaces with different query capabilities. Hence, queries submitted to different servers have different meanings. To find a good product, users must carry the burden of information retrieval themselves. This is time consuming and requires considerable cognitive effort. For novice users, this burden is a barrier between them and the deep Web.
We try to facilitate information retrieval from the deep Web for novice users. Our goal is to build a universal Web form that allows novice users to query many Web services at the same time. With this form, a user formulates a single query and obtains results from all merged Web services at once. The form must be user friendly and easy for novice users to understand (Figure 1).
Figure 1. Architecture of our deep Web Information retrieval system
2. Architecture Of Proposed System
We have subdivided the main problem (Information Retrieval from the deep Web) into three sub-problems (Schema Extraction, Schema Integration, and Schema Visualization), and for each sub-problem we give some key elements of a solution.
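To make the three sub-problems concrete, the following is a minimal sketch of how they could fit together, not the system's actual implementation. The function names, the two example forms (`shop_a`, `shop_b`), and the `SYNONYMS` table are all hypothetical and chosen only for illustration; real schema integration would need far more robust matching than a fixed synonym lookup.

```python
from html.parser import HTMLParser


class FormFieldExtractor(HTMLParser):
    """Collects the name attribute of every <input> and <select> element."""

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag in ("input", "select"):
            name = dict(attrs).get("name")
            if name:
                self.fields.append(name)


def extract_schema(form_html):
    """Schema Extraction: list the field names exposed by one query form."""
    parser = FormFieldExtractor()
    parser.feed(form_html)
    return parser.fields


# Hypothetical synonym table: local field name -> unified attribute name.
SYNONYMS = {
    "writer": "author",
    "author": "author",
    "book_title": "title",
    "title": "title",
    "isbn": "isbn",
}


def integrate_schemas(schemas):
    """Schema Integration: map each unified attribute to {service: local field}."""
    unified = {}
    for service, fields in schemas.items():
        for field in fields:
            attribute = SYNONYMS.get(field.lower(), field.lower())
            unified.setdefault(attribute, {})[service] = field
    return unified


def render_unified_form(unified):
    """Schema Visualization: one plain-text input line per unified attribute."""
    return "\n".join(f"{attribute}: [__________]" for attribute in sorted(unified))


def translate_query(unified, query):
    """Rewrite one unified query into one query per service.

    A service simply does not receive attributes that its form lacks.
    """
    per_service = {}
    for attribute, value in query.items():
        for service, field in unified.get(attribute, {}).items():
            per_service.setdefault(service, {})[field] = value
    return per_service


# Two hypothetical book-search forms with different field names.
FORMS = {
    "shop_a": '<form><input name="writer"><input name="book_title"></form>',
    "shop_b": '<form><input name="author"><select name="isbn"></select></form>',
}

schemas = {service: extract_schema(html) for service, html in FORMS.items()}
unified = integrate_schemas(schemas)
queries = translate_query(unified, {"author": "Knuth", "title": "TAOCP"})
```

Here a single query on the unified attributes `author` and `title` is rewritten into a `writer`/`book_title` query for `shop_a` and an `author`-only query for `shop_b`, which illustrates why queries sent to interfaces with different capabilities do not carry exactly the same meaning.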
Deep Web services are clustered into several domains of interest, for example Airfare, Automobiles, Books, etc. This is a preprocessing step performed by a Web crawler, which classifies each Web service into a category based on its query form. Since query forms are public interfaces, they are easily reached from the surface Web through search engines (such as Google and Yahoo). As we focus on the deep Web itself, this step is beyond the scope of our work.
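Although this preprocessing step is outside the scope of the system, a rough sketch may help picture what classifying a service by its query form means. The keyword sets, the overlap-based scoring, and the service names below are assumptions made for illustration only; they are not the crawler's actual method.

```python
# Hypothetical keyword sets per domain; a real crawler would learn or curate these.
DOMAIN_KEYWORDS = {
    "Airfare": {"departure", "arrival", "passengers", "airline"},
    "Automobiles": {"make", "model", "mileage", "year"},
    "Books": {"author", "title", "isbn", "publisher"},
}


def classify_form(field_labels):
    """Return the domain whose keywords overlap most with the form's labels.

    Returns None when no domain keyword matches at all.
    """
    labels = {label.lower() for label in field_labels}
    scores = {
        domain: len(labels & keywords)
        for domain, keywords in DOMAIN_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def cluster_services(forms):
    """Group service names by the domain assigned to their query form."""
    clusters = {}
    for service, labels in forms.items():
        clusters.setdefault(classify_form(labels), []).append(service)
    return clusters


# Hypothetical services, each described by the labels of its query form.
clusters = cluster_services({
    "fly_cheap": ["Departure", "Arrival", "Passengers"],
    "read_more": ["Author", "Title", "Price"],
    "car_finder": ["Make", "Model", "Price"],
})
```

Services whose forms match no keyword fall into a `None` cluster, which a real pipeline would presumably route to manual review or a richer classifier.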