Capturing Semantics of Web Page using Weighted TAG-Tree for Information Retrieval

R. Vishnu Priya, A. Vadivel

Source Title: International Journal of Asian Business and Information Management (IJABIM) 3(4)

DOI: 10.4018/jabim.2012100102

Abstract

Web pages are highly dynamic, and it is difficult to retrieve the relevant web pages within the top 10 search results. Which pages appear there depends on the ranking mechanism incorporated in the retrieval system. The retrieval system is designed to rank the web pages relevant to a user query. Usually, the retrieval system considers many techniques for ranking, such as link-based, connectivity-based and keyword-based techniques. The authors rank the web pages using the keywords and their associated TAGs. Based on the importance of each TAG, weights are assigned and the semantics of the page are captured. In addition, the semantic information is represented in a compact tree form, which supports both incremental and interactive mining with refined retrieval. From the experimental results, the authors have observed that the performance of the proposed approach is encouraging compared to a recently proposed approach.

1. Introduction

In the current scenario, the web is considered a major information source in everyday life for everybody. Nearly one million web pages are added every day, and several hundred gigabytes are changed every month. Because of this rapid growth in web data and web users, it is tedious to find relevant or interesting information in the top 10 retrieved results. As a result, the web has drawn the attention of many researchers to extracting knowledge from it, which could also be the base process that supports Web Searching (WS), Information Retrieval (IR) and Web Mining (WM).

IR deals with searching for relevant web pages in heterogeneous data, such as text, semi-structured databases, unstructured databases and multimedia. The number of web pages retrieved is higher than the number of relevant web pages. Hence, retrieving relevant web pages is becoming an essential issue. In order to retrieve the relevant pages, information retrieval systems calculate a numeric score for each web page based on how relevant it is to the user query. The web pages are ranked by these scores and displayed to the user. This web page ranking mechanism is used in most of the well-known search engine systems.
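The score-then-rank process described above can be illustrated with a minimal sketch. The scoring function here (normalized query-term frequency) is an assumption chosen for simplicity, not the measure used by the authors or by any particular search engine, and the names `score`, `rank_pages` and the toy `pages` dictionary are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple


def score(page_text: str, query: str) -> float:
    """Assumed relevance score: query-term occurrences divided by page length."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    terms = re.findall(r"[a-z0-9]+", query.lower())
    return sum(counts[t] for t in terms) / len(words)


def rank_pages(pages: Dict[str, str], query: str) -> List[Tuple[str, float]]:
    """Score every page against the query and sort by descending score."""
    scored = [(url, score(text, query)) for url, text in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy collection (hypothetical data, for illustration only).
pages = {
    "page1": "web mining extracts knowledge from web data",
    "page2": "cooking recipes for busy people",
    "page3": "information retrieval ranks web pages",
}
results = rank_pages(pages, "web retrieval")
for url, value in results:
    print(url, round(value, 3))
```

A real retrieval system would combine such a keyword score with other signals, such as the link-based and connectivity-based techniques mentioned in the abstract.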

The majority of users rely on the Google, MSN and Yahoo search engines to retrieve relevant information. Currently, one of the most popular search engines is Google, which indexes more than 3 billion web pages, a number that increases at a rate of 7.3 million pages per day (Forsati et al., 2009). Google uses a well-known algorithm for ranking pages called PageRank.

The PageRank algorithm (Page et al., 1998) uses a link-based concept, in which a query-independent fixed score is assigned to each element of a hyperlinked set of web pages to measure the relative importance of each web page within the set. The algorithm uses the web graph, where nodes are World Wide Web pages and edges are hyperlinks. Both rank and hyperlinks are considered for ranking: the rank value indicates the importance of a page, and hyperlinks are counted as votes of support. The rank of each page is defined as the weighted sum of the ranks of all pages that link to it. In addition, a damping factor (d) is introduced to remove the effect of sink pages. The underlying model is a user who surfs the web at random by clicking links on the current page; the surfing continues, and the user jumps to a random page on reaching a page with no outgoing links. Accordingly, a user on a web page follows one of its outgoing links, chosen at random, with probability d, or jumps to another web page with probability 1-d. In this way, the rank of a page is calculated.
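The random-surfer model described above can be sketched as a simple power iteration. This is an illustrative sketch under stated assumptions, not Google's production algorithm: the damping value d = 0.85, the uniform redistribution of rank from sink pages, the convergence tolerance, and the toy graph `toy_web` are all assumptions introduced here.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], d: float = 0.85,
             tol: float = 1e-10, max_iter: int = 200) -> Dict[str, float]:
    """Power-iteration PageRank on a dict mapping page -> outgoing links."""
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by sink pages (no outgoing links) is spread uniformly,
        # modelling the surfer's jump to a random page.
        sink_mass = sum(rank[p] for p in pages if not links.get(p))
        # With probability 1-d the surfer jumps to a random page.
        new_rank = {p: (1.0 - d) / n + d * sink_mass / n for p in pages}
        # With probability d the surfer follows a random outgoing link.
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new_rank[q] += d * rank[p] / len(outs)
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Toy web graph (hypothetical): D links out but nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
for page, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(value, 4))
```

In this toy graph, page C collects votes from three pages and ends up with the highest rank, while page D, which has no incoming links, keeps only the baseline share (1-d)/n that every page receives from random jumps.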

A page has a high rank if it has many back links or if the pages linking to it have high ranks (Bidoke & Yazdani, 2008). If no pages link to a web page, it receives no rank from other pages. Now that the logic of Google's ranking mechanism is known, some organizations, instead of developing their business, have focused on increasing the page rank of their pages on the web. The aim is to have their pages displayed in the top-10 results, as users usually browse only the first or second page of the search results. The well-known tactics to increase page rank are publishing articles on article directories, submitting a website to web directories, exchanging links with other websites, commenting on other people's blogs, posting on question and answer sites like Yahoo! Answers, using Twitter, Facebook and other networking sites, using bookmarking sites, participating in forums, providing an RSS feed on the website, using link building services and tools, and buying links (rank.html). In addition, the score of a page increases with the number of times the page is visited or clicked, and the total count of clicks can be inflated with simple source code. As a result, commercial web pages and social networking web pages such as Facebook, Google+ and orkut are displayed as the top results in current search engines, which do not provide relevant information for the user. It has also been found that the page rank concept is vulnerable to manipulation (PageRank).
