An Intelligent Web Caching System for Improving the Performance of a Web-Based Information Retrieval System

Sathiyamoorthi V., Suresh P., Jayapandian N., Kanmani P., Deva Priya M., Sengathir Janakiraman
Copyright: © 2020 | Pages: 19
DOI: 10.4018/IJSWIS.2020100102

Abstract

With an increasing number of web users, the data traffic they generate creates tremendous network load, leading to long delays in connecting with the web server. The main reason is the distance between the clients making requests and the servers responding to those requests. Using a content delivery network (CDN) is one strategy for minimizing latency, but it incurs additional cost. Alternatively, web caching and preloading are the most viable approaches to this issue. This work therefore introduces a novel web caching strategy, called the optimized popularity-aware modified least frequently used (PMLFU) policy, for information retrieval based on users' past access history and an analysis of their trends. It enhances the proxy-driven web caching system by analyzing user access requests and caching the most popular web pages according to user preferences. Experimental results show that the proposed system can significantly reduce the delay users experience in accessing web pages. The performance of the proposed system is measured using real IRCACHE data sets.
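To make the general idea concrete, the following is a minimal illustrative sketch of a popularity-aware, LFU-style replacement policy in Python. It is not the authors' PMLFU algorithm: the class name, the frequency-with-recency popularity score, and the eviction rule are assumptions introduced only to show how access counts can drive cache replacement.

```python
import time


class PopularityAwareLFUCache:
    """Illustrative LFU-style cache that weights access frequency by recency.

    NOTE: this is a sketch of the general technique, not the paper's PMLFU
    policy; the popularity score below is a simplifying assumption.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}  # url -> (page, frequency, last_access_time)

    def _score(self, frequency, last_access):
        # Assumed popularity score: raw frequency decayed by time since last access.
        age = time.time() - last_access
        return frequency / (1.0 + age)

    def get(self, url):
        if url not in self.store:
            return None  # cache miss
        page, freq, _ = self.store[url]
        self.store[url] = (page, freq + 1, time.time())
        return page  # cache hit

    def put(self, url, page):
        if url in self.store:
            _, freq, _ = self.store[url]
            self.store[url] = (page, freq + 1, time.time())
            return
        if len(self.store) >= self.capacity:
            # Evict the entry with the lowest popularity score.
            victim = min(
                self.store,
                key=lambda u: self._score(self.store[u][1], self.store[u][2]),
            )
            del self.store[victim]
        self.store[url] = (page, 1, time.time())
```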
Article Preview

Introduction

It has long been recognized that good interactive response time is crucial for any kind of online information provider, as it improves user satisfaction, retention, and productivity. The same holds for e-commerce websites. A widely cited analysis (Rich, 2015) offers evidence for an eight-second rule in electronic trading: if a website takes longer than eight seconds to load, the consumer is far more likely to get irritated and leave the portal. While performance continues to improve over time as a consequence of better bandwidth and lower device latencies, customers also expect expanded capabilities, so internet services continue to place greater demands on bandwidth as they grow. There is also a great deal of competition, and content providers have a strong business incentive to maintain responsive digital platforms in order to retain their users. Consequently, improving website response times has been studied by several researchers in the past. Some aim to enhance efficiency through bandwidth improvements and reduced response times via wider pipes or alternative networking technologies; others seek to make more effective use of existing resources (Waleed et al., 2011). However, the main reasons for delayed responses are the distance between client and server and the time required to establish a connection between them. The general procedure for fetching information from the origin server is given below.

Usually, the client first resolves the URL (such as www.xyz.com) into an IP address (for example, 201.192.111.22) by querying the DNS. Once the client's request has been received and validated by the origin server, the server generates and transmits the response. Every response from the server carries headers and, typically, the requested content. Each of these activities takes a certain amount of time, such as the time to communicate with the DNS to resolve the hostname and the time to connect to the origin server to fetch and transfer the information. There are several strategies to minimize these delays on the Web.
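The short Python sketch below separates and times the two steps described above, DNS resolution and retrieval from the origin server, using the standard library. The hostname and URL in the usage comment are placeholders rather than values from the article.

```python
import socket
import time
import urllib.request


def timed_fetch(url, hostname):
    """Time DNS resolution and the fetch from the origin server separately."""
    # Step 1: resolve the hostname to an IP address via DNS.
    t0 = time.time()
    ip_address = socket.gethostbyname(hostname)
    dns_time = time.time() - t0

    # Step 2: connect to the origin server and transfer the response
    # (headers plus body).
    t1 = time.time()
    with urllib.request.urlopen(url, timeout=10) as response:
        body = response.read()
        headers = dict(response.headers)
    fetch_time = time.time() - t1

    return ip_address, dns_time, fetch_time, len(body), headers


# Example usage (the hostname and URL are placeholders):
# print(timed_fetch("http://www.example.com/", "www.example.com"))
```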

One such solution is to add additional servers to an existing network and distribute traffic among them. However, connecting additional servers to the network and splitting up traffic is a costly solution. Another technique is to deploy a content delivery network (CDN) at different locations to minimize latency, but it also incurs additional cost. The most cost-effective approach is to use intermediate caching for a faster and better response (Teng et al., 2005; Song et al., 2017; Najme & Mohammad, 2017). It keeps web pages closer to the clients, thereby minimizing delay and saving web page retrieval time (Clarke & Wade, 2019). Depending on how the network is partitioned, the cache can be situated at various positions in the client-server communication model, as shown in Figure 1; a minimal sketch of the proxy-side case follows the figure.

Figure 1. Possible cache locations
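As a minimal sketch of proxy-side caching, assuming a simple in-memory store with no eviction or freshness checks, the following Python fragment answers repeated requests locally and contacts the origin server only on a miss. The class name and URL are illustrative, not part of the article.

```python
import urllib.request


class SimpleProxyCache:
    """Minimal sketch of a proxy-side cache: serve repeated requests locally
    instead of contacting the origin server each time. Eviction and
    freshness checks are deliberately omitted."""

    def __init__(self):
        self.pages = {}  # url -> cached response body

    def fetch(self, url):
        if url in self.pages:
            return self.pages[url], "HIT"  # answered from the proxy cache
        # Cache miss: retrieve the page from the origin server and keep a copy.
        with urllib.request.urlopen(url, timeout=10) as response:
            body = response.read()
        self.pages[url] = body
        return body, "MISS"


# Example usage (the URL is a placeholder):
# proxy = SimpleProxyCache()
# proxy.fetch("http://www.example.com/")  # MISS: goes to the origin server
# proxy.fetch("http://www.example.com/")  # HIT: served from the cache
```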
