State of the Art in Distributed Privacy-Preserving Protocols in Private Web Search

Mohib Ullah, Arbab Waseem Abbas, Lala Rukh, Kamran Ullah, Muhammad Inam Ul Haq
Copyright: © 2023 |Pages: 25
DOI: 10.4018/978-1-6684-6914-9.ch001

Abstract

A web search engine (WSE) is an indispensable software system used by people worldwide to retrieve data from the web using keywords called queries. A WSE stores search queries to build the user's profile and provide personalized results. Search queries often hold identifiable information that could compromise the user's privacy, so preserving privacy in web search is a primary concern for users from various backgrounds. Many techniques have been proposed over time to preserve a person's web search privacy. Some obfuscate the user's profile by sending fictitious queries along with the original ones. Others hide the user's identity and preserve privacy through unlinkability. Distributed techniques provide both: in distributed protocols, a group of users collaborate to forward each other's queries to the WSE, providing unlinkability and obfuscation. This work presents a survey of distributed privacy-preserving protocols and details their benefits, limitations, and evaluation parameters.

Introduction

The World Wide Web (WWW) is a vast network of information and resources where users can access and search for a wide range of content, including text, images, videos, and audio (Khan & Ali, 2013; Khan, Ullah, Khan, Uddin, & Al-Yahya, 2021). Web search engines, such as Google and Bing, play a crucial role in helping users find information by processing large amounts of data and presenting results relevant to their search queries. Search engines have become indispensable tools for Internet users because they allow easy and fast access to information on a global scale. Research has shown that people are becoming increasingly satisfied with the performance of search engines but, at the same time, increasingly concerned about their privacy (Ullah, 2020b). The use of personalization algorithms by search engines to tailor search results and advertisements to the user’s interests is seen as a strength by some and a weakness by others.

On the one hand, personalized results can provide a more relevant and enjoyable experience for the user. On the other hand, personalization raises privacy concerns because search engines collect and store large amounts of data about the user’s activities, interests, and behaviours, which could be used for various purposes, including targeted advertising. Web search engines build a user profile from factors such as interests, preferences, and previous searches. This profile can improve the accuracy of search results, but it also reveals sensitive information about the user: it can contain their unique ID, name, employer’s details, and location, as well as potentially sensitive information such as health status, political views, and religion (Cooper, 2008). As a result, users are often forced to trade accuracy for privacy, which can lead to less relevant search results (Dan & Davison, 2016). Search engines must balance the need for personalized results against the protection of user privacy.

A survey conducted in 2012 showed that many users were concerned about the privacy implications of web search engines recording their data and search queries (Ullah, 2020b). The query log, which records users’ search activities, is a valuable resource for search engines because it helps them provide more relevant results (Kaaniche, Masmoudi, Znina, Laurent, & Demir, 2020). However, storing query logs also poses a significant privacy threat, as this data can be disclosed to advertising agencies and the media, potentially revealing sensitive information about users. The release of the AOL log in 2006, in which twenty million queries generated by 658,000 users over three months were published, is a well-known example of a privacy breach (Barbaro, Zeller, & Hansell, 2006; Wang, Liu, & Wang, 2020). This incident highlights the need for search engines to implement strong privacy policies and measures to protect user data.

In some cases, web search engines may be required by court order to disclose individual queries as part of legal proceedings, such as divorce or civil lawsuits. Such disclosures have raised further concerns about the privacy of users’ search data and the security of their personal information. A 2014 incident, in which the second-largest insurance company in the U.S. lost 80 million health records, highlights the need for better security measures to protect sensitive information (Mathews-Hunt, 2016; Yang, Onik, Lee, Ahmed, & Kim, 2019). These incidents have sparked a movement for privacy preservation among online community members and have called WSEs’ privacy policies into question.

So far, many techniques have been proposed to protect users’ privacy during web searches. These techniques can be grouped into five major classes: distributed techniques, stand-alone methods, query scrambling, third-party infrastructure, and hybrid techniques. This chapter gives an overview of the most prominent privacy-aware web search schemes and techniques. The classification of these privacy-preserving schemes and solutions is shown in Figure 1.
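
As a rough illustration of the distributed class, the following Python sketch shows the basic idea of group members submitting each other's queries so that the WSE cannot link a query to its author. It is a toy model under stated assumptions, not any specific protocol from the literature: the function name `assign_forwarders`, the ring-based assignment, and the sample users and queries are all hypothetical, and the sketch omits the cryptographic steps that real protocols use to hide queries from peers.

```python
import random


def assign_forwarders(queries, rng=None):
    """Map each user's query to a different group member who submits it.

    `queries` is a dict {author: query}. The result is a dict
    {forwarder: query}, i.e. what the search engine would observe.
    Hypothetical helper for illustration only.
    """
    if len(queries) < 2:
        raise ValueError("need at least two users to hide the author")
    rng = rng or random.Random()
    users = list(queries)
    rng.shuffle(users)  # random ring order, not known to the search engine
    # Each user hands their query to the next user in the ring,
    # so nobody ends up submitting their own query.
    return {
        users[(i + 1) % len(users)]: queries[users[i]]
        for i in range(len(users))
    }


# Assumed sample data: three users with one query each.
group = {
    "alice": "flu symptoms",
    "bob": "cheap flights",
    "carol": "tax forms",
}
submitted = assign_forwarders(group, random.Random(0))
```

In this model every query still reaches the WSE exactly once, but under another member's identity, so the profile the WSE builds for each user is mixed with queries from the rest of the group.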

Figure 1. Taxonomy of private web search schemes
