A Multi-Objective Approach to Big Data View Materialization

Akshay Kumar, T. V. Vijay Kumar

Source Title: International Journal of Knowledge and Systems Science (IJKSS) 12(2)

DOI: 10.4018/IJKSS.2021040102

Abstract

Big data comprises voluminous and heterogeneous data that has a limited level of trustworthiness. This data is used to generate valuable information that can support decision making. However, decision-making queries on Big data take a long time to process, which results in high response times. For effective and efficient decision making, this response time needs to be reduced. View materialization has been used successfully to reduce query response time in the context of a data warehouse. Selecting such views for Big data is a complex problem and is the focus of this paper. The Big data view selection problem is formulated as a bi-objective optimization problem, the two objectives being the minimization of the query evaluation cost and the minimization of the update processing cost. Accordingly, a Big data view selection algorithm that uses the vector evaluated genetic algorithm to select Big data views for a given query workload is proposed. The proposed algorithm aims to generate views that reduce the response time of decision-making queries.

1. Introduction

Big data analysis is an essential element of the Business Intelligence (BI) process, which generates information that enables an organization to take appropriate and timely decisions. Big data is collected from various sources, such as transactional data, e-commerce data, social media, scientific explorations and IoT devices. A Big data application must efficiently process large amounts of data, which are collected and integrated from large numbers of unverifiable sources. These sources, which may be heterogeneous (structured, semi-structured and unstructured), generate data at a brisk pace, and the data has varying levels of integrity (Gupta et al., 2012; Jacobs, 2009; Kumar & Vijay Kumar, 2015; Zikopoulos, 2011). The raw Big data is cleaned, collated and analyzed to make it more reliable and valuable, so that it can be used for effective and efficient business decision making. However, this high-value visual information should be valid and should not be vulnerable to heterogeneity, considering the low veracity of Big data (Firican, 2017; Gandomi & Haider, 2015; Khan et al., 2014).

A Big data application must efficiently process the large and heterogeneous data generated from various sources. Big data view materialization is a technique that can reduce the processing time of Big data queries, even as updates to Big data continue in real time. View materialization, a widely studied problem for various types of database systems, is concerned with identifying sets of views which, when materialized, optimize the query processing time and the resource utilization, even as the data continues to receive updates. This problem has been shown to be NP-hard (Harinarayan et al., 1996). View materialization was first studied in the context of RDBMS and data warehouses (Chirkova et al., 2001; Gupta, 1996; Harinarayan et al., 1996; Mami & Bellahsene, 2012; Roussopoulos, 1998). Empirical (Agrawal et al., 2000), heuristic (Gupta, 1996; Harinarayan et al., 1996) and meta-heuristic (Goswami et al., 2017; Arun & Vijay Kumar, 2015a, 2015b, 2017a, 2017b; Vijay Kumar & Arun, 2016, 2017; Vijay Kumar & Kumar, 2014, 2015; Kumar & Vijay Kumar, 2018) techniques have been used to address this problem.

One of the key characteristics of materializing views over a data warehouse was the ease of data representation due to the structured nature of the data. However, a large portion of Big data is semi-structured or unstructured. Big data is voluminous, has a high rate of growth and is low on authenticity. In addition, Big data is not directly processed by structured tools such as RDBMS and data warehouses. Rather, a large number of frameworks and tools are used to store and process it. These include the Hadoop distributed file system (HDFS), the MapReduce framework, Apache Hadoop and the Apache Spark framework (Dean & Ghemawat, 2012; Dezyre, 2015; Hadoop, 2008; Hadoop, 2012; Manyika et al., 2011), along with many other tools such as NoSQL databases, Hive, BigTable and Neo4j. As Big data is voluminous, it is stored and processed using distributed systems. Thus, the Big data view materialization problem needs to be addressed for distributed file systems (DFS).

This paper defines the Big data view materialization problem as a bi-objective optimization problem. The objectives are to minimize the query evaluation cost of the workload queries and to minimize the update processing cost of the materialized views, subject to a constraint on the total size of the materialized views. These two objectives generally conflict with each other: materializing more views tends to lower the query evaluation cost but raises the update processing cost, and vice versa. This paper uses the vector evaluated genetic algorithm (VEGA), a multi-objective evolutionary algorithm proposed by Schaffer (1985), to select Big data views that can reduce the response times of the workload queries. Accordingly, a VEGA-based Big data view selection algorithm that selects Big data views for a given query workload is proposed herein.
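
To make the bi-objective formulation concrete, the following is a minimal sketch of a VEGA-style selection loop, not the algorithm proposed in the paper. Everything specific in it is an assumption made for illustration: the candidate views and their numbers in `VIEWS`, the additive cost model (`query_cost`, `update_cost`), the size bound `MAX_SIZE`, the random `repair` step for infeasible selections, and the use of binary tournament selection in place of the proportional selection used in Schaffer's original VEGA. A selection is encoded as a bit list, where bit `i` set to 1 means view `i` is materialized.

```python
import random

# Hypothetical candidate views: (size, query cost saving, update cost).
# The numbers are illustrative only and do not come from the paper.
VIEWS = [(4, 30, 8), (3, 22, 5), (5, 35, 12), (2, 10, 2), (6, 40, 15), (1, 6, 1)]
BASE_QUERY_COST = 200  # assumed workload cost when nothing is materialized
MAX_SIZE = 10          # assumed bound on the total size of materialized views


def total_size(sel):
    return sum(v[0] for v, bit in zip(VIEWS, sel) if bit)


def query_cost(sel):
    # Objective 1 (minimize): workload cost after the savings of chosen views.
    return BASE_QUERY_COST - sum(v[1] for v, bit in zip(VIEWS, sel) if bit)


def update_cost(sel):
    # Objective 2 (minimize): cost of keeping the chosen views up to date.
    return sum(v[2] for v, bit in zip(VIEWS, sel) if bit)


def repair(sel, rng):
    """Drop randomly chosen views until the size constraint holds."""
    sel = list(sel)
    while total_size(sel) > MAX_SIZE:
        chosen = [i for i, bit in enumerate(sel) if bit]
        sel[rng.choice(chosen)] = 0
    return sel


def tournament(pop, objective, rng):
    """Binary tournament on a single objective (lower is better)."""
    a, b = rng.sample(pop, 2)
    return a if objective(a) <= objective(b) else b


def vega(pop_size=20, generations=40, p_mut=0.1, seed=0):
    """Run the sketch; pop_size is assumed to be even."""
    rng = random.Random(seed)
    n = len(VIEWS)
    objectives = [query_cost, update_cost]
    pop = [repair([rng.randint(0, 1) for _ in range(n)], rng)
           for _ in range(pop_size)]
    for _ in range(generations):
        # VEGA step: one sub-population per objective, each selected
        # using that objective alone, then shuffled into one mating pool.
        mating = []
        for obj in objectives:
            mating += [tournament(pop, obj, rng)
                       for _ in range(pop_size // len(objectives))]
        rng.shuffle(mating)
        # Single-point crossover followed by bit-flip mutation and repair.
        nxt = []
        for i in range(0, len(mating) - 1, 2):
            p1, p2 = mating[i], mating[i + 1]
            cut = rng.randint(1, n - 1)
            for child in (p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]):
                child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
                nxt.append(repair(child, rng))
        pop = nxt
    return pop


def dominates(a, b):
    """True if a is no worse than b on both objectives and better on one."""
    qa, ua, qb, ub = query_cost(a), update_cost(a), query_cost(b), update_cost(b)
    return qa <= qb and ua <= ub and (qa < qb or ua < ub)


def non_dominated(pop):
    """Distinct selections in pop that no other selection dominates."""
    pts = {tuple(s) for s in pop}
    return [s for s in pts if not any(dominates(o, s) for o in pts if o != s)]
```

Calling `non_dominated(vega())` returns the trade-off selections found in the final population; each can be inspected with `query_cost`, `update_cost` and `total_size`. Repairing infeasible selections rather than penalizing them is only one of several ways to handle the size constraint and is chosen here for brevity.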

This paper is organized as follows: Section 2 gives a brief account of the view materialization problem for different data types and database management systems. View materialization in the context of Big data is discussed in Section 3. Section 4 discusses the formulation of the Big data view materialization problem as a bi-objective Big data view selection problem. The VEGA-based Big data view selection algorithm is given in Section 5, followed in Section 6 by an example illustrating its use to select Big data views. Experimental results are discussed in Section 7. Section 8 is the conclusion.
