A Novel Approach Using Non-Synonymous Materialized Queries for Data Warehousing

Sonali Ashish Chakraborty
Copyright: © 2021 | Pages: 22
DOI: 10.4018/IJDWM.2021070102

Abstract

Data from multiple sources are loaded into the organization's data warehouse for analysis. Because some OLAP queries are fired on the warehouse data very frequently, their execution time is reduced by storing the queries and their results in a relational database, referred to as the materialized query database (MQDB). If the tables, fields, functions, and criteria of an input query and a stored query are the same, but the criteria values specified in the WHERE or HAVING clause do not match, the two queries are considered non-synonymous. In the present research, the results of non-synonymous queries are generated by reusing the existing stored results after applying UNION or MINUS operations to them, which reduces the execution time of non-synonymous queries. When the criteria values of the input query are a superset of the stored values, the UNION operation is applied; when they are a subset, the MINUS operation is applied. Incremental result processing of the existing stored results, if required, is performed using data marts.
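
The superset and subset cases described in the abstract can be illustrated with a minimal sketch. This is not the article's algorithm, which the preview does not give; it only assumes that a stored result holds one aggregated row per criteria value, so that rows can be added or removed with set operations. The names `answer_from_stored`, `fetch_from_warehouse`, `WAREHOUSE`, and the sample figures are hypothetical.

```python
from typing import Callable, FrozenSet, List, Set, Tuple

Row = Tuple[str, int]  # (criteria value, aggregate), e.g. (region, total sales)

# Hypothetical warehouse aggregates, standing in for an expensive OLAP query.
WAREHOUSE = {"East": 120, "West": 95, "North": 60, "South": 40}
warehouse_calls: List[FrozenSet[str]] = []  # records which values hit the warehouse


def fetch_from_warehouse(values: FrozenSet[str]) -> Set[Row]:
    """Simulate running the query on the warehouse for the given criteria values."""
    warehouse_calls.append(values)
    return {(v, WAREHOUSE[v]) for v in values}


def answer_from_stored(
    stored_values: FrozenSet[str],
    stored_rows: Set[Row],
    input_values: FrozenSet[str],
    fetch: Callable[[FrozenSet[str]], Set[Row]],
) -> Set[Row]:
    """Answer an input query that differs from a stored one only in criteria values."""
    if input_values == stored_values:
        # Same criteria values: the stored result can be returned as is.
        return set(stored_rows)
    if input_values > stored_values:
        # Superset: fetch only the missing values and UNION them with the stored rows.
        return stored_rows | fetch(input_values - stored_values)
    if input_values < stored_values:
        # Subset: MINUS the stored rows whose criteria value was not requested.
        extra = stored_values - input_values
        return stored_rows - {row for row in stored_rows if row[0] in extra}
    # Overlapping or disjoint values: this sketch falls back to the warehouse.
    return fetch(input_values)


# Stored query: criteria values East and West.
stored_values = frozenset({"East", "West"})
stored_rows = {("East", 120), ("West", 95)}

superset_result = answer_from_stored(
    stored_values, stored_rows, frozenset({"East", "West", "North"}), fetch_from_warehouse
)
subset_result = answer_from_stored(
    stored_values, stored_rows, frozenset({"East"}), fetch_from_warehouse
)
```

In this sketch the superset query touches the warehouse only for the one missing value, and the subset query does not touch it at all, which is the source of the time saving the abstract refers to.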

1. Introduction

Enterprise warehouse data is collected from multiple sources and loaded through the Extraction, Transformation and Loading (ETL) process (Thareja, 2009). The data is used for analytical processing by the decision support system through OLAP queries (Gupta, 2014). The results of OLAP queries are generated after traversing an enormous number of warehouse records. Data marts (Thareja, 2009) are subsets of the data warehouse that store data according to the needs of the users. Data marts therefore contain less data than a data warehouse (Rusu et al., 2004; Rusu et al., 2005; Tjioe and Taniar, 2005). Some OLAP queries are fired frequently by the organization, and generating their results requires traversing all the warehouse records repeatedly, which increases the processing time and the cost of the system.

Over the past few decades, researchers have suggested various techniques to reduce and optimize (Bara, 2008) the cost of retrieving results from a data warehouse. One option is to improve query performance. The performance of a query can be improved either by tuning it so that it consumes less time at runtime (Karthik et al., 2012) or by using an appropriate indexing technology (Neil and Quass, 1997) to speed up queries in data warehouse environments. Multidimensional data cubes (Gupta, 2014; Han et al., 2011) and materialized views (Gupta et al., 1993; Gupta and Mumick, 1995) can be used to store results from a data warehouse. In data cubes, the aggregates over multiple dimensions are pre-computed, so storing the enormous number of aggregates requires huge storage space. There is a trade-off between materializing the cubes and the cost of materializing them (Harinarayan et al., 1996). Deshpande et al. (1996), Agrawal et al. (1997), Datta and Thomas (1999), Shanmugasundaram et al. (1999), and Chun et al. (2001) proposed various techniques for reducing the materialization cost of data cubes.
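
The storage concern can be made concrete with a short sketch. Assuming a cube with n dimensions and no concept hierarchies, each subset of the dimensions defines one group-by (cuboid), so full materialization involves 2^n cuboids. The function name `enumerate_cuboids` and the dimension names below are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import List, Tuple


def enumerate_cuboids(dimensions: List[str]) -> List[Tuple[str, ...]]:
    """List every group-by (cuboid) of a cube, from the apex () to the base cuboid."""
    cuboids: List[Tuple[str, ...]] = []
    for k in range(len(dimensions) + 1):
        # All ways of grouping by exactly k of the dimensions.
        cuboids.extend(combinations(dimensions, k))
    return cuboids


# Hypothetical three-dimensional sales cube.
dims = ["product", "region", "quarter"]
cuboids = enumerate_cuboids(dims)
cuboid_count = len(cuboids)  # 2 ** 3 group-bys for three dimensions
```

Each additional dimension doubles the number of cuboids, which is why fully materializing a cube becomes expensive quickly and why partial materialization is studied.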

Materialized views store a static snapshot of view results (Rob, 2006) along with their definition (Gupta et al., 1993; Gupta and Mumick, 1995; Chaudhuri and Dayal, 1997; Zhou et al., 2007). The major limitations of materialized views are that they are not supported by all database management systems and that they require explicit invocation for execution. Users are required to know the tables and fields used in all the stored materialized views (Zhou et al., 2007). Maintaining materialized views also incurs additional processing cost: when view results are updated eagerly, all the materialized views are refreshed whether they are invoked or not. Various view maintenance techniques are discussed by Gupta and Mumick (1995), Zhuge et al. (1995), Quass (1996), Gupta et al. (1996), and Mumick et al. (1997). Goldstein (2001) presented a fast and scalable algorithm to determine whether part or all of a query can be computed from a materialized view. However, the existing techniques and methods can be applied only if the same input query or materialized view fired by the user already exists in the database. To generate results for input queries or materialized views that do not already exist, or for queries with only minor changes in the criteria, the data warehouse is invoked.
