1. Introduction
An enterprise data warehouse collects data from multiple sources and is loaded through the Extraction, Transformation and Loading (ETL) process (Thareja, 2009). Decision support systems use this data for analytical processing through OLAP queries (Gupta, 2014). The results of an OLAP query are generated only after traversing an enormous number of warehouse records. Data marts (Thareja, 2009) are subsets of the data warehouse that store data according to the needs of their users. Data marts therefore contain less data than a data warehouse (Rusu et al., 2004; Rusu et al., 2005; Tjioe and Taniar, 2005). Some OLAP queries are fired frequently by the organization, and generating their results requires traversing all the warehouse records repeatedly, which increases the processing time and the cost of the system.
Over the past few decades, researchers have suggested various techniques to reduce and optimize (Bara, 2008) the cost of retrieving results from a data warehouse. One option is to improve query performance. The performance of a query can be improved either by tuning it so that it consumes less time at runtime (Karthik et al., 2012) or by using an appropriate indexing technology (Neil and Quass, 1997) to speed up queries in data warehouse environments. Multidimensional data cubes (Gupta, 2014; Han et al., 2011) and materialized views (Gupta et al., 1993; Gupta and Mumick, 1995) can be used to store results derived from a data warehouse. In a data cube, the aggregates over multiple dimensions are pre-computed, so storing the enormous number of aggregates requires a large amount of storage space. There is a trade-off between materializing the cube and the cost of materializing it (Harinarayan et al., 1996). Deshpande et al. (1996), Agrawal et al. (1997), Datta and Thomas (1999), Shanmugasundaram et al. (1999) and Chun et al. (2001) proposed various techniques for reducing the materialization cost of data cubes.
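The storage cost of pre-computing aggregates can be illustrated with a minimal sketch. It assumes a tiny, hypothetical fact table with three dimensions (`region`, `product`, `year`) and a `sales` measure, and it fully materializes the cube by computing SUM(sales) for every subset of the dimensions. The data, the names and the function `build_cube` are illustrative assumptions and are not taken from any of the cited works.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: (region, product, year, sales)
ROWS = [
    ("East", "Pen", 2023, 10),
    ("East", "Ink", 2023, 5),
    ("West", "Pen", 2024, 7),
    ("West", "Pen", 2023, 3),
]
DIMENSIONS = ("region", "product", "year")


def build_cube(rows, dimensions):
    """Pre-compute SUM(sales) for every subset of the dimensions.

    With n dimensions there are 2**n group-bys (cuboids), which is
    why a fully materialized cube needs so much storage.
    """
    cube = {}
    positions = range(len(dimensions))
    for size in range(len(dimensions) + 1):
        for subset in combinations(positions, size):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[i] for i in subset)
                totals[key] += row[-1]  # last column is the measure
            cube[tuple(dimensions[i] for i in subset)] = dict(totals)
    return cube


cube = build_cube(ROWS, DIMENSIONS)
stored_cells = sum(len(cuboid) for cuboid in cube.values())

print(len(cube))                  # number of cuboids: 2**3
print(stored_cells)               # aggregate cells stored for 4 fact rows
print(cube[("region",)])          # sales totals per region
```

Even in this toy case, four fact rows expand into eight cuboids holding more aggregate cells than there are source rows, which hints at why selecting only some cuboids for materialization is attractive.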
Materialized views store a static snapshot of the view results (Rob, 2006) together with the view definition (Gupta et al., 1993; Gupta and Mumick, 1995; Chaudhuri and Dayal, 1997; Zhou et al., 2007). Their major limitations are that they are not supported by all database management systems and that they require explicit invocation. Users must also know the tables and fields used in all the stored materialized views (Zhou et al., 2007). Maintaining materialized views incurs additional processing cost: when view results are updated eagerly, all the materialized views are refreshed whether or not they are invoked. Various view maintenance techniques are discussed by Gupta and Mumick (1995), Zhuge et al. (1995), Quass (1996), Gupta et al. (1996) and Mumick et al. (1997). Goldstein (2001) presented a fast and scalable algorithm to determine whether part or all of a query can be computed from a materialized view. However, the existing techniques and methods can be applied only if the same input query or materialized view fired by the user already exists in the database. For input queries or materialized views that do not already exist, or for queries with minor changes in their criteria, the data warehouse is invoked to generate the results.
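The exact-match limitation described above can be sketched in a few lines. The example below is a simplified illustration under stated assumptions and does not reproduce any cited technique: the class `ViewStore`, its methods and the in-memory `WAREHOUSE` list are hypothetical, and a stored view is identified only by its `(region, year)` criteria.

```python
# Hypothetical warehouse records
WAREHOUSE = [
    {"region": "East", "year": 2023, "sales": 10},
    {"region": "East", "year": 2024, "sales": 5},
    {"region": "West", "year": 2023, "sales": 7},
]


class ViewStore:
    """Stores snapshots of query results, keyed by the exact criteria."""

    def __init__(self, warehouse):
        self.warehouse = warehouse
        self.views = {}           # (region, year) -> stored snapshot
        self.warehouse_scans = 0  # counts full traversals of the warehouse

    def _scan(self, region, year):
        # Traverse every warehouse record to compute the result.
        self.warehouse_scans += 1
        return sum(
            r["sales"]
            for r in self.warehouse
            if r["region"] == region and r["year"] == year
        )

    def materialize(self, region, year):
        # Store a static snapshot of the result for these criteria.
        self.views[(region, year)] = self._scan(region, year)

    def query(self, region, year):
        # Only an identical, pre-existing view can answer the query.
        key = (region, year)
        if key in self.views:
            return self.views[key], "view"
        return self._scan(region, year), "warehouse"


store = ViewStore(WAREHOUSE)
store.materialize("East", 2023)

hit = store.query("East", 2023)    # identical criteria: served by the view
miss = store.query("East", 2024)   # minor change: warehouse is invoked
print(hit, miss, store.warehouse_scans)
```

Changing only the year in the criteria is enough to bypass the stored snapshot and trigger another full traversal of the warehouse records, which is the behaviour the paragraph above identifies as a gap in existing methods.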