Data Discovery Over Time Series From Star Schemas Based on Association, Correlation, and Causality

Wallace Anacleto Pinheiro, Geraldo Xexéo, Jano Moreira de Souza, Ana Bárbara Sapienza Pinheiro
Copyright: © 2020 | Pages: 17
DOI: 10.4018/IJDWM.2020100106

Abstract

This work proposes a methodology for discovering relevant time series relations in repositories modeled with star schemas, such as data marts. It applies a set of measures related to association, correlation, and causality to create connections among data. In this context, the research proposes a new causality function, based on peaks and values, that coherently relates time series. To evaluate the approach, the authors run a set of experiments on time series about American Tegumentary Leishmaniasis, a neglected disease that affects several Brazilian cities, and on time series about the climate of some cities in Brazil. The authors populate data marts with these data, and the proposed methodology generates a set of relations linking notifications of this disease to variations in temperature and pluviometry.
Article Preview

Some authors (Bimonte, Sautot, Journaux, & Faivre, 2017) propose strategies to ease the process of designing and building a data warehouse (Chandra & Gupta, 2018; Kimball & Ross, 2013; Romero & Abelló, 2009). Others suggest ways to efficiently keep track of the whole history of objects in a data warehouse (Atay & Garani, 2019; Golfarelli & Rizzi, 2009). However, linking data from different thematic contexts within these repositories is still a challenge.

Data marts, which compose data warehouses, are non-volatile data repositories oriented by subject or theme (Kimball, 2012; Kimball & Ross, 2013) and shaped according to star schemas or snowflake schemas. These schemas were created to facilitate the manipulation and visualization of large volumes of data. Star schemas follow a denormalized data approach, whereas snowflake schemas follow some rules of normalization (Garani & Helmer, 2012).
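The difference between the two schema styles can be illustrated with a minimal sketch. All table names, keys, and values below are hypothetical and invented for illustration; they are not taken from the article's data marts. The star version stores the whole city hierarchy in one denormalized dimension table, while the snowflake version splits the same hierarchy into linked, normalized tables.

```python
# Hypothetical toy data: names and numbers are invented for illustration only.

# Star schema: a single denormalized dimension table holds the full hierarchy.
star_dim_city = {
    1: {"city": "City A", "state": "State X", "region": "Region North"},
    2: {"city": "City B", "state": "State X", "region": "Region North"},
}

# Snowflake schema: the same hierarchy normalized into linked tables.
snow_dim_region = {10: {"region": "Region North"}}
snow_dim_state = {100: {"state": "State X", "region_id": 10}}
snow_dim_city = {
    1: {"city": "City A", "state_id": 100},
    2: {"city": "City B", "state_id": 100},
}

# Fact table: each row references the city dimension by its key.
fact_notifications = [
    {"city_id": 1, "month": "2020-01", "cases": 3},
    {"city_id": 2, "month": "2020-01", "cases": 5},
    {"city_id": 1, "month": "2020-02", "cases": 4},
]


def region_star(city_id):
    """One lookup: the region is stored directly in the city dimension."""
    return star_dim_city[city_id]["region"]


def region_snowflake(city_id):
    """Several lookups: city -> state -> region across normalized tables."""
    state = snow_dim_state[snow_dim_city[city_id]["state_id"]]
    return snow_dim_region[state["region_id"]]["region"]


def cases_by_region(region_of):
    """Aggregate the fact measure by region using the given lookup function."""
    totals = {}
    for row in fact_notifications:
        region = region_of(row["city_id"])
        totals[region] = totals.get(region, 0) + row["cases"]
    return totals
```

Both lookups produce the same aggregation. The star form needs fewer joins per query at the cost of repeating the state and region values in every city row.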

This work applies the dimensional schema used in data marts to find relations (associations, correlations, and causalities) in the stored data. The authors did not find, in the areas of Business Intelligence, Data Analytics, or Data Mining, a strategy that uses the structure of star schemas to find and evaluate relations automatically and comprehensively in search of relevant connections, as proposed in this work.
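As a rough illustration of how a dimensional structure can support this kind of search, the sketch below aligns measures from two fact tables through shared dimension keys and computes a plain Pearson correlation. This is an assumption-laden toy, not the authors' methodology: it does not implement their association measures or their peak-based causality function, and every table name, key, and value is hypothetical and invented for the example.

```python
from math import sqrt

# Hypothetical fact tables from two data marts that share the city and month
# dimensions. All values are invented for illustration only.
fact_disease = [
    {"city_id": 1, "month_id": 1, "notifications": 1},
    {"city_id": 1, "month_id": 2, "notifications": 2},
    {"city_id": 1, "month_id": 3, "notifications": 3},
    {"city_id": 1, "month_id": 4, "notifications": 4},
]
fact_climate = [
    {"city_id": 1, "month_id": 1, "mean_temp_c": 20.0},
    {"city_id": 1, "month_id": 2, "mean_temp_c": 22.0},
    {"city_id": 1, "month_id": 3, "mean_temp_c": 24.0},
    {"city_id": 1, "month_id": 4, "mean_temp_c": 26.0},
    {"city_id": 1, "month_id": 5, "mean_temp_c": 27.0},  # no disease row
]


def series(fact, measure, city_id):
    """Extract one measure as a {month_id: value} time series for a city."""
    return {r["month_id"]: r[measure] for r in fact if r["city_id"] == city_id}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (dev_x * dev_y)


def correlate(fact_a, measure_a, fact_b, measure_b, city_id):
    """Correlate two measures over the months both fact tables cover."""
    a = series(fact_a, measure_a, city_id)
    b = series(fact_b, measure_b, city_id)
    shared = sorted(a.keys() & b.keys())  # align on the shared time dimension
    return pearson([a[m] for m in shared], [b[m] for m in shared])


r = correlate(fact_disease, "notifications", fact_climate, "mean_temp_c", 1)
```

Because both fact tables reference the same dimension keys, the alignment step is a key intersection and needs no custom matching logic. A correlation found this way does not by itself indicate causality.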
