Understanding of the Exploratory Graph Theoretical Approach for Data Analysis With Supervised and Unsupervised Learning

Kiran Hemanthraj Muloor, Somesh Kumar Sahu, Rajshree Dahal
DOI: 10.4018/978-1-6684-4580-8.ch016


Information is vital to optimizing the effectiveness, profitability, and decision-making abilities of organizations of all sizes, which in turn leads to increased sales, profits, and benefits. Organizations currently deal with immense datasets, but owning a lot of data does not improve the business unless the organization investigates the available data and uses it to drive informed decisions. Exploratory data analysis can be automated to save considerable time and effort, since code no longer has to be written for each visualization and statistical analysis. The automated process generates a report that includes all of the visualizations and analyses.
Chapter Preview


The exploratory data analysis (EDA) process was developed in the 1970s by John Wilder Tukey. It is a method used before analyzing the data and building a model, in order to gain a complete understanding of the data. Simple quantitative and graphical techniques can be used to identify and explore the gaps in the data. According to Tukey, too much emphasis in statistics is placed on statistical hypothesis testing (confirmatory data analysis); instead, the data should be used to suggest hypotheses for further testing. Using EDA, the characteristics, behaviors, and features of the data are identified, reviewed, modified, and corrected before the data are put into any statistical model or analysis. EDA techniques have been incorporated into data mining (Tukey, 1977b).

John W. Tukey coined the term “Exploratory Data Analysis” to describe how simple graphics and quantitative methods can be used to explore data openly.

Typical graphical techniques include

  • Data visualization (e.g., stem-and-leaf diagrams, histograms, scatter plots)

  • Statistical plotting (such as mean, box and residual plots)

  • The positioning of multiple plots in order to enhance cognition
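As an illustration of the first technique in the list above, a stem-and-leaf display can be built with a few lines of Python. This is only a minimal sketch under stated assumptions: the function names `stem_and_leaf` and `render` and the sample values are hypothetical, and the code assumes non-negative integers with the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by stem (value // 10) and leaf (value % 10)."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        groups[stem].append(leaf)
    return dict(groups)


def render(groups):
    """Format the grouped values as one text line per stem."""
    lines = []
    for stem, leaves in sorted(groups.items()):
        leaf_text = " ".join(str(leaf) for leaf in leaves)
        lines.append(f"{stem:>2} | {leaf_text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    sample = [27, 12, 34, 15, 21, 27]
    print(render(stem_and_leaf(sample)))
```

Because every leaf is kept, the display shows the shape of the distribution in the way a histogram does while still preserving the individual values.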

Data analysis is a life cycle that includes numerous stages. Exploratory data analysis is one of the stages that help analysts understand the domain and the data better, so that they can derive insights and provide solutions based on the information the data provide. Figure 1 shows the seven stages of the data analytics process.

Figure 1.

7 stage data analytics process


Although there is no unified structure for analyzing data, multiple steps are involved in a problem-solving process based on data analysis. Data analytics primarily consists of six stages, excluding model deployment. Each stage has its own significance in the analysis. Let’s take a detailed look at these stages.

  • 1. Data Collection: At this phase of the data analytics life cycle, the objective of the data is defined, along with the means by which it will be achieved. By analyzing the data, one can identify the core objectives of a firm, what it is looking to accomplish, and what it needs in order to accomplish those objectives. During this phase the team gains an understanding of the business domain and checks whether the business unit or institution has undertaken similar projects in the past, so that lessons learned can be applied.

In order to be an effective data manager, you must identify potential uses and demands for your company's information, such as where the data comes from, what the data can tell you, and what the business benefits will be by using it. Rather than focusing on data as it stands, as a data analyst you need to focus on enterprise data requirements. Additionally, it is your responsibility to assess the tools and systems you will need to read, organize, and analyze the data you receive.

This phase involves a number of activities, including framing the business problem as an analytics challenge and formulating initial hypotheses (IHs) that can be tested against the data that can be collected. It is therefore essential to ensure that the stated objective is achieved during the subsequent phases.

  • 2. Exploratory Data Analysis: Exploratory data analysis (EDA) is an initial step in any data analysis or data science project. The objective of EDA is to investigate a dataset, identify patterns and anomalies (outliers), and finally formulate hypotheses based on the knowledge gained about the dataset.
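
One common way to flag anomalies during EDA is Tukey's fences, which mark values lying more than 1.5 interquartile ranges below the first quartile or above the third quartile. The following is a minimal sketch using only Python's standard library; the function name `find_outliers`, the multiplier parameter `k`, and the sample readings are illustrative assumptions, not taken from the chapter.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return the values that fall outside Tukey's fences.

    The fences are Q1 - k * IQR and Q3 + k * IQR, where IQR = Q3 - Q1.
    Quartiles use the 'inclusive' method, which treats the input as the
    whole population of interest (an assumption made for this sketch).
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


if __name__ == "__main__":
    # Hypothetical readings with one obviously unusual value.
    readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
    print(find_outliers(readings))
```

A flagged value is a prompt for investigation rather than a verdict: it may be a data-entry error, or it may be the most informative observation in the dataset.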
