Analyzing AQI before Covid '19: Experimental Study of 3 Years for Intelligent Environment Conducted at North Indian Zone to Extract Knowledge

DOI: 10.4018/979-8-3693-2109-6.ch017


In populous and developing countries, governments regard the regulation and protection of the environment as a major task, and they should take the concept of smart environment monitoring into consideration. The main aim of such systems is to enhance the environment with technologies including sensors, processors, data sets, and other devices connected across the globe through a network. Such a system can help in monitoring air quality. The factors associated with dense population and rapid development also contribute heavily to air pollution. Forecasting the air quality index (AQI) with an intelligent environment system therefore involves a machine learning model that predicts the AQI for the National Capital Region (NCR) from the values of major pollutants such as SO2, PM2.5, CO, PM10, NO2, and O3. The authors have implemented different machine learning algorithms, covering both classification and regression techniques. To assess how accurate the predictions are, mean squared error, mean absolute error, and R-squared have been considered. The chapter helps the reader form a structured view of air quality prediction methods and also suggests other prediction methods. The real challenge is deciding which method to apply when predicting air quality. Hence, it is important to test and use all these methods.
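The three evaluation measures named in the abstract can be written out directly. The following is a minimal sketch using NumPy; the observed and predicted AQI values are hypothetical numbers chosen for illustration and do not come from the chapter's data set.

```python
import numpy as np


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean((y_true - y_pred) ** 2))


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between observed and predicted values."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true that the predictions explain."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return float(1.0 - ss_res / ss_tot)


# Hypothetical AQI readings and model predictions (illustrative only).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

mse = mean_squared_error(observed, predicted)
mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
```

Lower MSE and MAE indicate smaller errors, while an R-squared closer to 1 indicates that the model explains more of the variation in the observed values.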
Chapter Preview


This research paper discusses the different parameters of air quality and the environment using various machine learning algorithms. It also introduces the reasons behind the contamination of air. Pollution can harm not only air but water bodies as well. It can also take the form of noise, heat, and even light. The substances responsible for pollution can be either foreign substances or naturally occurring ones. Pollution is mostly classified as being caused by a single source rather than by many.

Intelligent Environment Systems

In this era of globalization, every country in the world faces environmental problems. Controlling these problems has become a primary concern for various organizations and governments. This emerging problem creates a need to monitor the environment and to find more environment-friendly solutions, and this need brings smart monitoring techniques into the picture (Dipanda, A. et al., 2016).

Intelligent Environment Systems play an important role in nearly all sectors. The field is becoming a must-have for cities with increased industrialization, high population, and massive transportation, as these are the main reasons behind increasing pollution (Wilson, T., et al., 2018).

Types of Pollution

Pollution is mainly of four types: water pollution, soil pollution, noise pollution, and air pollution. Here we elaborate on air pollution. Factors that induce air pollution include stubble burning, motor vehicle emissions, topographical factors, and open construction work. The regulation of environmental contamination has drawn public scrutiny. The NCR (National Capital Region) is one of the most contaminated territories in the world (Tripathi, C.B. et al., 2019).

Components in Pollution Particles

Various researchers have carried out several experiments and concluded that the concentration of pollutants in the NCR is alarmingly higher than in any other region (Srivastava, C. et al., 2018). This has shortened the lives of residents by up to 6 years. Other researchers (Aggarwal, P. et al., 2015) have concluded that pollution has affected human health. Hence, enhancing air quality forecasting is one of the most important objectives for society. Sulphur dioxide, PM2.5, and NO are major pollutants found in the air. Sulphur dioxide is a gas present in air (Gallardo, M. et al., 2017).

It combines easily with other substances to form harmful compounds such as sulphuric acid and sulphurous acid. Sulphur dioxide affects human health when it is inhaled. It causes a burning sensation in the nose, throat, and airways, resulting in coughing, wheezing, shortness of breath, or a tight feeling around the chest. The concentration of sulphur dioxide in the environment affects the places we can live in (Bhalgat, P. et al., 2019). PM2.5 is also known as fine particulate matter (2.5 micrometers is one 400th of a millimeter). Fine particulate matter (PM2.5) is important among the pollutant indices because it is a major concern for people's health when its level in the air becomes high (Pandey, G. et al., 2013). It has therefore been categorized according to the air quality index table.
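Categorizing a PM2.5 reading against an index table amounts to a range lookup. The sketch below assumes the 24-hour PM2.5 breakpoints of India's National Air Quality Index (in micrograms per cubic meter); this preview does not reproduce the chapter's own table, so both the breakpoints and the name `pm25_category` are assumptions made for illustration.

```python
from bisect import bisect_left

# Assumed upper bounds (inclusive) of each PM2.5 band, in µg/m³,
# following India's National AQI; not taken from the chapter itself.
PM25_UPPER_BOUNDS = [30, 60, 90, 120, 250]
CATEGORIES = [
    "Good",
    "Satisfactory",
    "Moderately polluted",
    "Poor",
    "Very poor",
    "Severe",  # anything above the last upper bound
]


def pm25_category(concentration):
    """Return the AQI category label for a PM2.5 concentration."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    # bisect_left finds the first band whose upper bound is >= concentration.
    return CATEGORIES[bisect_left(PM25_UPPER_BOUNDS, concentration)]


labels = [pm25_category(value) for value in (25.0, 75.0, 300.0)]
```

A different index table can be substituted by editing the two lists, provided the list of categories stays one element longer than the list of upper bounds.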

Research Problem Introduction and Motivation

  • With an average Particulate Matter (PM2.5) concentration of 98.6, Delhi was the most polluted city in the world. Of the 30 most polluted cities, 21 were in India.

  • India leads the charts for the most polluted cities in the world. None of the cities in India that were to be monitored as per WHO reported their annual pollution exposure for 2019.


Numerous other models exist to estimate the concentration of pollutants in cities like Delhi. Traditionally, analytical and statistical models, including synthetic variation models and atmospheric dispersion models, were applied for forecasting. Recently, machine learning methods have been seen to give more accurate results in forecasting models.

Key Terms in this Chapter

Support Vector Regressor: A Support Vector Machine can also be used as a regression method while keeping the main feature that characterizes the algorithm (the maximal margin). Support Vector Regression (SVR) uses the same principles as the SVM for classification, with only a few minor differences. The first is that the output is a real number with infinitely many possible values, which makes it very difficult to predict exactly.
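As an illustration of the term, the following is a minimal sketch of support vector regression with scikit-learn's `SVR`. The data are synthetic stand-ins for pollutant readings, and the kernel, `C`, and `epsilon` values are assumptions chosen for the example rather than settings reported in the chapter.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic data: two hypothetical pollutant features and an AQI-like target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 300, size=(200, 2))
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0, 5, 200)

# SVR is sensitive to feature scale, so the features are standardized first.
# epsilon sets the width of the tube within which errors are not penalized.
svr_model = make_pipeline(
    StandardScaler(),
    SVR(kernel="rbf", C=100.0, epsilon=1.0),
)

svr_model.fit(X[:150], y[:150])               # train on the first 150 rows
svr_predictions = svr_model.predict(X[150:])  # predict the remaining 50
```

In practice the hyperparameters would be tuned, for example by cross-validation, rather than fixed by hand as they are here.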

Data Visualization: Data visualization is used to represent a dataset and understand it through various plots and graphs. For air quality, we have used a heatmap, a correlation matrix, and a boxplot to visualize the data. Through data visualization, we get to know the relation between various attributes of the dataset. Python offers multiple libraries for data visualization and analysis.
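A correlation heatmap and a boxplot of the kind described here can be sketched with pandas and Matplotlib. The column names and values below are synthetic placeholders, not the chapter's data, and the choice of these two libraries is an assumption, since the chapter does not say which Python libraries were used.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic pollutant readings; PM10 is built to correlate with PM2.5.
rng = np.random.default_rng(1)
pm25 = rng.uniform(20, 300, 100)
df = pd.DataFrame({
    "PM2.5": pm25,
    "PM10": 1.5 * pm25 + rng.normal(0, 20, 100),
    "NO2": rng.uniform(10, 80, 100),
})

corr = df.corr()  # pairwise Pearson correlation matrix

fig, (ax_heat, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# Heatmap of the correlation matrix.
image = ax_heat.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax_heat.set_xticks(range(len(corr.columns)))
ax_heat.set_xticklabels(corr.columns)
ax_heat.set_yticks(range(len(corr.columns)))
ax_heat.set_yticklabels(corr.columns)
ax_heat.set_title("Correlation matrix")
fig.colorbar(image, ax=ax_heat)

# Boxplot showing the spread of each pollutant.
df.boxplot(ax=ax_box)
ax_box.set_title("Pollutant distributions")

fig.tight_layout()
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk; in an interactive session the `matplotlib.use("Agg")` line can be dropped and `plt.show()` used instead.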

k-Nearest Neighbour Regression: k-nearest neighbors (KNN) is a simple algorithm that stores all available cases and predicts the numerical target based on a similarity measure (e.g., distance functions). KNN has been used in statistical estimation and pattern recognition as a non-parametric technique since the beginning of the 1970s.
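The idea can be sketched in a few lines of NumPy: store the training cases, measure the distance from a query point to each of them, and average the targets of the k closest. The function name `knn_regress`, the use of Euclidean distance, and the toy data are assumptions made for illustration.

```python
import numpy as np


def knn_regress(X_train, y_train, x_query, k=3):
    """Predict a numeric target as the mean target of the k nearest cases."""
    X_train = np.asarray(X_train, float)
    y_train = np.asarray(y_train, float)
    # Euclidean distance from the query point to every stored case.
    distances = np.linalg.norm(X_train - np.asarray(x_query, float), axis=1)
    nearest = np.argsort(distances)[:k]  # indices of the k closest cases
    return float(y_train[nearest].mean())


# Toy data: one hypothetical pollutant feature and an AQI-like target.
X_train = [[10.0], [20.0], [30.0], [100.0]]
y_train = [50.0, 60.0, 70.0, 200.0]

# The three nearest cases to 22.0 are 20.0, 30.0 and 10.0.
estimate = knn_regress(X_train, y_train, [22.0], k=3)
```

Because the prediction is an average over neighbours, the distant case at 100.0 has no influence on this query.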

Multiple Linear Regression: Multiple Linear Regression (MLR) is a statistical technique for finding the linear relation between the independent variables (predictors) and the dependent or response variable. The general MLR model is built from N observations of the multiple predictor variables x_k (k = 1, 2, ..., m) and the observed target data y.
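With N observations of m predictors x_k, the model has the form y ≈ b_0 + b_1 x_1 + ... + b_m x_m, and the coefficients can be estimated by least squares. The sketch below uses NumPy on noiseless toy data with m = 2; the coefficient values are assumptions chosen so that the fit can be checked by eye.

```python
import numpy as np

# N = 5 observations of m = 2 predictors (toy values).
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])

# Noiseless target generated from known coefficients: y = 10 + 2*x1 + 3*x2.
y = 10.0 + X @ np.array([2.0, 3.0])

# Prepend a column of ones so the first coefficient acts as the intercept b_0.
design = np.column_stack([np.ones(len(X)), X])

# Least-squares solution of design @ beta ≈ y.
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, coefficients = beta[0], beta[1:]
```

Because the toy target contains no noise, the fit recovers the generating coefficients up to floating-point error; with real measurements the estimates would only approximate the underlying relation.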

Lasso and Ridge Regression: Ridge regression and Lasso regression work very similarly to Linear Regression. The only difference is the addition of the l1 penalty in Lasso Regression and the l2 penalty in Ridge Regression. These penalty terms are added primarily to ensure regularization: they shrink the weights of the model to zero or close to zero so that the model does not overfit the data.
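The shrinking effect of the two penalties can be seen by fitting all three models to the same data. This is a minimal sketch with scikit-learn on synthetic data in which only the first two of five features influence the target; the `alpha` values are assumptions chosen to make the effect visible, not settings from the chapter.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge

# Synthetic data: five features, but only the first two drive the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 4.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0, 0.5, 200)

ols = LinearRegression().fit(X, y)   # no penalty
ridge = Ridge(alpha=10.0).fit(X, y)  # l2 penalty: shrinks weights toward zero
lasso = Lasso(alpha=0.5).fit(X, y)   # l1 penalty: can set weights exactly to zero

for name, model in [("OLS", ols), ("Ridge", ridge), ("Lasso", lasso)]:
    print(name, np.round(model.coef_, 3))
```

On data like this, the ridge weights are typically slightly smaller than the unpenalized ones, while the lasso weights for the irrelevant features drop to zero, which is why lasso is often used for feature selection as well as regularization.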
