Improving Coronary Artery Disease Prediction: Use of Random Forest, Feature Importance and Case-Based Reasoning


Fouad Henni, Baghdad Atmani, Fatiha Atmani, Fatima Saadi
Copyright: © 2023 |Pages: 17
DOI: 10.4018/ijdsst.319307

Abstract

Cardiovascular diseases (CVDs) are the number one cause of death globally, and coronary artery disease (CAD) is the most common form of CVD. Many research works propose decision support systems for the early detection of CAD. Most of the proposed solutions originate in machine learning and data mining. This paper presents two solutions for CAD prediction. The first solution optimizes a random forest model (RFM) through hyperparameter tuning. The second solution uses a case-based reasoning (CBR) methodology and takes advantage of feature importance to reduce the execution time of the retrieve step in the CBR cycle. The experiments show that the RFM outperforms most recently published models for CAD diagnosis. By reducing the number of attributes, the CBR solution improves execution time while also performing very well in terms of diagnostic accuracy. The performance of the CBR solution is expected to improve further over time, because CBR is a learning methodology.

Introduction

According to the World Health Organization (WHO), an estimated 31% of all deaths worldwide are caused by cardiovascular diseases (CVDs), and more than 75% of CVD deaths occur in low- and middle-income countries (WHO, 2022). Coronary artery disease (CAD) is the most common form of CVD. It occurs when one or more of the coronary arteries become narrowed or blocked. In CAD, the major blood vessels that supply blood, oxygen and nutrients to the heart become damaged or diseased (WHO, 2022).

Research on CAD aims at detecting the disease in its preliminary stage, because early detection is the only way to prevent it from developing into a severe form. Consequently, there is a need for decision support systems that can predict CAD in a non-invasive manner. Many research studies propose prediction models to achieve an accurate diagnosis. The majority of the proposed solutions originate in machine learning and data mining.

CAD symptoms may differ from person to person. Moreover, because many people have no symptoms, they do not know they have CAD until they experience chest pain, a heart attack, or sudden cardiac arrest. This led to the construction of heart disease datasets from previous patients’ records. Most CAD datasets are provided by the University of California Irvine (UCI) machine learning repository (Dua & Graff, 2019). CAD prediction models can be trained on the available datasets and then used to diagnose the presence of the disease in new patients.

This paper proposes two solutions for the early detection of CAD. The first solution is based on an enhanced random forest classifier, which is optimized through a long process of hyperparameter tuning. The second solution takes advantage of the feature importances revealed by the random forest algorithm to improve the retrieve step in the case-based reasoning (CBR) cycle. Both solutions are tested on an experimental dataset. The proposed models represent the core of a decision support system for CAD diagnosis.
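To make the second idea concrete, the following is a minimal sketch of a CBR retrieve step that compares cases only on the most important features. It is an illustration under stated assumptions, not the authors' implementation: the feature names, the importance values, the toy case base, the importance-weighted Euclidean distance and the majority-vote reuse step are all hypothetical choices made for this example, and feature values are assumed to be normalized to [0, 1].

```python
from math import sqrt

# Hypothetical feature importances, of the kind a random forest could report.
# Illustrative values only; they are NOT taken from the paper.
IMPORTANCES = {
    "chest_pain": 0.40,
    "max_heart_rate": 0.30,
    "age": 0.20,
    "fasting_sugar": 0.10,
}


def top_features(importances, k):
    """Return the k feature names with the highest importance."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return ranked[:k]


def weighted_distance(a, b, features, importances):
    """Importance-weighted Euclidean distance over the selected features."""
    return sqrt(sum(importances[f] * (a[f] - b[f]) ** 2 for f in features))


def retrieve(case_base, query, importances, k_features=2, n_neighbors=3):
    """Retrieve step: return the stored cases most similar to the query.

    Only the k_features most important attributes are compared, so each
    distance computation touches fewer values than a full comparison.
    """
    features = top_features(importances, k_features)
    ranked = sorted(
        case_base,
        key=lambda c: weighted_distance(c["x"], query, features, importances),
    )
    return ranked[:n_neighbors]


def reuse(neighbors):
    """Simplified reuse step: majority vote over the retrieved labels."""
    votes = [c["label"] for c in neighbors]
    return max(set(votes), key=votes.count)


# Toy case base (label 1 = CAD present, 0 = absent); made-up values.
CASES = [
    {"x": {"chest_pain": 1.0, "max_heart_rate": 0.2, "age": 0.7, "fasting_sugar": 1.0}, "label": 1},
    {"x": {"chest_pain": 0.9, "max_heart_rate": 0.3, "age": 0.6, "fasting_sugar": 0.0}, "label": 1},
    {"x": {"chest_pain": 0.0, "max_heart_rate": 0.9, "age": 0.3, "fasting_sugar": 0.0}, "label": 0},
    {"x": {"chest_pain": 0.1, "max_heart_rate": 0.8, "age": 0.4, "fasting_sugar": 1.0}, "label": 0},
    {"x": {"chest_pain": 0.8, "max_heart_rate": 0.4, "age": 0.5, "fasting_sugar": 0.0}, "label": 1},
]
QUERY = {"chest_pain": 0.95, "max_heart_rate": 0.25, "age": 0.65, "fasting_sugar": 0.0}

if __name__ == "__main__":
    neighbors = retrieve(CASES, QUERY, IMPORTANCES)
    print("predicted label:", reuse(neighbors))
```

In this sketch, lowering `k_features` shrinks the cost of each distance computation, which is the mechanism by which a reduced attribute set could speed up retrieval; whether accuracy is preserved depends on how much of the signal the dropped attributes carry.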

The paper is organized as follows. Section 2 walks through the background of this research: the methodologies used, coronary artery disease, the structure of the dataset, and a review of recent related works that apply data mining techniques or CBR to CAD diagnosis. Section 3 presents the authors’ approach to CAD prediction, summarizes the experiments carried out with the proposed solutions, and compares the results with related research works. Section 4 concludes the paper and puts forth its achievements.
