Introduction
According to the World Health Organization (WHO), an estimated 31% of all deaths worldwide are caused by cardiovascular diseases (CVDs), and more than 75% of CVD deaths occur in low- and middle-income countries (WHO, 2022). Coronary Artery Disease (CAD) is the most common form of CVD. It occurs when one or more of the coronary arteries become narrowed or blocked. In CAD, the major blood vessels that supply blood, oxygen and nutrients to the heart become damaged or diseased (WHO, 2022).
Research on CAD aims to detect the disease at an early stage, because this is the only way to prevent it from progressing to a severe form. Consequently, there is a need for decision support systems that can predict CAD in a non-invasive manner. Numerous studies propose prediction models to achieve an accurate diagnosis. The majority of the proposed solutions originate in machine learning and data mining.
CAD symptoms may differ from person to person. Moreover, because many people have no symptoms, they do not know they have CAD until they experience chest pain, a heart attack, or sudden cardiac arrest. This has motivated the construction of heart disease datasets from previous patients’ records. Most CAD datasets are provided by the University of California Irvine (UCI) machine learning repository (Dua & Graff, 2019). CAD prediction models can be trained on the available datasets and then used to diagnose the presence of the disease in new patients.
This paper proposes two solutions for the early detection of CAD. The first solution is based on an enhanced random forest classifier, which is optimized through an extensive hyperparameter tuning process. The second solution takes advantage of the feature importances revealed by the random forest algorithm to improve the retrieve step in the Case-Based Reasoning (CBR) cycle. Both solutions are tested on an experimental dataset. The proposed models represent the core of a decision support system for CAD diagnosis.
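To make the second idea more concrete, the following is a minimal sketch of how feature importances could weight the similarity measure used in a CBR retrieve step. It is an illustration under stated assumptions, not the authors' implementation: the feature names, the toy cases, the weight values, the weighted Euclidean distance and the majority vote are all hypothetical choices. In practice the weights might be taken from a trained random forest (for example, the `feature_importances_` attribute of scikit-learn's `RandomForestClassifier`).

```python
from collections import Counter
from math import sqrt

def weighted_distance(a, b, weights):
    """Euclidean distance in which each feature is scaled by its importance."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def retrieve(case_base, query, weights, k=3):
    """Retrieve step: return the k stored cases most similar to the query."""
    ranked = sorted(
        case_base,
        key=lambda case: weighted_distance(case[0], query, weights),
    )
    return ranked[:k]

def predict(case_base, query, weights, k=3):
    """Reuse step (assumed here): majority vote over the retrieved labels."""
    labels = [label for _, label in retrieve(case_base, query, weights, k)]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical case base: (features, label), label 1 = CAD present.
# Assumed features, scaled to [0, 1]: age, chest pain, cholesterol.
case_base = [
    ([0.30, 0.0, 0.40], 0),
    ([0.35, 0.0, 0.80], 0),
    ([0.70, 1.0, 0.45], 1),
    ([0.65, 1.0, 0.90], 1),
    ([0.40, 1.0, 0.50], 1),
]

# Hypothetical importances; a real system would estimate them from data.
weights = [0.2, 0.6, 0.2]

query = [0.45, 1.0, 0.42]
nearest = retrieve(case_base, query, weights, k=3)
diagnosis = predict(case_base, query, weights, k=3)
```

With this weighting, a mismatch on a highly important feature pushes a stored case further from the query than an equally large mismatch on a less important one, which is the intuition behind importance-guided retrieval.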
The paper is organized as follows. Section 2 presents the background of this research in terms of methodologies, coronary artery disease, the structure of the dataset and a literature review of recent related works that use data mining techniques or CBR in CAD diagnosis. Section 3 presents the authors’ approach to CAD prediction, summarizes the experiments conducted with the proposed solutions and compares the results with related research works. Section 4 concludes the paper and highlights its achievements.