Identifying Disease and Diagnosis in Females Using Machine Learning

Sabyasachi Pramanik, Samir Kumar Bandyopadhyay
Copyright: © 2023 |Pages: 24
DOI: 10.4018/978-1-7998-9220-5.ch187

Abstract

This chapter develops a model for identifying whether a patient is diabetic, using the Pima Indian Dataset as a case study. There are two main types of diabetes. The research consists of two stages: data pre-processing and classifier construction. After studying the dataset, the researchers handled its missing values during pre-processing. A classifier is then constructed on the pre-processed data to predict whether the patient is diabetic. The researchers plan to use a decision tree classifier and a random tree classifier. All of the proposed algorithms are described in this chapter.

Introduction

Diabetes is a disorder with potentially dangerous complications that arises when the human body does not produce insulin, does not produce enough of it, or does not use it efficiently (Lee et al., 2021). It occurs when the level of glucose in the blood, also known as blood sugar, becomes too high. Blood glucose is the body's main source of energy, and it comes from the food we eat. Insulin helps this glucose enter the body's cells, where it is used for energy. Carrying extra visceral fat is significantly associated with a higher risk of developing diabetes, such as type 2 diabetes. Obese women are also more likely than men to have a healthy metabolism. The present epidemic of diabetes in India is mostly due to changes in lifestyle. The increasing prevalence may be attributed to a variety of factors, including rapid changes in eating habits, lack of physical activity, and increases in body weight, particularly the build-up of belly fat.

Types of Diabetes

  1. Type 1 Diabetes (De Bois et al., 2022): This is also known as an autoimmune disorder. The insulin-producing cells of the pancreas are damaged, so the pancreas fails to produce enough insulin for the body.

  2. Type 2 Diabetes (Yang et al., 2021): This is also known as adult-onset diabetes. Here the pancreas does not produce enough insulin or the body resists insulin, which affects the way the body processes blood sugar.

  3. Pre-Diabetes (Harimoorthy et al., 2021): Pre-diabetes is closely related to Type 2 Diabetes. In the pre-diabetes stage, blood sugar is elevated but remains below the level seen in Type 2 Diabetes.

  4. Gestational Diabetes Mellitus (GDM) (Barik et al., 2021): This type of diabetes consists of carbohydrate intolerance of varying severity during pregnancy. GDM has no specific clinical features; it is diagnosed through screening.

Symptoms of Diabetes

  1. Excessive thirst
  2. Slow-healing sores and recurrent infections
  3. Fatigue (feeling lazy)
  4. Blurred vision
  5. Tingling in the hands and feet
  6. Swollen gums
  7. Excessive urination
  8. Weight loss

Causes of Diabetes

  1. Obesity
  2. Heredity (family history)
  3. High sugar levels during pregnancy
  4. Blood vessel diseases
  5. High blood pressure and high cholesterol
  6. Pre-diabetes or impaired fasting glucose

Key Terms in this Chapter

Decision Tree: A decision tree is a decision-support tool that uses a tree-like model of choices and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that consists entirely of conditional control statements.
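
To make the "conditional control statements" view concrete, the sketch below writes a very small decision tree by hand. It is only an illustration: the function name `classify_patient`, the three features, and every threshold are assumptions chosen for readability. They are neither learned from the Pima data nor clinical cut-offs, and they are not taken from the chapter.

```python
def classify_patient(glucose: float, bmi: float, age: int) -> str:
    """Toy decision tree written as nested conditionals.

    All thresholds are hypothetical and purely illustrative.
    """
    if glucose >= 140.0:              # root node: split on glucose
        if bmi >= 30.0:               # internal node: split on BMI
            return "diabetic"         # leaf
        # internal node: split on age
        return "diabetic" if age >= 50 else "non-diabetic"
    if bmi >= 40.0 and age >= 50:     # low-glucose branch
        return "diabetic"
    return "non-diabetic"             # default leaf


print(classify_patient(glucose=155.0, bmi=33.5, age=41))
print(classify_patient(glucose=98.0, bmi=24.0, age=29))
```

A trained decision tree has the same shape; the difference is that a learning algorithm chooses the features and thresholds from the data instead of a person writing them down.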

Diabetes Mellitus: A condition in which the body's capacity to produce or respond to the hormone insulin is impaired, resulting in abnormal carbohydrate metabolism and high blood glucose levels.

Random Forests: Random forests, also known as random decision forests, are an ensemble learning approach for classification, regression, and other tasks that works by building a large number of decision trees during training. For classification tasks, the random forest's output is the class chosen by the majority of trees.
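
The majority-vote step described above can be sketched in a few lines. This is a minimal illustration of the voting rule only, not of training: the three "trees" are hypothetical hand-written stand-ins, the feature order `[glucose, bmi, age]` is an assumption, and `forest_predict` is an illustrative name rather than a library function.

```python
from collections import Counter
from typing import Callable, Sequence

# A "tree" here is any function mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], int]


def forest_predict(trees: Sequence[Tree], sample: Sequence[float]) -> int:
    """Return the class label chosen by the majority of trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-in trees over the features [glucose, bmi, age].
trees = [
    lambda s: int(s[0] >= 140),  # votes 1 (diabetic) on high glucose
    lambda s: int(s[1] >= 30),   # votes 1 on high BMI
    lambda s: int(s[2] >= 50),   # votes 1 on higher age
]

print(forest_predict(trees, [150.0, 32.0, 35.0]))
print(forest_predict(trees, [100.0, 31.0, 30.0]))
```

Using an odd number of trees avoids ties in a two-class problem. In an actual random forest, each tree is typically trained on a bootstrap sample of the data with a random subset of features, which is what makes the individual votes differ.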

Support-Vector Machines: Supervised learning models, with associated learning algorithms, that analyze data for classification and regression analysis in machine learning.

Pima Indian Diabetes: A dataset provided by the National Institute of Diabetes and Digestive and Kidney Diseases. The dataset's goal is to predict whether a person has diabetes from the diagnostic measurements included in the collection.
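
In commonly distributed copies of this dataset, missing measurements in columns such as glucose and BMI are recorded as zeros rather than as blanks. The preview does not say how the researchers handled them, so the sketch below shows one common option, median imputation, purely as an assumption. The function name `impute_zeros_with_median`, the column names, and the four rows are made up for illustration and are not real records.

```python
from statistics import median


def impute_zeros_with_median(rows, columns):
    """Return copies of `rows` where zeros in `columns` are replaced
    by the median of that column's non-zero values.

    Assumption: a zero in these columns means "missing", not a real reading.
    """
    cleaned = [dict(row) for row in rows]  # leave the input untouched
    for col in columns:
        observed = [row[col] for row in rows if row[col] != 0]
        if not observed:                   # nothing to estimate from
            continue
        fill = median(observed)
        for row in cleaned:
            if row[col] == 0:
                row[col] = fill
    return cleaned


# Made-up rows, not taken from the dataset.
rows = [
    {"Glucose": 120, "BMI": 30.0},
    {"Glucose": 0, "BMI": 25.0},
    {"Glucose": 180, "BMI": 0},
    {"Glucose": 90, "BMI": 28.0},
]
cleaned = impute_zeros_with_median(rows, ["Glucose", "BMI"])
print(cleaned[1]["Glucose"], cleaned[2]["BMI"])
```

The median is used here rather than the mean because it is less sensitive to extreme values; other strategies, such as dropping incomplete rows, are equally possible.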

K-Nearest Neighbour Algorithm: The k-nearest neighbours technique is a non-parametric supervised learning approach developed in statistics by Evelyn Fix and Joseph Hodges in 1951 and subsequently extended by Thomas Cover. It is used for both classification and regression. In both cases, the input consists of the k closest training samples in a data set.
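
For classification, the idea can be sketched as follows: find the k training samples closest to the query and return the most common label among them. This is a minimal illustration, assuming Euclidean distance and two made-up features (glucose, BMI); `knn_predict` and the toy data are hypothetical and not taken from the chapter.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training samples")
    # Pair each training sample's distance to the query with its label.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up training data: (glucose, BMI) with label 1 = diabetic, 0 = non-diabetic.
train_X = [(85, 22), (90, 24), (100, 26), (160, 35), (170, 38), (150, 33)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_X, train_y, (155, 34), k=3))
print(knn_predict(train_X, train_y, (92, 23), k=3))
```

Because the vote depends on raw distances, features measured on very different scales are usually standardized before applying the method.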
