A Comprehensive Approach for Using Hybrid Ensemble Methods for Diabetes Detection

Md Sakir Ahmed, Abhijit Bora

Source Title: Critical Approaches to Data Engineering Systems and Analysis

DOI: 10.4018/979-8-3693-2260-4.ch001

Abstract

This study examines hybrid models and their application to the detection of diabetes. It covers several machine learning algorithms, namely Decision Trees, Random Forests, Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Gaussian Naive Bayes, the Adaptive Boosting Classifier, and Extreme Gradient Boosting, and uses a Stacking Classifier to build the hybrid model. An in-depth analysis compares the traditional approach with the hybrid approach. The study also discusses data augmentation and its application during analysis, along with hyperparameter tuning and cross-validation during training of the various models.

1. Introduction

1.1 Background of the problem

As the consumption of processed foods has increased, so has the number of cases of diabetes. The number of cases has risen linearly, affecting a vast spectrum of the global population, from children to adults to senior citizens. Studies correlate this rise directly with increased consumption of processed foods; however, this is not the only factor. Lack of physical activity, consumption of alcohol, smoking, and improper sleep schedules also contribute to the rise in the number of cases. Traditional approaches to diabetes detection include urine tests, random blood sugar tests, assessment of clinical symptoms, and risk assessments. With the advent of technology, however, emphasis needs to be given to finding newer methods to identify and assess the various risk factors and to detect diabetes early. This may in turn reduce the number of cases occurring annually, enabling a healthier life for the global population.

1.2 Proposed solution

This study is a brief introduction to hybrid ensemble learning models. It gives a detailed overview of the possible implications of these models for early diagnosis, as well as their use in identifying risk factors. Several machine learning algorithms, such as Decision Trees, Random Forests, Logistic Regression, K-Nearest Neighbors, Support Vector Machines, Gaussian Naive Bayes, and the Adaptive Boosting Classifier, can be used to diagnose diabetes as well as other diseases quite accurately. The accuracy can be further increased by stacking multiple models using a stacking classifier. These stacked models are known as hybrid models and provide better accuracy than a single classifier because of their robustness to noise in the data, better hyperparameter tuning, enhanced adaptability, and improved generalization.
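As a rough illustration of stacking, the sketch below combines three base classifiers under a logistic-regression meta-learner using scikit-learn's `StackingClassifier`. It is a minimal sketch under stated assumptions, not the chapter's actual configuration: the data is synthetic (generated with `make_classification` as a stand-in for a tabular diabetes dataset), and the choice of base learners, meta-learner, and all parameter values is illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data standing in for a tabular diabetes dataset
# (8 numeric features, binary label). Replace with real data in practice.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners (illustrative choice). KNN is distance-based, so it is
# wrapped in a pipeline that standardizes the features first.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("gnb", GaussianNB()),
]

# The meta-learner is trained on out-of-fold predictions of the base
# learners (cv=5), which limits leakage from the training labels.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

train_accuracy = stack.score(X_train, y_train)
test_accuracy = stack.score(X_test, y_test)
print(f"train accuracy: {train_accuracy:.3f}")
print(f"test accuracy:  {test_accuracy:.3f}")
```

Comparing `test_accuracy` against the score of each base learner fitted alone on the same split is one simple way to check whether stacking actually helps on a given dataset; an improvement is not guaranteed.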

In this study, several traditional machine learning algorithms were trained with hyperparameter tuning to find the most suitable parameters for the given data. The best-performing models were then stacked together with the Extreme Gradient Boosting Classifier, and their training and testing accuracy was calculated. The observations are discussed in detail in Sections 3 and 4.
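Hyperparameter tuning with cross-validation can be sketched as a grid search over a single classifier. The example below is an assumption-laden illustration rather than the procedure used in the study: it tunes a decision tree on synthetic data with scikit-learn's `GridSearchCV`, and the parameter grid, fold count, and scoring metric are all hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data standing in for a tabular diabetes dataset.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search space: 4 x 3 = 12 candidate settings.
param_grid = {
    "max_depth": [2, 4, 6, None],
    "min_samples_leaf": [1, 5, 10],
}

# Stratified folds keep the class ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    scoring="accuracy",
    cv=cv,
)
# Each candidate is scored by 5-fold cross-validation on the training
# split only; the best one is then refit on the whole training split.
search.fit(X_train, y_train)

best_params = search.best_params_
cv_accuracy = search.best_score_              # mean CV accuracy of the best candidate
test_accuracy = search.score(X_test, y_test)  # accuracy on the held-out test split
print("best parameters:", best_params)
print(f"cross-validated accuracy: {cv_accuracy:.3f}")
print(f"held-out test accuracy:   {test_accuracy:.3f}")
```

The same search could be repeated for each candidate algorithm, after which the tuned estimators exposed as `search.best_estimator_` would be the natural inputs to a stacking step.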

2. Related Work

The HoeffdingTree algorithm was used to detect diabetes (Mercaldo et al., 2017) and showed 77% accuracy and 77.5% recall. Various machine learning algorithms were implemented in another study (Al-Zebari & Sengur, 2019), which found that Logistic Regression yielded the best result with an accuracy of 77.9%, while the Coarse Gaussian SVM technique yielded the lowest accuracy among all the algorithms at 65.5%. Several machine learning models were again implemented (Islam et al., 2020), and the random forest classifier gave a very high accuracy of 99.35% for the prediction of diabetes. The Gaussian Naive Bayes classifier was observed to yield an accuracy of 79.87% for the prediction of diabetes (Islam and Khanam, 2021). A web app was also developed for the diagnosis of diabetes (Pankaj et al., 2021); it uses a questionnaire rather than a medical test and applies machine learning algorithms to predict whether a person has diabetes. Similarly, various other machine learning algorithms were implemented (Farajollahi et al., 2021), and the Adaptive Boosting Classifier yielded the highest accuracy of 81%. In another study (Mangal and Jain, 2022), Random Forest yielded an accuracy of 99% for the detection of diabetes. The Extreme Gradient Boosting classifier was observed to yield the best accuracy, 75%, among several algorithms (Liu et al., 2022), and the authors proposed that it can be used to screen individuals at high risk of Type 2 diabetes at an early stage. It was also proposed that machine learning algorithms can be used to predict Type 2 diabetes (Charitha et al., 2022), with the Light Gradient Boosting Machine yielding the highest accuracy of 91.47%. In another study (Bhat et al., 2022), random forest gave an accuracy of 97.75% in the detection of Diabetes Mellitus.
In another study (Gowthami et al., 2023), Logistic Regression, K-Nearest Neighbors, Decision Trees, Random Forest, and Support Vector Machines were implemented, and the Random Forest Classifier yielded the highest accuracy, 98%, for Type 2 Diabetes Mellitus. KNN was also applied to the detection of Diabetes Mellitus to test its implications (Rathi and Madeira, 2023).
