InfoScipedia
A Free Service of IGI Global Publishing House

What is Oversampling?

Encyclopedia of Data Science and Machine Learning
The process of randomly adding duplicate observations to the minority class until the minority class has the same number of observations as the majority class.
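As an illustration of this definition, here is a minimal sketch of random oversampling using only the Python standard library. The function name `random_oversample`, the `seed` parameter, and the toy data are assumptions made for this example; they do not come from the chapter, which may implement the procedure differently.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many observations as the largest class.

    `rows` and `labels` are parallel lists. The names and the seed
    argument are hypothetical choices for this sketch.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        # Original observations of this class, used as the sampling pool.
        pool = [r for r, l in zip(rows, labels) if l == cls]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data: 8 majority-class rows (label 0) and 2 minority-class rows (label 1).
X = [[i, i + 1] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))  # both classes now have 8 observations
```

Because the added rows are exact copies of existing minority observations, no new information enters the data. Duplicates should be added to the training split only, so that copies of the same observation do not appear in both training and test data.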
Published in Chapter:
Effective Bankruptcy Prediction Models for North American Companies
Rachel Cardarelli (Bryant University, USA), Son Nguyen (Bryant University, USA), Rick Gorvett (Bryant University, USA), and John Quinn (Bryant University, USA)
Copyright: © 2023 | Pages: 15
DOI: 10.4018/978-1-7998-9220-5.ch108
Abstract
Bankruptcy prediction is a widely researched topic. However, few studies focus on dealing with the class imbalance problem. This article proposes a new technique that applies a bagging undersampling procedure to balance the data and compares it to random undersampling and five oversampling techniques. The performance of the algorithm is evaluated by a random forest's balanced accuracy, sensitivity, and specificity. The results show that models trained after applying the oversampling techniques are prone to overfitting, and the model trained after applying the proposed method achieved the highest balanced accuracy without overfitting.
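The three evaluation metrics named in the abstract can be computed from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics`, the choice of label 1 as the positive class, and the toy labels are assumptions for illustration, not the chapter's code or data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, balanced accuracy) for binary labels.

    Assumes both classes occur in y_true; otherwise a division by zero
    would occur. The name and the `positive` argument are hypothetical.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    sensitivity = tp / (tp + fn)  # share of positives correctly identified
    specificity = tn / (tn + fp)  # share of negatives correctly identified
    balanced_accuracy = (sensitivity + specificity) / 2
    return sensitivity, specificity, balanced_accuracy


# Toy labels: 2 positive cases and 8 negative cases.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

sens, spec, bal_acc = binary_metrics(y_true, y_pred)
print(sens, spec, bal_acc)  # 0.5 0.875 0.6875
```

In this toy case plain accuracy is 0.8 even though half of the positive cases are missed. Balanced accuracy averages the per-class rates and so weights the minority class equally, which is why it is a common choice for imbalanced data.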