Reduction Strategies to Tackle Class Imbalance in Datasets

Krishnaveni, C.V.

Files

TH12217.pdf (1.75 MB)

Date

2021-07-28

Authors

Krishnaveni, C.V.

Publisher

University of Hyderabad

Abstract

The banking, retail, financial, scientific, telecommunications, and various other sectors have all been using data mining technologies to process massive amounts of data, measured in zettabytes. While this volume of data is useful, datasets have to be processed effectively in order to make predictive and inferential forecasts for a target population. Class imbalance, where one class has fewer instances than the other class or classes in a dataset, poses challenges to traditional classifiers. Traditional classifiers fail to handle imbalanced datasets because of assumptions inherent in their design. The distribution of classes within the dataset has a direct impact on classifier/model performance. One proven practice for addressing this problem is to balance the classes in the training datasets. The main goals of balancing are increasing sensitivity, selecting representative samples from the majority class, and maintaining a trade-off between majority-class and minority-class prediction rates.
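
As a rough illustration of the balancing idea described in the abstract, the sketch below reduces the majority class to the size of the minority class and computes sensitivity (the true positive rate). It uses plain random undersampling as a stand-in; the thesis's own reduction strategies are not given in this record and are presumably more selective about which majority samples are kept. The function names `random_undersample` and `sensitivity`, the toy data, and the fixed seed are all assumptions made for this example.

```python
import random
from collections import Counter


def random_undersample(samples, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class.

    Hypothetical helper for illustration only; not the thesis's method.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    # The smallest class sets the target size for all classes.
    target = min(len(xs) for xs in by_class.values())

    balanced = []
    for y, xs in by_class.items():
        # Keep the smallest class whole; sample the others without replacement.
        kept = xs if len(xs) == target else rng.sample(xs, target)
        balanced.extend((x, y) for x in kept)

    rng.shuffle(balanced)
    new_samples = [x for x, _ in balanced]
    new_labels = [y for _, y in balanced]
    return new_samples, new_labels


def sensitivity(y_true, y_pred, positive):
    """True positive rate, TP / (TP + FN), for the chosen positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy imbalanced dataset (assumed): 90 majority and 10 minority instances.
samples = list(range(100))
labels = ["majority"] * 90 + ["minority"] * 10

bal_x, bal_y = random_undersample(samples, labels)
counts = Counter(bal_y)
print(counts)
```

Random selection treats all majority instances as interchangeable, which is why it serves only as a baseline here: the goal of "selecting representative samples from the majority class" implies a more deliberate choice of which instances to retain.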

URI

https://dspcae.uohyd.ac.in/handle/1/15352

Collections

Computer and Information Sciences - Theses
