Industrial Engineering Journal


MACHINE LEARNING AND CLASS IMBALANCE: A LITERATURE SURVEY

Swati V. Narwane

Sudhir D. Sawarkar

Abstract

Rapid technological growth and inexpensive internet connectivity have increased the volume of data generated. This data can be used to derive a great deal of information and many patterns. Data sets are an essential part of Machine Learning (ML) techniques, but modern data sets often suffer from class imbalance, and ML does not work well with unbalanced data sets. In this context, this paper aims to provide a systematic literature review of unbalanced data sets for ML. The collected papers on the class imbalance problem for ML were grouped into four major categories: binary class imbalance, multi-class imbalance, binary and multi-class imbalance, and rare-event class imbalance. The survey focuses on various issues in class imbalance for ML. The purpose of the present paper is to help scholars and readers understand the impact of class imbalance on ML. This article contributes to understanding the role of unbalanced data sets and their impact on predictive systems.

Keywords- Big data, Unbalance data sets, Class imbalance problem, Machine Learning (ML)

Volume (2019)

Number 10 (Oct)