Beyond the hypes on data rebalancing in imbalance learning: towards a balanced framework and recommender system

Abstract
Class-imbalanced learning presents critical challenges in machine learning, largely because data in most domains are naturally imbalanced. Although resampling techniques have been widely applied to address this issue, their effectiveness has been inconsistent, and sometimes flawed, because they rest on artificial assumptions. In this study, we move beyond the hype surrounding resampling methods by exploring alternative strategies: ensemble learning, cost-sensitive algorithms, and one-class classification techniques. Through rigorous experimentation across extreme, moderate, and mild imbalance levels, we find that these alternatives often outperform traditional resampling in terms of F1-score, with ensemble SVM and one-class logistic regression achieving F1-scores of 1.00 and 0.90, respectively. In addition, we introduce a knowledge-based recommender system designed to help practitioners choose the most appropriate technique for addressing class imbalance. We argue that resampling is not always the optimal solution and advocate a more balanced framework that leverages advanced methods for superior performance in imbalanced learning tasks. Our study advances the field by offering a pragmatic, data-driven approach to overcoming class imbalance, contributing valuable insights for both researchers and practitioners.
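The abstract reports results as F1-scores rather than accuracy. A minimal sketch of why that choice matters under class imbalance is given below. The data and both sets of predictions are hypothetical, invented purely for illustration; they are not taken from the study's experiments, and the helper names `f1_score` and `accuracy` are our own.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive (minority) class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined, so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, extremely imbalanced labels: 990 majority (0), 10 minority (1).
y_true = [0] * 990 + [1] * 10

# Predictor A (hypothetical): always predicts the majority class.
majority_pred = [0] * 1000

# Predictor B (hypothetical): 4 false alarms, finds 8 of the 10 minority cases.
minority_aware_pred = [1] * 4 + [0] * 986 + [1] * 8 + [0] * 2

for name, pred in [("always-majority", majority_pred),
                   ("minority-aware", minority_aware_pred)]:
    print(f"{name}: accuracy={accuracy(y_true, pred):.3f}, "
          f"F1={f1_score(y_true, pred):.3f}")
```

In this toy setting the two predictors differ by less than one percentage point in accuracy, yet the always-majority predictor has an F1 of zero because it never identifies a minority instance. This is the sense in which F1 on the minority class is a more informative yardstick when comparing rebalancing strategies.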