Classification of Leukemia Cancer Data using Correlation Based Feature Selection Model: A Comparative Approach
Date
2025
Publisher
Faculty of Engineering, Osun State University, Osogbo, Nigeria - UNIOSUN Journal of Engineering and Environmental Sciences
Abstract
The abundance of data obtained from microarray experiments presents challenges related
to the number of variables and the presence of random fluctuations. Although previous
researchers have shown how data mining aids the implementation of models that facilitate
informed prediction, gaps remain that call for improvement over the earlier models.
Dimensionality reduction techniques, such as Correlation Based Feature Selection (CBFS),
are good candidates for addressing these problems because they select the features most pertinent to
categorization. This research implements a model for classification of leukemia cancer using CBFS
with Support Vector Machine (SVM), k-Nearest Neighbors (KNN), Decision Tree (DT), and
Ensemble classifiers. The performance of these machine learning models was evaluated
using sensitivity, specificity, precision, accuracy, and F1 score. The findings indicate that the
CBFS+DT model outperforms the other models in terms of sensitivity (96.75%), specificity
(97.18%), precision (97.56%), accuracy (96.75%), and F1 score (96.97%), while also requiring
less computational time (0.4336). This demonstrates the efficacy of CBFS in improving
classification accuracy and reducing computational load. Overall, this study highlights the effectiveness
of CBFS in cancer research and underscores the importance of carefully choosing the most
pertinent variables to enhance classification outcomes.
Keywords: machine learning, classification, feature selection, pattern recognition
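
The abstract does not say which CBFS variant or search strategy the study used. As a rough illustration only, the sketch below assumes the widely used correlation-based merit heuristic (high feature-class correlation, low feature-feature correlation) with Pearson correlation and a greedy forward search. It also shows how the reported metrics are derived from confusion-matrix counts. The function names (`pearson`, `cfs_merit`, `cfs_forward_select`, `binary_metrics`), the `min_gain` threshold, and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(columns, labels, subset):
    """Merit = k * r_cf / sqrt(k + k * (k - 1) * r_ff) for a feature subset.

    r_cf is the mean |correlation| between each feature and the class labels;
    r_ff is the mean |correlation| between pairs of features in the subset.
    """
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (
        sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def cfs_forward_select(columns, labels, min_gain=1e-9):
    """Greedy forward search: add the feature that most improves the merit,
    and stop when no candidate improves it by more than min_gain (an assumed threshold)."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        score, j = max(
            (cfs_merit(columns, labels, selected + [j]), j) for j in remaining
        )
        if score - best <= min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best


def binary_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision, accuracy, and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Toy data (invented for illustration): one column per feature, six samples.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1, 2, 1, 8, 9, 8],      # informative feature
    [2, 4, 2, 16, 18, 16],   # redundant copy of the first (scaled by 2)
    [5, 1, 4, 2, 5, 3],      # uninformative feature
]
selected, merit = cfs_forward_select(columns, labels)
metrics = binary_metrics(tp=8, fp=2, tn=9, fn=1)
```

On the toy data, the search keeps only one of the two redundant informative columns and drops the uninformative one, which is the behavior that makes this kind of filter attractive for high-dimensional microarray data. The selected columns would then be passed to a classifier such as a decision tree; that step is omitted here to keep the sketch dependency-free.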