A comparison of Boosting techniques for Classification of Microarray data
Date
2023-03-29
Publisher
Ilorin Journal of Computer Science and Information Technology
Abstract
Context: Advances in microarray technology have played a pivotal role in genomics and bioinformatics, contributing to
improved disease diagnosis, evaluation of patients' response to therapy, and cancer research. Microarray data often
contain irrelevant and redundant features, which introduce noise into the dataset. Consequently, identifying significant
patterns for diagnosis is difficult with conventional statistical approaches. Numerous studies aim to improve the
prediction accuracy and speed of microarray data analysis, but most earlier methods are limited in their predictive
capacity and are characterised by high computational time and algorithmic complexity.
Objective: This research addresses some of these issues by classifying microarray data using boosting algorithms.
Method: Using a publicly available benchmark dataset, the microarray data were cleaned and normalised, and salient
features carrying essential information were obtained. Three state-of-the-art boosting algorithms (AdaBoost, Gradient Boost,
and XGBoost) were used to classify the microarray data, and their performance was compared. Results: The
experimental findings indicate that XGBoost outperforms the other boosting approaches, with a classification accuracy
of 98.18% and a training time of 11 seconds. Conclusions: The novelty of the experiment compared to earlier work lies
in the reported training time, a measure that is not often stated explicitly in other reports of findings.
Keywords: Classification, Boosting techniques, and Microarray data
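
The evaluation described in the abstract (train a boosting classifier, then report both classification accuracy and training time) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data stand in for a microarray dataset, the from-scratch AdaBoost with decision stumps replaces the library implementations the study presumably used, and all function names (`make_data`, `fit_stump`, `fit_adaboost`, `predict`, `benchmark`) are hypothetical.

```python
import math
import random
import time


def make_data(n_samples=120, n_features=10, seed=0):
    """Synthetic stand-in for expression data; only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.choice([-1, 1])
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        row[0] += 1.5 * label
        row[1] -= 1.0 * label
        X.append(row)
        y.append(label)
    return X, y


def fit_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = (float("inf"), 0, 0.0, 1)
    for j in range(len(X[0])):
        for thr in {row[j] for row in X}:
            err = sum(wi for row, yi, wi in zip(X, y, w)
                      if (1 if row[j] > thr else -1) != yi)
            # Weights sum to 1, so flipping the rule gives error 1 - err.
            for polarity, e in ((1, err), (-1, 1.0 - err)):
                if e < best[0]:
                    best = (e, j, thr, polarity)
    return best


def fit_adaboost(X, y, n_rounds=10):
    """AdaBoost: fit stumps on re-weighted samples and combine them by weighted vote."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(n_rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, j, thr, pol))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * pol * (1 if row[j] > thr else -1))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model


def predict(model, row):
    """Sign of the weighted vote over all stumps."""
    score = sum(alpha * pol * (1 if row[j] > thr else -1)
                for alpha, j, thr, pol in model)
    return 1 if score >= 0 else -1


def benchmark(fit, X_train, y_train, X_test, y_test):
    """Return (test accuracy, training time in seconds) for one fit function."""
    start = time.perf_counter()
    model = fit(X_train, y_train)
    train_seconds = time.perf_counter() - start
    correct = sum(predict(model, row) == yi for row, yi in zip(X_test, y_test))
    return correct / len(y_test), train_seconds


X, y = make_data()
split = int(0.75 * len(X))  # simple hold-out split, an assumption of this sketch
accuracy, train_seconds = benchmark(fit_adaboost, X[:split], y[:split],
                                    X[split:], y[split:])
print(f"AdaBoost (stumps): accuracy={accuracy:.2%}, training time={train_seconds:.3f}s")
```

Timing only the fitting step keeps the training-time figure separate from prediction cost. The numbers printed here depend on the synthetic data and the machine, and are not comparable to the 98.18% accuracy and 11 seconds reported in the abstract.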