A Comparative Analysis of Feature Selection and Feature Extraction Models for Classifying Microarray Dataset
Date
2018
Publisher
School of Engineering and Computing, University of the West of Scotland - Computing and Information Systems Journal, 22(2), 29 – 38
Abstract
Purpose: This research applies dimensionality reduction methods to identify the smallest set of genes that contributes to efficient performance of classification algorithms on microarray data.
Design/Methodology/Approach: Using a colon cancer microarray dataset, one-way analysis of variance (ANOVA) is used as the feature selection technique because of its robustness and efficiency in selecting relevant information from the high-dimensional data. Principal Component Analysis (PCA) and Partial Least Squares (PLS) are used as feature extraction techniques, projecting the reduced high-dimensional data into an efficient low-dimensional space. Classification of the colon cancer dataset is carried out with a Support Vector Machine (SVM) classifier. The analysis is performed in MATLAB 2015.
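The feature selection step can be illustrated with a minimal sketch. This is not the study's MATLAB code: it is a plain-Python illustration of ranking genes by their one-way ANOVA F-statistic, assuming at least two classes and more samples than classes. The function names (anova_f_score, select_top_genes) and the toy expression values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F-statistic for a single gene across class groups."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-group sum of squares: spread of class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Within-group sum of squares: spread of samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_genes(X, labels, top_k):
    """Return the indices of the top_k columns of X, highest F-score first."""
    n_genes = len(X[0])
    scores = [anova_f_score([row[j] for row in X], labels) for j in range(n_genes)]
    return sorted(range(n_genes), key=lambda j: scores[j], reverse=True)[:top_k]


# Toy data (made up for illustration): 6 samples x 3 genes, two classes.
X = [
    [1.0, 5.0, 2.0],
    [1.2, 4.0, 2.1],
    [0.9, 6.0, 1.9],
    [3.0, 5.5, 2.0],
    [3.1, 4.5, 2.2],
    [2.9, 5.0, 1.8],
]
labels = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
selected = select_top_genes(X, labels, top_k=1)
```

In this toy example only the first gene separates the two classes, so it receives the largest F-score and is the one retained. In a pipeline like the one described, the retained genes would then be passed to a feature extraction step (PCA or PLS) and finally to the classifier.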
Findings: The study obtained high accuracies, and the performances of the dimension reduction techniques were compared. The PLS-based approach attained 95% accuracy, giving it an edge over the other dimension reduction methods (one-way ANOVA and PCA).
Practical Implications: The major practical implication of this research is the difficulty of obtaining a local dataset in the study environment, which led to the use of an open-source dataset.
Originality: This study gives insight into the implications of high-dimensional data in microarray gene analysis. Applying dimensionality reduction helps remove irrelevant information that hinders the performance of microarray data analysis.
Keywords: Dimension Reduction, One-Way ANOVA, PCA, PLS, Classification