DSpace Angular :: Browsing by Author "Arowolo, M.O."

Browsing by Author "Arowolo, M.O."

Now showing 1 - 12 of 12

A Chi-Square-SVM Based Pedagogical Rule Extraction Method for Microarray Data Analysis
(Institute of Advanced Engineering and Science - International Journal of Advances in Applied Sciences. 9(2): 93 – 100, 2020) Salawu, M.D.; Arowolo, M.O.; Abdulsalam, S.O.; Isiaka, R.M.; Jimada-Ojuolape, B.; Mudashiru, L.O.; Gbolagade, K.A.
Support Vector Machine (SVM) is currently an efficient classification technique due to its ability to capture nonlinearities in diagnostic systems, but it does not reveal the knowledge learnt during training. It is important to understand how a decision is reached in machine learning applications such as bioinformatics. A decision tree, on the other hand, has good comprehensibility; the process of converting an incomprehensible model into an understandable one is often regarded as rule extraction. In this paper we propose an approach for extracting rules from SVM for microarray datasets by combining the merits of both the SVM and the decision tree. The proposed approach consists of three steps. First, SVM-CHI-SQUARE is employed to reduce the feature set. Second, the dataset with reduced features is used to obtain an SVM model, and synthetic data is generated. Finally, Classification and Regression Tree (CART) is used to generate rules. We use a breast masses dataset from the UCI repository, where comprehensibility is a key requirement. The experimental results show that, because the reduced-feature dataset is used, the proposed approach extracts shorter rules, thereby improving the comprehensibility of the system. We obtained an accuracy of 93.53%, a sensitivity of 89.58%, a specificity of 96.70%, and a training time of 3.195 seconds. A comparative analysis is carried out with other algorithms. Keywords: Machine learning, Medical diagnosis, Rule-extraction, SVMs
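The accuracy, sensitivity, and specificity figures reported in the abstract above are standard ratios over a binary confusion matrix. The following is a minimal sketch of how such figures are computed; the function name and the counts are hypothetical illustrations and are not taken from the paper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    tn/fp: negative cases classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total      # share of all cases classified correctly
    sensitivity = tp / (tp + fn)      # true positive rate
    specificity = tn / (tn + fp)      # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts, for illustration only (not the paper's data).
accuracy, sensitivity, specificity = binary_metrics(tp=45, fn=5, tn=40, fp=10)
print(f"accuracy={accuracy:.2%} sensitivity={sensitivity:.2%} specificity={specificity:.2%}")
```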
A Comparative Analysis of Feature Extraction Methods for Classifying Colon Cancer Microarray Data
(EAI Publishing - EAI Endorsed Transactions on Scalable Information Systems, 4(14), 1 – 6, 2017) Arowolo, M.O.; Isiaka, R.M.; Abdulsalam, S.O.; Saheed, Y.K.; Gbolagade, K.A.
Feature extraction is a proficient method for reducing dimensions in the analysis and prediction of cancer classification. The microarray procedure has shown great importance in identifying informative genes that need enhancement in diagnosis. Analysing microarray data is a challenging task because the datasets are high-dimensional with few samples and contain many noisy or irrelevant genes and missing data. In this paper, a comparative study is proposed to demonstrate the effectiveness of feature extraction as a dimensionality reduction process, concluding with an investigation of the most efficient approach for enhancing the classification of microarray data. Principal Component Analysis (PCA), an unsupervised technique, and Partial Least Square (PLS), a supervised technique, are considered, and a Support Vector Machine (SVM) classifier was applied to the dataset. The overall result shows that the PLS algorithm provides improved performance, with about 95.2% accuracy, compared to the PCA algorithm. Keywords: Dimensionality Reduction, Principal Component Analysis, Partial Least Square, Support Vector Machine
A Feature Selection Based on One-Way ANOVA for Microarray Data Classification
(College of Natural Sciences, Al-Hikmah University, Ilorin, Nigeria - Al-Hikmah Journal of Pure and Applied Sciences, 2016) Arowolo, M.O.; Abdulsalam, S.O.; Saheed, Y.K.; Salawu, M.D.
The high dimensionality of microarray data, in which thousands of features are expressed in a much smaller number of samples, is a challenge affecting the applicability of the analytical results. Although Support Vector Machine (SVM) has been commonly used in the classification of microarray datasets, the problem of high dimensionality of the feature space still exists. This study deals with the reduction of gene expression data to a minimal subset of genes that can be used to classify cancer from microarray data using machine learning, introducing feature selection to greatly reduce the computational burden and the noise arising from irrelevant genes. Various statistical and Machine Learning (ML) algorithms have been proposed to select important features and remove redundant and irrelevant ones, but it is unclear how these algorithms respond to conditions like small sample sizes. This paper presents a combination of Analysis of Variance (ANOVA) for feature selection, to reduce the high dimensionality of the feature space, and SVM for classification, to reduce computational complexity and improve effectiveness. Computational burden and noise arising from redundant and irrelevant features are eliminated. The approach reduces gene expression data to a small number of genes rather than thousands, which can lower the cost of cancer testing significantly. The proposed approach selects the most informative subset of features for classification to obtain high accuracy, sensitivity, specificity and precision. Key words: Gene expressions, Microarray, One-Way-ANOVA, Support Vector Machines
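To illustrate the kind of one-way ANOVA feature ranking this abstract describes, here is a minimal sketch in plain Python. It scores each feature by its F-statistic across class labels and keeps the highest-scoring ones. The function names and the toy data are hypothetical and are not taken from the paper, which does not give implementation details here.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_features(rows, labels, top_k):
    """Rank feature columns by F-statistic and return the top_k column indices."""
    n_features = len(rows[0])
    scores = [anova_f([row[j] for row in rows], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k], scores


# Toy data (hypothetical): feature 0 separates the classes, feature 1 barely does.
rows = [[1.0, 5.0], [2.0, 5.1], [8.0, 4.9], [9.0, 5.0]]
labels = [0, 0, 1, 1]
selected, scores = select_top_features(rows, labels, top_k=1)
print(selected, scores)
```

The selected columns would then be passed to a classifier such as an SVM; that step is omitted here.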
A KNN and ANN Model for Predicting Heart Diseases, Chapter 12
(The Institution of Engineering and Technology - Explainable Artificial Intelligence in Medical Decision Support Systems, 2022) Abdulsalam, S.O.; Arowolo, M.O.; Udofot, E.O.; Sanni, A.M.; Popoola, D.D.; Adebiyi, M.O.
The heart is the single most important organ in the human body. Patients, professions, and medical systems are all bearing the brunt of heart failure’s devastating effects on contemporary society. Since cardiac arrest may well be demonstrated as a better understanding or conceivably go unobserved, particularly in the vast population of clients that have other cardiovascular disorders, the true prevalence of heart failure is likely to be underestimated, accounting for only 1–4% of all hospitalized patients as test procedures in developed nations. A person with heart failure has a heart that is unable to circulate sufficient blood through the body, but the term “heart failure” does not explain why this happens. The clinical picture is confusing since there are several possible causes of heart problems, many of which are diseases in and of themselves. Many cases of heart failure can be avoided if the underlying medical conditions that cause them are identified and treated promptly. The study and prediction of cardiac conditions must be precise because numerous diseases have been connected to the cardiovascular system. The resolution of this problem requires intensive online research on the relevant topic. Since incorrect illness prognoses are a leading cause of death among heart patients, learning more about effective prediction algorithms is crucial. This research utilizes K-nearest neighbor (KNN) and artificial neural network (ANN) to assess cardiovascular diseases using data collected from Kaggle. The highest accuracy (96%) was achieved by ANN trained with the standard scalar. Medical experts, specialists, and academics can all benefit greatly from this study. Based on the results of this study, cardiologists will be able to make more knowledgeable decisions about the inhibition, analysis, and handling of heart disease. Keywords: Heart; Cardio; Disease; Machine learning; Prediction; KNN; ANN; CNN
An Adaptive Genetic Algorithm with Recursive Feature Elimination Approach for Predicting Malaria Vector Gene Expression Data Classification Using Support Vector Machine
(Walailak University - Walailak Journal of Science and Technology, 2021) Arowolo, M.O.; Adebiyi, M.O.; Nnodim, C.T.; Abdulsalam, S.O.; Adebiyi, A.A.
As mosquito parasites breed across many parts of the sub-Saharan Africa part of the world, infected cells embrace an unpredictable and erratic life period. Millions of individual parasites have gene expressions. Ribonucleic acid sequencing (RNA-seq) is a popular transcriptional technique that has improved the detection of major genetic probes. The RNA-seq analysis generally requires computational improvements of machine learning techniques since it computes interpretations of gene expressions. For this study, an adaptive genetic algorithm (A-GA) with recursive feature elimination (RFE) (A-GA-RFE) feature selection algorithms was utilized to detect important information from a high-dimensional gene expression malaria vector RNA-seq dataset. Support Vector Machine (SVM) kernels were used as the classification algorithms to evaluate its predictive performances. The feasibility of this study was confirmed by using an RNA-seq dataset from the mosquito Anopheles gambiae. The technique results in related performance had 98.3 and 96.7 % accuracy rates, respectively. Keywords: RNA-seq, Adaptive genetic algorithm, Recursive feature elimination, Malaria vector, Support Vector Machine kernels
Classification of Customer Churn Prediction Model for Telecommunication Industry Using Analysis of Variance
(Institute of Advanced Engineering and Science - International Journal of Artificial Intelligence, 12(3), 1323 – 1329, 2023) Babatunde, R; Abdulsalam, S.O.; Abdulsalam, O.A.; Arowolo, M.O.
Customer predictive analytics has shown great potential for effective churn models. To thrive in today's telecommunications industry, identifying consumers who are likely to migrate to a competitor is of enormous importance. Reliable prediction of future client behavior is required. Machine learning algorithms are essential to predict customer turnovers, and researchers have proposed various techniques. Churn prediction is a problem due to the unequal dispersal of classes. Most traditional machine learning algorithms are ineffective in classifying data. Client cluster with a higher risk has been discovered. A support vector machine (SVM) is employed as the foundational learner, and a churn prediction model is constructed based on each analysis of variance (ANOVA). The separation of churn data revealed by experimental assessment is recommended for churn prediction analysis. Customer attrition is high, but an instantaneous support can ensure that customer needs are addressed and assess an employee's capacity to achieve customer satisfaction. This study uses ANOVA with an SVM classifier to analyze risks in telecom systems. It may be determined that SVM provides the most accurate forecast of customer turnover (95%). The projected outcomes will allow other organizations to assess possible client turnover and collect customer feedback. Keywords: Analysis of variance, Churn, Machine learning, Support vector machine, Telecommunication
Customer Churn Prediction in Banking Industry Using K-Means and Support Vector Machine Algorithms
(Crown Academic Publishing - International Journal of Multidisciplinary Sciences and Advanced Technology, 1(1), 48 – 54, 2020) Abdulsalam, S.O.; Arowolo, M.O.; Jimada-Ojuolape, B.; Saheed, Y.K.
This study proposes a customer churn mining structure based on data mining methods in the banking sector. The study predicts the behavior of customers by using the k-means clustering algorithm to analyze customers' competence and continuity with the sector. The data is clustered into 3 labels on the basis of transaction inflow and outflow. The clustering results were classified using Support Vector Machine (SVM), and an accuracy of 97% was achieved. This study enables banking administrators to mine the conduct of their customers and may prompt proper strategies as per engaging quality and improve proper conducts of administrator capacities in customer relationship. Keywords: Customer Churn, Banks, K-Means and SVM
Customer Churn Prediction in Telecommunication Industry Using Classification and Regression Trees and Artificial Neural Network Algorithms
(Institute of Advanced Engineering and Science - Indonesian Journal of Electrical Engineering and Informatics, 10(2), 431 - 440, 2022) Abdulsalam, S.O.; Arowolo, M.O.; Saheed, Y.K.; Afolayan, J.O.
Customer churn is a serious and critical issue encountered by large businesses and organizations. Due to its direct impact on company revenues, particularly in sectors such as telecommunications and banking, companies are working to promote ways to identify the churn of prospective consumers. Hence it is vital to investigate the issues that influence customer churn in order to yield appropriate measures to diminish it. The major objective of this work is to advance a churn prediction model that helps telecom operators identify the clients most likely to churn. The experimental approach for this study applies machine learning procedures to a telecom churn dataset, using an improved Relief-F feature selection algorithm to pick related features from the huge dataset. To quantify the model's performance, classification is carried out with CART and ANN; the accuracy results show that ANN has a high predictive capacity of 93.88%, compared to 91.60% for the CART classifier. Keywords: Telecoms, Relief-F, ANN, CART, Churn
Development of Iris Biometric Template Security Using Steganography
(School of Engineering and Computing, University of the West of Scotland - Computing and Information Systems Journal, 22(3), 8 – 17, 2018) Saheed, Y.K.; Abdulsalam, S.O.; Arowolo, M.O.; Babatunde, A.N.
Purpose: Traditional iris segmentation methods and strategies regularly involve an exhaustive search of a large parameter space, which is sensitive to noise, time-consuming and not secure enough. To address these challenges, this paper proposes a secured iris template. Approach: This paper proposes a technique to secure the iris template using steganography. The experimental analysis was carried out in the matrix laboratory (MATLAB R2015A) environment. The segmented iris region was normalized to decrease the dimensional inconsistencies between iris region areas with the aid of the Hough transform (HT). The features of the iris were encoded by convolving the normalized iris region with 1D Log-Gabor filters in order to generate a bit-wise biometric template. Then, least significant bit (LSB) was used to secure the iris template. The Hamming distance was chosen as a matching metric, which gives the measure of how many bits disagreed between the templates of the iris. Findings: The system operated at a very good training time and high level of conditional testing signifying high optimization with recognition accuracy of 92% and error of 1.7%. The proposed system is reliable, secure and efficient with the computational complexity significantly reduced. Originality/value: The proposed technique provides an efficient approach for securing the iris template. Keywords: Biometric, Iris Recognition System (IRS), Least significant bit (LSB), Hough Transform, 1D Log-Gabor filters
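Two steps named in this abstract, least significant bit (LSB) embedding and Hamming-distance matching, can be sketched in a few lines of plain Python. This is an illustration only: the paper's experiments were carried out in MATLAB, and the function names, the cover values and the 8-bit template below are hypothetical.

```python
def lsb_embed(cover, bits):
    """Hide each template bit in the least significant bit of one cover value."""
    if len(bits) > len(cover):
        raise ValueError("cover is too short for the template")
    stego = list(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # clear the LSB, then set it to the bit
    return stego


def lsb_extract(stego, n_bits):
    """Read the hidden bits back from the first n_bits cover values."""
    return [value & 1 for value in stego[:n_bits]]


def hamming_distance(a, b):
    """Count the bit positions at which two equal-length templates disagree."""
    if len(a) != len(b):
        raise ValueError("templates must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 8-bit template and cover pixel values, for illustration only.
template = [1, 0, 1, 1, 0, 0, 1, 0]
cover = [200, 13, 54, 77, 90, 121, 8, 255, 64]
stego = lsb_embed(cover, template)
recovered = lsb_extract(stego, len(template))
probe = [0, 0, 1, 1, 0, 0, 1, 1]           # a second template to match against
print(recovered == template, hamming_distance(template, probe))
```

Each cover value changes by at most 1, which is why LSB embedding is hard to notice in image data. Iris systems often divide the bit count by the template length to obtain a fractional distance; the raw count is used here to match the abstract's wording.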
Performance Evaluation of ANOVA and RFE Algorithms for Classifying Microarray Dataset Using SVM
(Springer Nature Switzerland, Lecture Notes in Business Information Processing 402: 480 – 492 - Proceedings of 17th European, Mediterranean, and Middle Eastern Conference, EMCIS 2020, Dubai, United Arab Emirates, November 25–26, 2020, 2020) Abdulsalam, S.O.; Abubakar, A.M.; Ajao, J.F.; Babatunde, R.S.; Ogundokun, R.O.; Nnodim, C.T.; Arowolo, M.O.
A significant application of microarray gene expression data is the classification and prediction of biological models. An essential component of data analysis is dimension reduction. This study presents a comparison of data reduced using the Analysis of Variance (ANOVA) and Recursive Feature Elimination (RFE) feature selection techniques, and evaluates the relative performance of the Support Vector Machine (SVM) classification technique on each. Accuracy and computational performance metrics were evaluated on a microarray colon cancer dataset for classification; SVM-RFE achieved 93% accuracy compared to 87% for ANOVA. Keywords: SVM-RFE; ANOVA; Microarray; SVM; Cancer
Performance Evaluation of Support Vector Machine Kernel Functions on Students’ Educational Data Set, Chapter 7.
(CRC Press, Taylor & Francis Group, London - Social Media and Crowdsourcing: Application and Analytics. (pp. 125-149), 2024) Abdulsalam, S.O.; Ganiyu, R.A.; Omidiora, E.O.; Olabiyisi, S.O.; Arowolo, M.O.
Support Vector Machine (SVM) classifier is currently one of the most popular and extensively used classification algorithm because it produces better classification accuracy performance. However, kernels’ functions parameter optimization during SVM classification process significantly affects its classification accuracy. This research analyses the performance of four SVM kernel functions - Linear Function (LF), Polynomial Function (PF), Radial Basis Function (RBF) and Sigmoid Function (SF) using Grid search method with the intent to determine the best kernel function. The students’ educational dataset used in this research was obtained from the Department of Computer Science in a university in North Central region of Nigeria for a period of 5 years (2009 – 2014). The dataset was a multi-class data comprising records of 153 graduates with 66 predictor variables (sex, age and 64 courses offered by each student) and their final year grade as the class label. The performance of the four kernel functions was evaluated in classification accuracy, sensitivity, specificity, number of support vectors and computation time. The evaluation results of LF, RBF, PF and SF yielded classification accuracies of 79.31, 86.21, 82.76, and 34.5%, respectively. Also, LF, RBF, PF, and SF recorded sensitivities of 71.12, 75.56, 65.56, and 22.20%, respectively. Moreover, LF, RBF, PF, and SF produced specificities of 94.44, 96.34, 95.36, and 74.06%, respectively. Likewise, the number of support vectors generated for LF, RBF, PF, and SF were 78, 80, 82, and 121, respectively. Furthermore, LF, RBF, and PF utilized the same computation time of 0.03 seconds, while SF executed for 0.06 seconds. The findings from the results revealed that RBF kernel is more efficient than the three other kernel functions based on the performance metrics and dataset used in this research. Keywords— Support Vector Machine, Kernels’ functions, Parameter optimization, Grid search method
Stroke Disease Prediction Model Using ANOVA with Classification Algorithms, Chapter 8
(Springer Nature Publishing Singapore - Artificial Intelligence in Medical Virology, Springer Nature Singapore, (pp.117 – 134), 2023) Abdulsalam, S.O.; Arowolo, M.O.; Oroghi, R.
Stroke is a health ailment where the brain plasma blood vessel is ruptured, triggering impairment to the brain. Symptoms may appear when the brain's blood flow and other nutrients are disrupted. Stroke is the leading cause of bereavement and disability universally, according to the World Health Organization. To predict a patient’s risk of having stroke, this project used machine learning (ML) approach on a stroke dataset obtained from Kaggle, the ANOVA (Analysis of Variance) feature selection method with and without the following four Classification procedures; Logistic Regression, K-Nearest Neighbor, Naïve Bayes, and Decision Tree, after which the dataset was preprocessed. The K-Nearest Neighbor algorithm gave the best performance accuracy of approximately 97% without ANOVA and Decision Tree algorithm with ANOVA method gave 96%. The accuracy of the developed models employed in this study is substantially better, showing that the models employed in this study are much more reliable. And it can be deduced from previous existing works. Keywords: ANOVA, Stroke, KNN, Decision tree, Machine learning
