
Browsing by Author "Saheed, Y.K."

Now showing 1 - 8 of 8
  • A Comparative Analysis of Feature Extraction Methods for Classifying Colon Cancer Microarray Data
    (EAI Publishing - EAI Endorsed Transactions on Scalable Information Systems, 4(14), 1 – 6, 2017) Arowolo, M.O.; Isiaka, R.M.; Abdulsalam, S.O.; Saheed, Y.K.; Gbolagade, K.A.
    Feature extraction is an effective method for reducing dimensionality in the analysis and prediction of cancer classification. Microarray technology is valuable for identifying informative genes that can improve diagnosis, but microarray data are difficult to analyse because they combine high dimensionality, small sample sizes, many noisy or irrelevant genes, and missing values. This paper presents a comparative study of feature extraction as a dimensionality reduction step and investigates which approach best enhances microarray classification. Principal Component Analysis (PCA), an unsupervised technique, and Partial Least Squares (PLS), a supervised technique, are considered, and a Support Vector Machine (SVM) classifier is applied to the reduced data. The overall results show that PLS outperforms PCA, achieving an accuracy of about 95.2%.
    Keywords: Dimensionality Reduction, Principal Component Analysis, Partial Least Square, Support Vector Machine
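The pipeline this abstract describes (unsupervised PCA versus supervised PLS, each followed by an SVM) can be sketched with scikit-learn. This is a minimal sketch under assumed placeholder data, component counts, and train/test split, not the paper's experimental setup.

```python
# Minimal sketch of the PCA-vs-PLS comparison described above (scikit-learn).
# X is a (samples x genes) microarray matrix, y the class labels; both are placeholders.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = np.random.rand(62, 2000), np.random.randint(0, 2, 62)  # placeholder for colon cancer data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

# Unsupervised reduction: PCA keeps directions of maximum variance.
pca = PCA(n_components=10).fit(X_tr)
svm_pca = SVC(kernel="linear").fit(pca.transform(X_tr), y_tr)
acc_pca = accuracy_score(y_te, svm_pca.predict(pca.transform(X_te)))

# Supervised reduction: PLS uses the labels when building components.
pls = PLSRegression(n_components=10).fit(X_tr, y_tr)
svm_pls = SVC(kernel="linear").fit(pls.transform(X_tr), y_tr)
acc_pls = accuracy_score(y_te, svm_pls.predict(pls.transform(X_te)))

print(f"PCA+SVM accuracy: {acc_pca:.3f}  PLS+SVM accuracy: {acc_pls:.3f}")
```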
  • A Feature Selection Based on One-Way ANOVA for Microarray Data Classification
    (College of Natural Sciences, Al-Hikmah University, Ilorin, Nigeria - Al-Hikmah Journal of Pure and Applied Sciences, 2016) Arowolo, M.O.; Abdulsalam, S.O.; Saheed, Y.K.; Salawu, M.D.
    The high dimensionality of microarray data, in which thousands of features are expressed over a much smaller number of samples, limits the applicability of analytical results. Although the Support Vector Machine (SVM) is commonly used to classify microarray datasets, the problem of a high-dimensional feature space remains. This study reduces gene expression data to a minimal subset of genes through feature selection, greatly reducing the computational burden and the noise arising from irrelevant genes when classifying cancer from microarray data with machine learning. Many statistical and Machine Learning (ML) algorithms have been proposed for selecting important features and removing redundant or irrelevant ones, but it is unclear how they behave under conditions such as small sample sizes. This paper combines one-way Analysis of Variance (ANOVA) for feature selection, to reduce the dimensionality of the feature space, with an SVM classifier, to reduce computational complexity. Computational burden and noise arising from redundant and irrelevant features are eliminated, and the data are reduced from thousands of genes to a much smaller set, which can significantly lower the cost of cancer testing. The proposed approach selects the most informative subset of features and achieves high accuracy, sensitivity, specificity and precision.
    Key words: Gene expressions, Microarray, One-Way-ANOVA, Support Vector Machines
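In scikit-learn, the one-way ANOVA F-test is available as f_classif, which makes the ANOVA-then-SVM pipeline straightforward to sketch. The synthetic data and the number of retained genes (k=50) below are assumptions for illustration only.

```python
# Sketch of one-way ANOVA feature selection followed by SVM classification
# (scikit-learn's f_classif computes the ANOVA F-statistic per feature).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score

X = np.random.rand(100, 5000)            # placeholder gene expression matrix
y = np.random.randint(0, 2, 100)         # placeholder class labels

pipe = Pipeline([
    ("anova", SelectKBest(score_func=f_classif, k=50)),  # keep the 50 top-ranked genes
    ("svm", SVC(kernel="rbf", C=1.0)),
])
scores = cross_val_score(pipe, X, y, cv=5)
print("Mean CV accuracy:", scores.mean())
```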
  • Customer Churn Prediction in Banking Industry Using K-Means and Support Vector Machine Algorithms
    (Crown Academic Publishing - International Journal of Multidisciplinary Sciences and Advanced Technology, 1(1), 48 – 54, 2020) Abdulsalam, S.O.; Arowolo, M.O.; Jimada-Ojuolape, B.; Saheed, Y.K.
    This study proposes a customer churn mining framework based on data mining methods for the banking sector. Customer behaviour is analysed with the k-means clustering algorithm to assess each customer's value and continuity with the bank, and the data are clustered into three labels on the basis of transaction inflow and outflow. The clustering results are then classified with a Support Vector Machine (SVM), achieving an accuracy of 97%. The framework enables banking administrators to mine the behaviour of their customers, adopt strategies appropriate to customer value, and improve customer relationship management.
    Keywords: Customer Churn, Banks, K-Means and SVM
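A minimal sketch of the two-stage approach (k-means to derive three behaviour labels from transaction flows, then an SVM trained to reproduce them), assuming a placeholder feature matrix rather than real bank data:

```python
# Sketch of the k-means -> SVM churn pipeline described above: customers are first
# clustered into 3 groups from transaction inflow/outflow, then an SVM learns to
# reproduce those cluster labels.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X = np.random.rand(500, 2)  # placeholder: [transaction inflow, transaction outflow]

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.3, random_state=0)
svm = SVC(kernel="rbf").fit(X_tr, y_tr)
print("SVM accuracy on cluster labels:", accuracy_score(y_te, svm.predict(X_te)))
```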
  • Customer Churn Prediction in Telecommunication Industry Using Classification and Regression Trees and Artificial Neural Network Algorithms
    (Institute of Advanced Engineering and Science - Indonesian Journal of Electrical Engineering and Informatics, 10(2), 431 - 440, 2022) Abdulsalam, S.O.; Arowolo, M.O.; Saheed, Y.K.; Afolayan, J.O.
    Customer churn is a critical problem for large businesses and organizations. Because of its direct impact on revenue, particularly in sectors such as telecommunications and banking, companies seek ways to identify customers who are likely to churn; it is therefore important to investigate the factors that influence churn in order to take measures that reduce it. The main objective of this work is to develop a churn prediction model that helps telecom operators identify the customers most likely to churn. The experiments apply machine learning procedures to a telecom churn dataset, using an improved Relief-F feature selection algorithm to pick relevant features from the large dataset. Classification is performed with CART and an Artificial Neural Network (ANN); the ANN achieves a higher predictive accuracy of 93.88%, compared with 91.60% for the CART classifier.
    Keywords: Telecoms, Relief-F, ANN, CART, Churn
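A sketch of the feature-selection-then-classification pipeline. Relief-F is not part of scikit-learn; the third-party skrebate package is assumed here, plain Relief-F stands in for the paper's improved variant, and the data are synthetic placeholders.

```python
# Sketch of the Relief-F -> CART / ANN comparison described above.
# ReliefF comes from the `skrebate` package (assumed); scikit-learn supplies CART and an MLP.
import numpy as np
from skrebate import ReliefF
from sklearn.tree import DecisionTreeClassifier          # CART
from sklearn.neural_network import MLPClassifier         # simple ANN stand-in
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X = np.random.rand(300, 20)              # placeholder telecom churn features
y = np.random.randint(0, 2, 300)         # placeholder churn labels

fs = ReliefF(n_features_to_select=8, n_neighbors=20).fit(X, y)
X_sel = fs.transform(X)                  # keep the 8 top-ranked features

X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, test_size=0.3, random_state=0)
for name, clf in [("CART", DecisionTreeClassifier(random_state=0)),
                  ("ANN", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0))]:
    clf.fit(X_tr, y_tr)
    print(name, "accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```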
  • Development of Iris Biometric Template Security Using Steganography
    (School of Engineering and Computing, University of the West of Scotland - Computing and Information Systems Journal, 22(3), 8 – 17, 2018) Saheed, Y.K.; Abdulsalam, S.O.; Arowolo, M.O.; Babatunde, A.N.
    Purpose: Traditional iris segmentation methods and strategies often involve an exhaustive search of a large parameter space, which is sensitive to noise, time-consuming and not sufficiently secure. To address these challenges, this paper proposes a secured iris template. Approach: This paper proposes a technique for securing the iris template using steganography. The experimental analysis was carried out in the MATLAB R2015a environment. The segmented iris region was normalized with the aid of the Hough transform (HT) to reduce dimensional inconsistencies between iris regions. The iris features were encoded by convolving the normalized iris region with 1D Log-Gabor filters to generate a bit-wise biometric template, and the least significant bit (LSB) method was then used to secure the template. The Hamming distance was chosen as the matching metric, measuring how many bits disagree between two iris templates. Findings: The system achieved a short training time and a high level of optimization under conditional testing, with a recognition accuracy of 92% and an error rate of 1.7%. The proposed system is reliable, secure and efficient, with significantly reduced computational complexity. Originality/value: The proposed technique provides an efficient approach for securing the iris template.
    Keywords: Biometric, Iris Recognition System (IRS), Least significant bit (LSB), Hough Transform, 1D Log-Gabor filters
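The core mechanism (embedding a bit-wise template in the least significant bits of a cover image and matching templates by Hamming distance) can be sketched in a few lines of NumPy. The template length and cover image are illustrative assumptions; segmentation, normalization and Log-Gabor encoding are not reproduced.

```python
# Sketch of LSB embedding of a bit-wise iris template into a cover image, and
# Hamming-distance matching, as described above. All data are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
template = rng.integers(0, 2, 2048, dtype=np.uint8)       # bit-wise iris template
cover = rng.integers(0, 256, (64, 64), dtype=np.uint8)    # grayscale cover image

# Embed: overwrite the least significant bit of the first len(template) pixels.
stego = cover.copy().ravel()
stego[:template.size] = (stego[:template.size] & 0xFE) | template
stego = stego.reshape(cover.shape)

# Extract the template back from the stego image.
recovered = stego.ravel()[:template.size] & 1

# Match two templates with the normalized Hamming distance (fraction of disagreeing bits).
def hamming_distance(a, b):
    return np.count_nonzero(a != b) / a.size

print("embed/extract distance:", hamming_distance(template, recovered))  # 0.0 expected
```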
  • Knowledge Discovery from Educational Database Using Apriori Algorithm
    (Georgian Technical University and Niko Muskhelishvili Institute of Computational Mathematics, Georgia - Georgian Electronic Scientific Journals (GESJ): Computer Science and Telecommunications, 1(51): 41 – 51, 2017) Abdulsalam, S.O.; Hambali, M.A.; Salau-Ibrahim, T.T.; Saheed, Y.K.; Babatunde, A.N.
    The ability to predict students' performance has become crucial in educational environments and plays an important role in producing high-quality graduates. Several statistical tools exist for analysing students' performance and discovering knowledge from available data. This study applies data mining in the educational sector to identify students' failure patterns using the Apriori algorithm. The results of 20 students in 25 courses taken at the 100 and 200 levels of an educational institution in North Central Nigeria were used as a case study. The discovered patterns were used to provide recommendations to academic planners to improve decision making, restructure the curriculum, and modify the prerequisites of various courses. The study revealed interesting patterns among failed courses, showing that some failed courses are related to other failed courses. A data mining application for mining students' failed courses was developed, used to mine the students' results, and the analyses are described.
    Keywords: Association Rule Mining, Apriori Algorithm, Academic performance, Educational data mining, Curriculum, Educational database, Students' result repository
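A minimal sketch of mining failed-course associations with Apriori, assuming the third-party mlxtend package; the course codes and the support and confidence thresholds are hypothetical, not values from the paper.

```python
# Sketch of mining failed-course association rules with the Apriori algorithm (mlxtend).
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# One row per student, one boolean column per failed course (hypothetical data).
failures = pd.DataFrame({
    "MAT101": [True, True, False, True, False],
    "CSC102": [True, True, False, True, True],
    "PHY103": [False, True, False, True, False],
})

frequent = apriori(failures, min_support=0.4, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```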
  • Student’s Performance Analysis Using Decision Tree Algorithms
    (Faculty of Computers and Applied Computer Science, Tibiscus University of Timisoara, Romania - Annals Computer Science Series Journal, 15, 55 – 62, 2017) Abdulsalam, S.O.; Saheed, Y.K.; Hambali, M.A.; Salau-Ibrahim, T.T.; Babatunde, A.N.
    Educational Data Mining (EDM) is concerned with developing and modelling methods that discover knowledge from data originating in educational environments. This paper uses a data mining approach to study students' performance in CSC207 (Internet Technology and Programming I), a 200-level course in the Department of Computer, Library and Information Science. Of the many data mining approaches available for studying student performance, this work uses the classification task, applying three decision tree algorithms: BFTree, J48 and CART. Student attributes such as attendance, class test, lab work, assignment, previous semester marks and end-of-semester marks were collected from the student management system to predict performance in the end-of-semester examination. The paper also compares the accuracy of the different decision tree algorithms. The experimental results show that BFTree is the best classifier, with 67.07% of instances correctly classified and 32.93% incorrectly classified.
    KEYWORDS: Classification, Decision tree, Students’ Performance, Educational Data Mining
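BFTree and J48 are Weka classifiers; as a hedged stand-in, the sketch below uses scikit-learn's DecisionTreeClassifier (CART) with assumed attribute names and synthetic records to show the classification-and-evaluation step.

```python
# Sketch of the decision-tree classification of student performance described above.
# CART (scikit-learn) stands in for the Weka BFTree/J48 classifiers used in the paper.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

features = ["attendance", "class_test", "lab_work", "assignment", "prev_semester_marks"]
rng = np.random.default_rng(0)
X = rng.random((120, len(features)))          # placeholder student records
y = rng.integers(0, 2, 120)                   # placeholder pass/fail at end of semester

cart = DecisionTreeClassifier(criterion="gini", max_depth=4, random_state=0)
scores = cross_val_score(cart, X, y, cv=10)
print("10-fold CV accuracy:", scores.mean())
```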
  • Towards a New Hybrid Synthetic Minority Oversampling Technique for Imbalanced Problem in Software Defect Prediction
    (IEEE Xplore - Proceedings of the 5th International Conference on Data Analytics for Business and Industry (ICDABI), University of Bahrain, October 23-24, 2024) Saheed, Y.K.; Abdulsalam, S.O.; Ibrahim, M.B.; Baba, U.A.
    The software industry strives to improve software quality through continuous bug prediction, bug elimination, and module fault prediction. This issue has attracted researchers' interest because of its significant relevance to the software industry. Software Defect Prediction (SDP) models frequently contain significantly skewed data, making it difficult for classifiers to recognize defective instances. The machine learning (ML) community has put considerable effort into learning from imbalanced SDP data, though less so in empirical software engineering. The over-sampling strategy is one of many recent solutions to this problem: it balances the number of defective and non-defective cases by creating new defective instances. Unfortunately, existing methods can generate non-diverse synthetic instances and a large number of unneeded noise instances, which works against solving the class imbalance problem. As a result, we propose a Hybrid Synthetic Minority Oversampling Technique (HSMOTE) to address the imbalance problem in SDP, combining oversampling with Extra Trees (ET), Random Forest (RF), and Extreme Gradient Boosting (XGBoost) for classification. We develop and deploy the proposed method on the National Aeronautics and Space Administration (NASA) defect datasets, evaluating its performance on three of them: JM1, KC1, and PC3. We compared accuracy, precision, AUC, recall, F-measure, and Matthews Correlation Coefficient with those of existing SDP models. The simulations on the JM1 data show that the proposed techniques outperform the current best models: SMOTE+RF surpasses the existing techniques with an accuracy of 93.69%, an AUC of 82.70%, and an F-measure of 32.98%; SMOTE+XGBoost achieves an accuracy of 93.432%, an AUC of 82.64%, and an F-measure of 34.13%; and SMOTE+ET achieves an accuracy of 93.43%, an AUC of 77.68%, and an F-measure of 31.90%.
    Keywords: Synthetic Minority Oversampling Technique, Class Imbalance, Oversampling Method, Software Defect Prediction, Imbalance Data
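A minimal sketch of the SMOTE-then-ensemble setup the abstract reports, assuming imbalanced-learn for SMOTE, scikit-learn for RF/ET, and the xgboost package; the synthetic imbalanced dataset and model settings are illustrative, not the paper's configuration, and the hybrid HSMOTE variant is not reproduced.

```python
# Sketch of SMOTE oversampling followed by RF / ET / XGBoost classification,
# as described above, on a synthetic imbalanced dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, ExtraTreesClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, roc_auc_score
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, weights=[0.9, 0.1], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

# Oversample only the training set so that the test data stays untouched.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

for name, clf in [("RF", RandomForestClassifier(random_state=0)),
                  ("ET", ExtraTreesClassifier(random_state=0)),
                  ("XGB", XGBClassifier(eval_metric="logloss", random_state=0))]:
    clf.fit(X_res, y_res)
    pred = clf.predict(X_te)
    print(name, "F1:", round(f1_score(y_te, pred), 3),
          "AUC:", round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```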

KWASU Library Services © 2023, All Rights Reserved
