A Novel Variable Selection Procedure for Binary Logistic Regression Using Akaike Information Criteria Testing: An Example in Breast Cancer Prediction.

Date
2023-07-13
Publisher
Turkiye Klinikleri
Abstract
Breast cancer is a leading cause of cancer-related death among women worldwide, with approximately 2.3 million new cases and 685,000 deaths reported in 2020 alone. One critical step in developing effective classification and prediction models is variable selection, which involves identifying a subset of relevant variables from a larger set of potential predictors. Accurate variable selection is crucial for building interpretable and robust models that do not overfit to noise, which improves model performance and generalization ability. In this paper, we propose an alternative, objective approach for comparing the Akaike Information Criterion (AIC) values of two competing models, in which the magnitude of their difference is subjected to a statistical test of significance. Material and Methods: We developed a new backward elimination variable selection procedure, similar in spirit to the existing “stepAIC”, within the R statistical software environment. We used both simulated data and the Wisconsin breast cancer diagnostic dataset to compare the proposed method's variable selection and predictive performance with those of stepAIC and LASSO. Results: The simulation showed that the proposed AIC procedure achieved higher variable selection sensitivity, specificity and accuracy than stepAIC and LASSO. The proposed AIC method's prediction results were also comparable with those of stepAIC and LASSO at various simulated data dimensions. Similarly superior results were observed with the breast cancer dataset. Conclusion: The proposed AIC-based variable selection approach is a promising method that integrates AIC with statistical testing for improved variable selection in breast cancer classification and prediction.
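For readers unfamiliar with AIC-driven backward elimination, the following minimal Python sketch illustrates the conventional rule that the abstract takes as its starting point: compute AIC = 2k - 2 ln L for each candidate model and drop a variable only while doing so lowers the AIC. It is not the authors' procedure. The abstract does not specify the significance test applied to the AIC difference, so the sketch uses the plain "any decrease" rule. The function names (`aic`, `backward_eliminate`), the variable names `x1` to `x3`, and the log-likelihood values are all hypothetical and chosen only for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def backward_eliminate(variables, aic_of):
    """Greedy backward elimination on AIC.

    `aic_of` maps a frozenset of variable names to that model's AIC.
    At each step, drop the variable whose removal gives the lowest AIC;
    stop as soon as no single removal lowers the current AIC.
    """
    current = list(variables)
    current_aic = aic_of(frozenset(current))
    while current:
        candidates = [(aic_of(frozenset(current) - {v}), v) for v in current]
        best_aic, drop = min(candidates)
        if best_aic >= current_aic:
            break  # no removal improves AIC, keep the current model
        current.remove(drop)
        current_aic = best_aic
    return current, current_aic


# Hypothetical maximised log-likelihoods for every subset of three predictors.
# These numbers are invented for illustration, not taken from the paper.
TOY_LOGLIK = {
    frozenset({"x1", "x2", "x3"}): -50.0,
    frozenset({"x1", "x2"}): -50.25,
    frozenset({"x1", "x3"}): -58.0,
    frozenset({"x2", "x3"}): -60.0,
    frozenset({"x1"}): -59.0,
    frozenset({"x2"}): -61.0,
    frozenset({"x3"}): -69.0,
    frozenset(): -70.0,
}


def toy_aic(subset):
    # Parameter count = one coefficient per predictor plus the intercept.
    return aic(TOY_LOGLIK[subset], len(subset) + 1)


selected, selected_aic = backward_eliminate(["x1", "x2", "x3"], toy_aic)
print(selected, selected_aic)  # ['x1', 'x2'] 106.5
```

In this toy example, removing `x3` lowers the AIC from 108.0 to 106.5, while removing either remaining variable raises it, so the search stops with `x1` and `x2`. A procedure of the kind the abstract describes would replace the `best_aic >= current_aic` check with a formal test on the size of the AIC difference.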
Citation
Olaniran O.R. & Olaniran S.F. (2023). A Novel Variable Selection Procedure for Binary Logistic Regression Using Akaike Information Criteria Testing: An Example in Breast Cancer Prediction. Turkiye Klinikleri Journal of Biostatistics. https://doi.org/10.5336/biostatic.2023-97597