A Hybridized Feature Extraction Model for Offline Yorùbá Document Recognition
Date
2023-03-22
Publisher
Asian Journal of Research in Computer Science
Abstract
Document recognition converts handwritten and printed text documents into digital equivalents, making them easier to access and more convenient to store. This study combined feature extraction techniques for recognizing Yorùbá documents in an effort to preserve the cultural values and heritage of the Yorùbá people. Ten Yorùbá documents were acquired from Kwara State University’s Library, and ten indigenous literate writers produced handwritten versions of the documents. These were digitized using an HP Scanjet 300 scanner and pre-processed. Each pre-processed image served as input to three feature extractors: Local Binary Pattern (LBP), Speeded-Up Robust Features (SURF) and Histogram of Oriented Gradients (HOG). The combined extracted feature vectors were passed to a Genetic Algorithm (GA) for feature reduction, and the reduced feature vector was fed into a Support Vector Machine. 10-fold cross-validation was used to train seven model variants: LBP-GA, SURF-GA, HOG-GA, LBP-SURF-GA, HOG-SURF-GA, LBP-HOG-GA and LBP-HOG-SURF-GA. For printed Yorùbá text, LBP-HOG-SURF-GA gave 90.0% precision, 90.3% accuracy and a 15.5% false positive rate (FPR). For handwritten Yorùbá documents, it gave 80.9% precision, 82.6% accuracy and 20.4% FPR. For CEDAR, it gave 98.0% precision, 98.4% accuracy and 2.6% FPR. For MNIST, it gave 99% precision, 99.5% accuracy, 99.0% and 1.1% FPR. These results demonstrate that the hybridized feature extraction (LBP-HOG-SURF) improves significantly on the various classification metrics.
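For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, accuracy and FPR are conventionally derived from confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. The abstract does not state how per-class values were aggregated across character classes, so the sketch covers only the binary (one-vs-rest) case.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return precision, accuracy and false positive rate as fractions.

    tp, fp, tn, fn are the true positive, false positive, true negative
    and false negative counts for one class treated as "positive".
    """
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("at least one sample is required")
    # Precision: share of predicted positives that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Accuracy: share of all predictions that are correct.
    accuracy = (tp + tn) / total
    # FPR: share of actual negatives wrongly predicted as positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"precision": precision, "accuracy": accuracy, "fpr": fpr}


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = binary_metrics(tp=90, fp=10, tn=85, fn=15)
print({name: round(value * 100, 1) for name, value in example.items()})
```

Note that a low FPR and high precision are related but not interchangeable: precision is normalized by predicted positives, while FPR is normalized by actual negatives.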
Citation
Jumoke F. Ajao, Rafiu M. Isiaka and Ronke S. Babatunde