Paper Title

A Classical Approach to Handcrafted Feature Extraction Techniques for Bangla Handwritten Digit Recognition

Authors

Wahid, Md. Ferdous; Shahriar, Md. Fahim; Sobuj, Md. Shohanur Islam

Abstract

Bangla handwritten digit recognition is a significant step forward in the development of Bangla OCR. However, the intricate shapes, structural similarity, and distinctive writing styles of Bangla digits make them relatively challenging to distinguish. In this paper, we benchmark four classifiers for recognizing Bangla handwritten digits: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and Gradient-Boosted Decision Trees (GBDT). Each is evaluated with three handcrafted feature extraction techniques: Histogram of Oriented Gradients (HOG), Local Binary Pattern (LBP), and Gabor filter. The experiments use four publicly available Bangla handwritten digit datasets: NumtaDB, CMARTdb, Ekush, and BDRW. The handcrafted methods extract features from the dataset images, and these features are then used to train the machine learning classifiers to identify Bangla handwritten digits. We further fine-tuned the hyperparameters of the classification algorithms to obtain the best recognition performance from each of them. Among all the models we employed, HOG features combined with the SVM model (HOG+SVM) attained the best performance metrics across all datasets. The recognition accuracy of the HOG+SVM method on the NumtaDB, CMARTdb, Ekush, and BDRW datasets reached 93.32%, 98.08%, 95.68%, and 89.68%, respectively. We also compared the model's performance with recent state-of-the-art methods.
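The pipeline described in the abstract (handcrafted features, then a classifier with tuned hyperparameters) can be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: the abstract does not give their HOG settings, hyperparameter grid, or tooling, so the `hog_features` helper, the cell size, the grid values, and the train/test split are all hypothetical. The descriptor is a simplified HOG (per-cell orientation histograms with a single global L2 normalization, no overlapping block normalization), and scikit-learn's bundled 8x8 digits dataset stands in for the Bangla datasets, which are not bundled with any library used here.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC


def hog_features(image, cell=4, bins=9):
    """Simplified HOG: one orientation histogram per non-overlapping cell.

    Hypothetical helper for illustration; cell size and bin count are assumed.
    """
    img = image.astype(float)
    gy, gx = np.gradient(img)                 # gradients along rows and columns
    magnitude = np.hypot(gx, gy)
    # Unsigned gradient orientation in [0, 180) degrees.
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_idx = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    height, width = img.shape
    histograms = []
    for r in range(0, height - cell + 1, cell):
        for c in range(0, width - cell + 1, cell):
            # Magnitude-weighted vote of each pixel into its orientation bin.
            hist = np.bincount(
                bin_idx[r:r + cell, c:c + cell].ravel(),
                weights=magnitude[r:r + cell, c:c + cell].ravel(),
                minlength=bins,
            )
            histograms.append(hist)

    vector = np.concatenate(histograms)
    return vector / (np.linalg.norm(vector) + 1e-6)   # global L2 normalization


# Stand-in data: 8x8 digit images bundled with scikit-learn (not Bangla digits).
digits = load_digits()
X = np.array([hog_features(img) for img in digits.images])
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Assumed hyperparameter grid; the abstract does not list the values searched.
param_grid = {"C": [1, 10, 100], "gamma": ["scale", 0.1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X_train, y_train)

accuracy = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print("test accuracy:", round(accuracy, 4))
```

Swapping `SVC` for `KNeighborsClassifier`, `RandomForestClassifier`, or `GradientBoostingClassifier`, each with its own grid, would give the other feature and classifier pairings the abstract mentions. The accuracy printed here comes from the stand-in dataset and is not comparable to the figures reported in the paper.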
