MULTI-ASPECT URDU HANDWRITING DATA COLLECTION

MUHAMMAD FAHAD, MALIK MUHAMMAD SAAD MISSEN, MUJTABA HUSNAIN, ALISAMAD, DALER ALI, ASAD ALI

DOI Number:

DOI:10.17605/OSF.IO/U8NEJ

Published: 2023-03-23

About the author(s)

1. MUHAMMAD FAHAD - Government Graduate College Civil Lines Multan, Pakistan.
2. MALIK MUHAMMAD SAAD MISSEN - Faculty of Computing, The Islamia University of Bahawalpur, Pakistan.
3. MUJTABA HUSNAIN - Faculty of Computing, The Islamia University of Bahawalpur, Pakistan.
4. ALISAMAD - Faculty of Computing, The Islamia University of Bahawalpur, Pakistan.
5. DALER ALI - Faculty of Computing, The Islamia University of Bahawalpur, Pakistan.
6. ASAD ALI - Department of Computer Science and Information Technology, National College of Business Administration and Economics Bahawalpur Campus, Pakistan.

Abstract

Urdu script is a cursive, bidirectional script derived from the Arabic and Persian scripts; it therefore shares most of their recognition challenges, but with higher complexity. Freely available public datasets for research on Urdu handwriting recognition are lacking. In this paper, we propose a multi-aspect Urdu handwriting data collection gathered from native Urdu speakers belonging to different social groups. To make the dataset more comprehensive, both isolated characters and ligatures are included. People with physical disabilities were also invited to contribute, which broadens the corpus further. We also review existing data collections for Urdu handwriting recognition and compare the proposed data collection with them.

Keywords

Urdu Handwritten Text, Intelligent Character Recognition, Multi-Aspect Data Generation.