Offline handwritten Telugu character dataset and recognition

dc.contributor.author Negi, Atul
dc.contributor.author Rao, Anish M.
dc.date.accessioned 2022-03-27T05:52:37Z
dc.date.available 2022-03-27T05:52:37Z
dc.date.issued 2019-12-01
dc.description.abstract Telugu is a Dravidian language spoken mainly in the southern parts of India. It has close to 81 million native speakers, making it the fifteenth most widely spoken language in the world. Here we present a comprehensive database of handwritten Telugu characters to drive progress in handwriting recognition for this script. We claim that this is significant because we have put together the largest set of vowels, consonants, vowel-consonant pairs and consonant-consonant pairs of the Telugu orthography. The database consists of real-world offline handwritten characters extracted from scanned documents, making it the largest and most varied database in this domain. The data collection method, the preprocessing steps, and the extraction approach used to obtain individual Telugu characters are explained in detail. The dataset is also made openly available as a test set for evaluating handwriting recognition approaches and other related tasks. This work also presents a method of handwritten Telugu character recognition that uses Convolutional Neural Networks as a baseline classifier and Visual Attention Networks as a more advanced and effective solution. Finally, the proposed architecture is compared with previous solutions and the results are discussed.
dc.identifier.citation 2019 IEEE 16th India Council International Conference, INDICON 2019 - Symposium Proceedings
dc.identifier.uri 10.1109/INDICON47234.2019.9028977
dc.identifier.uri https://ieeexplore.ieee.org/document/9028977/
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8547
dc.subject Convolutional Neural Networks
dc.subject Dataset
dc.subject Offline Handwritten Telugu Character Recognition
dc.subject Optical Character Recognition
dc.subject Visual Attention Networks
dc.title Offline handwritten Telugu character dataset and recognition
dc.type Conference Proceeding. Conference Paper