UHTelPCC: A Dataset for Telugu Printed Character Recognition

Kummari, Rakesh; Bhagvati, Chakravarthy

dc.contributor.author	Kummari, Rakesh
dc.contributor.author	Bhagvati, Chakravarthy
dc.date.accessioned	2022-03-27T05:54:18Z
dc.date.available	2022-03-27T05:54:18Z
dc.date.issued	2019-01-01
dc.description.abstract	This paper describes the creation and characteristics of UHTelPCC, a dataset for Telugu printed character recognition. The dataset is built from characters extracted from images of printed Telugu texts from the period 1950–1990, so it is hoped that it provides a basis for developing practical Telugu OCR systems. UHTelPCC is intended to provide a standard benchmark for comparing Telugu OCR algorithms and to support research and development of Telugu OCR systems. It contains 70K samples of 325 classes, divided into training, validation, and test sets of 50K, 10K, and 10K samples respectively. It is hoped that UHTelPCC serves Telugu printed character recognition as MNIST serves handwritten digit recognition. The baseline performance on the test set using KNN, MLP, and CNN is 98.85%, 99.52%, and 99.68% respectively. UHTelPCC is available at http://scis.uohyd.ac.in/~chakcs/UHTelPCC.html.
dc.identifier.citation	Communications in Computer and Information Science. v.1037
dc.identifier.issn	18650929
dc.identifier.uri	10.1007/978-981-13-9187-3_3
dc.identifier.uri	http://link.springer.com/10.1007/978-981-13-9187-3_3
dc.identifier.uri	https://dspace.uohyd.ac.in/handle/1/8703
dc.subject	OCR
dc.subject	OCR dataset
dc.subject	Optical Character Recognition
dc.subject	Printed Telugu OCR
dc.subject	Telugu character dataset
dc.subject	Telugu dataset
dc.subject	UHTelPCC
dc.title	UHTelPCC: A Dataset for Telugu Printed Character Recognition
dc.type	Book Series. Conference Paper
dspace.entity.type
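
The split sizes and baseline figures quoted in the abstract can be sanity-checked with a short sketch. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not say how samples were assigned to each set, so the seeded shuffle here is hypothetical, and the helper names `split_indices` and `misclassified` are invented for this example. It also assumes that "70K" and "10K" mean exactly 70,000 and 10,000 samples and that the reported percentages are classification accuracies.

```python
import random

# Split sizes as reported in the abstract (assumed exact): 70K -> 50K / 10K / 10K.
N_SAMPLES, N_TRAIN, N_VAL, N_TEST = 70_000, 50_000, 10_000, 10_000


def split_indices(n_samples, n_train, n_val, n_test, seed=0):
    """Shuffle sample indices and cut them into three disjoint sets.

    Hypothetical helper: the real dataset may use a different assignment.
    """
    if n_train + n_val + n_test != n_samples:
        raise ValueError("split sizes must sum to the number of samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


def misclassified(accuracy_percent, n_test):
    """Number of test samples an accuracy figure implies were misclassified."""
    return round(n_test * (100.0 - accuracy_percent) / 100.0)


train_idx, val_idx, test_idx = split_indices(N_SAMPLES, N_TRAIN, N_VAL, N_TEST)
print(len(train_idx), len(val_idx), len(test_idx))

# Baseline figures quoted in the abstract, read here as accuracies (assumption).
baselines = {"KNN": 98.85, "MLP": 99.52, "CNN": 99.68}
errors = {name: misclassified(acc, N_TEST) for name, acc in baselines.items()}
print(errors)
```

Under these assumptions, the gap between the KNN and CNN baselines corresponds to a few dozen test characters out of 10,000, which gives a sense of how little headroom remains on this benchmark.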

Collections

Computer and Information Sciences - Publications