Multi-font Telugu text recognition using hidden Markov models and Akshara Bi-grams

Devarapalli, Koteswara Rao; Negi, Atul

dc.contributor.author	Devarapalli, Koteswara Rao
dc.contributor.author	Negi, Atul
dc.date.accessioned	2022-03-27T05:52:56Z
dc.date.available	2022-03-27T05:52:56Z
dc.date.issued	2016-01-01
dc.description.abstract	Recent advances in information technology have made it possible to introduce many Unicode Telugu fonts for the documentation needs of present-day society. However, recognizing documents printed in a variety of fonts poses new challenges in building Telugu OCR systems. In this paper, we demonstrate multi-font Telugu printed word recognition using an implicit segmentation approach that provides segmentation as a by-product of recognition. Our word recognition approach relies on Hidden Markov Models and an akshara bi-gram language model to recognize word images in terms of aksharas (characters). The training set of word images is prepared from document images of popular books and from synthetic document images generated using 8 different Unicode fonts. Testing involves matching the feature vector sequence against sequences of akshara HMMs based on bi-grams. The character error rate (CER) and word error rate (WER) of this system are 21% and 37%, respectively. The performance of our system is very encouraging.
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). v.10481 LNCS
dc.identifier.issn	03029743
dc.identifier.uri	10.1007/978-3-319-68124-5_21
dc.identifier.uri	http://link.springer.com/10.1007/978-3-319-68124-5_21
dc.identifier.uri	https://dspace.uohyd.ac.in/handle/1/8580
dc.subject	Akshara
dc.subject	Bi-gram
dc.subject	DCT
dc.subject	HMM
dc.subject	Telugu OCR
dc.subject	Word recognition
dc.title	Multi-font Telugu text recognition using hidden Markov models and Akshara Bi-grams
dc.type	Book Series. Conference Paper
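
The abstract reports a CER and a WER but this record does not say how they were computed. As a rough illustration only, the sketch below assumes the common convention: CER is the akshara-level edit distance summed over all words and divided by the number of reference aksharas, and WER is the fraction of words not recognized exactly. The function names and the placeholder akshara labels are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two akshara sequences."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference aksharas to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Assumed definition: total akshara edits / total reference aksharas."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return edits / sum(len(r) for r in refs)


def wer(refs: List[Sequence[str]], hyps: List[Sequence[str]]) -> float:
    """Assumed definition: fraction of words with at least one akshara error."""
    wrong = sum(1 for r, h in zip(refs, hyps) if list(r) != list(h))
    return wrong / len(refs)


# Placeholder labels stand in for aksharas; each word is a list of aksharas.
refs = [["ka", "ma", "la"], ["ra", "ja"]]
hyps = [["ka", "la"], ["ra", "ja"]]
print(cer(refs, hyps), wer(refs, hyps))
```

Words are represented as lists of akshara labels rather than raw strings because a single akshara can span several Unicode code points, so counting code points would not match an akshara-level error rate.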

Collections

Computer and Information Sciences - Publications