Orthographic Properties Based Telugu Text Recognition Using Hidden Markov Models

dc.contributor.author Rao, Devarapalli Koteswara
dc.contributor.author Negi, Atul
dc.date.accessioned 2022-03-27T05:52:47Z
dc.date.available 2022-03-27T05:52:47Z
dc.date.issued 2018-01-25
dc.description.abstract Telugu script has glyphs for vowels, consonants and modifiers that combine to form orthographic units called aksharas (characters). In this paper, we present a method for Telugu printed text recognition based on orthographic properties of the script and Hidden Markov Models (HMMs). One such property is that consonant modifiers always appear spatially in the middle and lower zones. The concept of peak fringe numbers (PFNs) is used to define a first-level classifier, whose purpose is to classify Telugu word images based on the data in the lower zone for better modeling. Conventional Telugu OCR applications face segmentation difficulties at various levels of segmentation, such as the akshara and connected-component levels. We therefore use HMMs to model Telugu akshara shapes and also apply bi-grams of aksharas at the recognition stage. Our approach aims to overcome the segmentation problems by attempting a segmentation-free method for Telugu printed text recognition. Because HMMs are suited to modeling data with variations, our data set also includes document images at different resolutions (200, 250 and 300 DPI). We measure the character error rate (CER) to assess system performance. The recognition capability of the system is encouraging: the CER of our system is 15 percent.
dc.identifier.citation Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. v.5
dc.identifier.issn 15205363
dc.identifier.uri 10.1109/ICDAR.2017.327
dc.identifier.uri http://ieeexplore.ieee.org/document/8270273/
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8564
dc.subject Akshara recognition
dc.subject HMM
dc.subject Telugu OCR
dc.subject Word recognition
dc.title Orthographic Properties Based Telugu Text Recognition Using Hidden Markov Models
dc.type Conference Proceeding. Conference Paper