Localization, extraction and recognition of text in Telugu document images

Negi, Atul; Nikhil Shanker, K.; Chereddi, Chandra Kanth

dc.contributor.author	Negi, Atul
dc.contributor.author	Nikhil Shanker, K.
dc.contributor.author	Chereddi, Chandra Kanth
dc.date.accessioned	2022-03-27T05:53:57Z
dc.date.available	2022-03-27T05:53:57Z
dc.date.issued	2003-01-01
dc.description.abstract	In this paper we present a system to locate, extract and recognize Telugu text. The circular nature of Telugu script is exploited for segmenting text regions using the Hough Transform. First, the Hough Transform for circles is performed on the Sobel gradient magnitude of the image to locate text. The located circles are filled to yield text regions, followed by Recursive XY Cuts to segment the regions into paragraphs, lines and word regions. A bottom-up region merging process envelops individual words. Local binarization of the word minimum bounding rectangles (MBRs) yields connected components containing glyphs for recognition. The recognition process first identifies candidate characters by a zoning technique and then constructs structural feature vectors by cavity analysis. Finally, if required, crossing-count-based non-linear normalization and scaling are performed before template matching. The segmentation process succeeds in extracting text from images with complex non-Manhattan layouts. The recognition process gave a character recognition accuracy of 97% to 98%.
dc.identifier.citation	Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. v.2003-January
dc.identifier.issn	15205363
dc.identifier.uri	10.1109/ICDAR.2003.1227846
dc.identifier.uri	http://ieeexplore.ieee.org/document/1227846/
dc.identifier.uri	https://dspace.uohyd.ac.in/handle/1/8674
dc.title	Localization, extraction and recognition of text in Telugu document images
dc.type	Conference Proceeding. Conference Paper

Collections

Computer and Information Sciences - Publications