Document Layout Analysis Using Multigaussian Fitting

dc.contributor.author Melinda, Laiphangbam
dc.contributor.author Ghanapuram, Raghu
dc.contributor.author Bhagvati, Chakravarthy
dc.date.accessioned 2022-03-27T05:54:21Z
dc.date.available 2022-03-27T05:54:21Z
dc.date.issued 2018-01-25
dc.description.abstract This paper proposes a novel technique for layout analysis of documents with complex Manhattan layouts. The technique is designed for Indic script newspapers but also works on many other types of documents with Manhattan layouts, not only those in Indic scripts. The main idea behind the algorithm is to categorise the physical elements of a document into noise, text, titles and graphics based on their heights. A histogram of heights is computed from the bounding boxes of connected components, and a multigaussian fit is used to discover optimal split points between the categories. The Gaussian with the highest peak is assumed to correspond to running text. Running text regions are grouped into blocks using nearest neighbour analysis. These initial regions are further refined using a second-level classification of the other elements into graphics, light-coloured text on a dark background, and graphical separators. The resulting layouts show accuracies comparable to some of the best and most popular algorithms, such as MHS (winner of the ICDAR-RDCL2015 competition) and Aletheia (a tool developed by the PRImA Research Lab). The algorithm is tested on many Indic script newspapers and other documents, and compared with Aletheia and MHS on the ICDAR dataset. Our initial results on an Indic document dataset show high performance in identifying running text (> 98%), with an accuracy of 82% in identifying the other elements. Ground truth data for the Indic script newspaper documents is being generated for more extensive quantitative testing. The strength of our algorithm is that it requires only one parameter, the number of Gaussians fitted to the height histogram data, and is therefore easy to automate and adapt to many documents.
dc.identifier.citation Proceedings of the International Conference on Document Analysis and Recognition, ICDAR. v.1
dc.identifier.issn 15205363
dc.identifier.uri 10.1109/ICDAR.2017.127
dc.identifier.uri http://ieeexplore.ieee.org/document/8270058/
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8707
dc.subject Bounding Boxes
dc.subject Document Layout Analysis
dc.subject Height Histogram
dc.subject Multigaussian
dc.subject Nearest Neighbor
dc.title Document Layout Analysis Using Multigaussian Fitting
dc.type Conference Proceeding. Conference Paper
dspace.entity.type
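
The height-based categorisation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say how the multigaussian fit is computed, so this sketch substitutes a plain expectation-maximisation (EM) fit of a 1-D Gaussian mixture to the raw heights rather than a fit to histogram bins. The function names (`fit_height_mixture`, `running_text_component`, `classify`), the largest-gap initialisation, and the `min_std` floor are all assumptions made for illustration. The only parameter taken from the abstract is `k`, the number of Gaussians.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def initial_groups(data, k):
    """Split sorted heights at the k-1 largest gaps (assumed heuristic)."""
    order = sorted(range(1, len(data)),
                   key=lambda i: data[i] - data[i - 1], reverse=True)
    cuts = sorted(order[:k - 1])
    bounds = [0] + cuts + [len(data)]
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]


def fit_height_mixture(heights, k, iterations=50, min_std=0.5):
    """Fit k Gaussians to bounding-box heights with EM (needs len(heights) >= k).

    Returns a list of (mean, std, weight) tuples sorted by mean.
    """
    data = sorted(float(h) for h in heights)
    n = len(data)
    means, stds, weights = [], [], []
    for group in initial_groups(data, k):
        mean = sum(group) / len(group)
        var = sum((x - mean) ** 2 for x in group) / len(group)
        means.append(mean)
        stds.append(max(math.sqrt(var), min_std))  # floor avoids zero width
        weights.append(len(group) / n)
    for _ in range(iterations):
        # E-step: posterior probability of each component for each height.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, s)
                 for w, m, s in zip(weights, means, stds)]
            total = sum(p)
            resp.append([v / total for v in p] if total > 0 else [1.0 / k] * k)
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-9:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), min_std)
    return sorted(zip(means, stds, weights))


def running_text_component(components):
    """Index of the Gaussian with the highest peak, taken as running text."""
    peaks = [w / (s * math.sqrt(2.0 * math.pi)) for _, s, w in components]
    return peaks.index(max(peaks))


def classify(height, components):
    """Index of the component most likely to have produced this height."""
    scores = [w * gaussian_pdf(height, m, s) for m, s, w in components]
    return scores.index(max(scores))
```

Under these assumptions, heights of connected-component bounding boxes would be passed to `fit_height_mixture` with `k` set to the number of expected categories, and `classify` would then assign each component to a height category. The implied split point between two adjacent categories is the height at which their `classify` scores are equal.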