Parameter-free table detection method

dc.contributor.author Melinda, Laiphangbam
dc.contributor.author Bhagvati, Chakravarthy
dc.date.accessioned 2022-03-27T05:54:18Z
dc.date.available 2022-03-27T05:54:18Z
dc.date.issued 2019-09-01
dc.description.abstract In this paper, we propose two parameter-free table detection methods: one for closed tables and the other for open tables. The unifying idea is multigaussian analysis. Multigaussian analysis of text height histograms classifies the document content into text and non-text blocks. Closed tables are classified as non-text, and identifying them among the non-text blocks is similar to many earlier methods that remove the separators. Because of multigaussian analysis, we do not need any parameters to identify rows and columns and discriminate them from text blocks. Open tables are initially classified as text blocks and are detected by extending the multigaussian analysis to the heights and widths of text blocks. The text blocks are grouped into three categories by multigaussian analysis. These groups are used to classify table cells and distinguish them from text blocks. Table blocks are merged to obtain the table region. Evaluation on various Indic script newspapers and the ICDAR2013 table competition dataset shows that our methods achieve more than 90% in table recognition. The strength of our algorithm is that it is a parameter-free approach and requires no training dataset.
dc.identifier.citation Proceedings of the International Conference on Document Analysis and Recognition, ICDAR
dc.identifier.issn 15205363
dc.identifier.uri 10.1109/ICDAR.2019.00079
dc.identifier.uri https://ieeexplore.ieee.org/document/8977981/
dc.identifier.uri https://dspace.uohyd.ac.in/handle/1/8702
dc.subject Bounding Boxes
dc.subject Close table
dc.subject Column separators
dc.subject Expectation-Maximization
dc.subject Gaussian Mixture Model
dc.subject Height histogram
dc.subject Horizontal Projection Profile
dc.subject Open table
dc.subject Row separators
dc.subject Table detection
dc.subject Width histogram
dc.title Parameter-free table detection method
dc.type Conference Proceeding. Conference Paper
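
The abstract describes fitting a Gaussian mixture to a histogram of text heights and using the components to separate text from non-text blocks. The following is a minimal sketch of that idea only, not the authors' implementation: it fits a one-dimensional mixture with plain Expectation-Maximization and labels each height by its most likely component. The function names, the choice of two components, the quantile initialisation, and the sample heights are all assumptions made for illustration; the record does not state how the paper selects or initialises components.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def fit_gmm_1d(values, k=2, iters=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative)."""
    xs = sorted(values)
    n = len(xs)
    # Assumed initialisation: means at evenly spaced quantiles,
    # every component starting from the overall data variance.
    means = [xs[int((i + 0.5) * n / k)] for i in range(k)]
    overall = sum(xs) / n
    var_all = sum((x - overall) ** 2 for x in xs) / n or 1.0
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for x in xs:
            logs = [math.log(w) + log_gauss(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            top = max(logs)
            p = [math.exp(l - top) for l in logs]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = max(nj / n, 1e-12)
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances


def classify_heights(heights):
    """Label each height 'text' or 'non-text' by its most likely component.

    Assumption: the component with the smaller mean height is ordinary text.
    """
    weights, means, variances = fit_gmm_1d(heights, k=2)
    text_comp = 0 if means[0] <= means[1] else 1
    labels = []
    for h in heights:
        scores = [math.log(w) + log_gauss(h, m, v)
                  for w, m, v in zip(weights, means, variances)]
        best = scores.index(max(scores))
        labels.append("text" if best == text_comp else "non-text")
    return labels


# Hypothetical bounding-box heights in pixels: many short text lines, a few tall boxes.
sample_heights = [10, 11, 12, 11, 10, 12, 11, 60, 62, 58]
sample_labels = classify_heights(sample_heights)
```

No threshold on height is supplied by hand here; the boundary between the two groups comes from the fitted components, which is the sense in which such an approach can avoid tuned parameters. The same fitting routine could in principle be applied to block widths, or called with `k=3` to mirror the three block categories mentioned in the abstract, though how the paper does this is not specified in this record.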