Matches in SemOpenAlex for { <https://semopenalex.org/work/W2022032126> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W2022032126 abstract "Before the image of a document enters an OCR module, it should undergo preprocessing and document layout analysis steps. Document layout analysis usually comes after preprocessing. Noise removal and skew correction are two major preprocessing operations. Document layout analysis itself is divided into physical and logical layout analysis. Physical layout analysis decomposes the image of a document into homogeneous regions such as text, graphics, and lines. In physical layout analysis, the image is first segmented into homogeneous regions, and then each homogeneous region is classified into one of the present classes. Logical layout analysis, on the other hand, tries to assign functional labels (such as title, author, and footnote) to some of the classified regions, to find relationships between regions, and to discover the reading order of the different parts of a document. This article presents an innovative method for preprocessing and physical layout analysis of binary documents. Although most present systems pass the result of preprocessing to document layout analysis, state-of-the-art algorithms try to postpone the processing operations as much as possible in order to prevent irreparable mistakes. These two steps are incorporated in our approach. This is achieved by using segmentation results for noise removal. One reason for the effectiveness of this approach is the appropriate arrangement of procedures. Also, a neural classifier is trained so that its output is robust to skew. A two-stage classification is used for determining pixel classes. In the first stage, the Haar wavelet transform is computed on the resized, gray-level image. The coefficients are normalized, and then 10% of them are selected at random. The selected coefficients are clustered into 4 groups using K-means. A novel algorithm is introduced for assigning these 4 clusters to a background class, a vertical class, and two horizontal classes. The remaining wavelet coefficients are also assigned to one of these classes by the KNN algorithm. The results of this stage, together with other features, help an MLP network perform the second-stage classification. In addition to the regular classes, an ambiguous class is included to capture regions that result from erroneous segmentation. Using the statistics of connected component sizes and the horizontal projection profile, these regions are re-segmented and reclassified by another neural network. The presented approach is designed for textual documents with horizontally oriented text and is applicable to manuscripts with vertically oriented text with minor changes. In addition to having a fair computational complexity and robustness to skew, the proposed method has offered satisfactory results on different types of databases such as magazines, books, newspapers, and official letters." @default.
- W2022032126 created "2016-06-24" @default.
- W2022032126 creator A5001237522 @default.
- W2022032126 creator A5018923095 @default.
- W2022032126 creator A5086178151 @default.
- W2022032126 date "2010-05-01" @default.
- W2022032126 modified "2023-09-23" @default.
- W2022032126 title "Incorporated preprocessing and physical layout analysis of a binary document image using a two stage classification" @default.
- W2022032126 cites W2019273017 @default.
- W2022032126 cites W2055408294 @default.
- W2022032126 cites W2072393922 @default.
- W2022032126 cites W2108959246 @default.
- W2022032126 cites W2135085165 @default.
- W2022032126 cites W2135164809 @default.
- W2022032126 cites W2145370477 @default.
- W2022032126 cites W2153418513 @default.
- W2022032126 doi "https://doi.org/10.1109/iccce.2010.5556766" @default.
- W2022032126 hasPublicationYear "2010" @default.
- W2022032126 type Work @default.
- W2022032126 sameAs 2022032126 @default.
- W2022032126 citedByCount "2" @default.
- W2022032126 countsByYear W20220321262013 @default.
- W2022032126 crossrefType "proceedings-article" @default.
- W2022032126 hasAuthorship W2022032126A5001237522 @default.
- W2022032126 hasAuthorship W2022032126A5018923095 @default.
- W2022032126 hasAuthorship W2022032126A5086178151 @default.
- W2022032126 hasConcept C115961682 @default.
- W2022032126 hasConcept C121684516 @default.
- W2022032126 hasConcept C124504099 @default.
- W2022032126 hasConcept C153180895 @default.
- W2022032126 hasConcept C154945302 @default.
- W2022032126 hasConcept C160633673 @default.
- W2022032126 hasConcept C193828747 @default.
- W2022032126 hasConcept C21442007 @default.
- W2022032126 hasConcept C2778371909 @default.
- W2022032126 hasConcept C34736171 @default.
- W2022032126 hasConcept C41008148 @default.
- W2022032126 hasConcept C43711488 @default.
- W2022032126 hasConcept C546480517 @default.
- W2022032126 hasConcept C72773152 @default.
- W2022032126 hasConcept C76155785 @default.
- W2022032126 hasConcept C89600930 @default.
- W2022032126 hasConcept C9417928 @default.
- W2022032126 hasConceptScore W2022032126C115961682 @default.
- W2022032126 hasConceptScore W2022032126C121684516 @default.
- W2022032126 hasConceptScore W2022032126C124504099 @default.
- W2022032126 hasConceptScore W2022032126C153180895 @default.
- W2022032126 hasConceptScore W2022032126C154945302 @default.
- W2022032126 hasConceptScore W2022032126C160633673 @default.
- W2022032126 hasConceptScore W2022032126C193828747 @default.
- W2022032126 hasConceptScore W2022032126C21442007 @default.
- W2022032126 hasConceptScore W2022032126C2778371909 @default.
- W2022032126 hasConceptScore W2022032126C34736171 @default.
- W2022032126 hasConceptScore W2022032126C41008148 @default.
- W2022032126 hasConceptScore W2022032126C43711488 @default.
- W2022032126 hasConceptScore W2022032126C546480517 @default.
- W2022032126 hasConceptScore W2022032126C72773152 @default.
- W2022032126 hasConceptScore W2022032126C76155785 @default.
- W2022032126 hasConceptScore W2022032126C89600930 @default.
- W2022032126 hasConceptScore W2022032126C9417928 @default.
- W2022032126 hasLocation W20220321261 @default.
- W2022032126 hasOpenAccess W2022032126 @default.
- W2022032126 hasPrimaryLocation W20220321261 @default.
- W2022032126 hasRelatedWork W1582206143 @default.
- W2022032126 hasRelatedWork W1912918750 @default.
- W2022032126 hasRelatedWork W1964863806 @default.
- W2022032126 hasRelatedWork W2022032126 @default.
- W2022032126 hasRelatedWork W2051072213 @default.
- W2022032126 hasRelatedWork W2067985181 @default.
- W2022032126 hasRelatedWork W2131474594 @default.
- W2022032126 hasRelatedWork W2235797036 @default.
- W2022032126 hasRelatedWork W2274232996 @default.
- W2022032126 hasRelatedWork W2347558612 @default.
- W2022032126 isParatext "false" @default.
- W2022032126 isRetracted "false" @default.
- W2022032126 magId "2022032126" @default.
- W2022032126 workType "article" @default.
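
The first classification stage described in the abstract (Haar wavelet coefficients, normalization, a random 10% sample, K-means with 4 clusters, then KNN for the remaining coefficients) can be illustrated with a small sketch. This is not the authors' implementation: the one-level 2x2 Haar features, the min-max normalization, the choice of k=3 neighbours, and all function names (`haar_features`, `normalize`, `kmeans`, `knn_label`, `first_stage`) are assumptions made for illustration. The step that maps the 4 clusters to background, vertical, and horizontal classes is not described in the abstract, so it is left out.

```python
import random

def haar_features(img):
    """One-level 2D Haar coefficients, one 4-tuple per 2x2 block (row-major)."""
    feats = []
    for r in range(0, len(img) - 1, 2):
        for c in range(0, len(img[0]) - 1, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            feats.append((
                (a + b + d + e) / 2.0,  # local average (approximation)
                (a - b + d - e) / 2.0,  # left-right difference
                (a + b - d - e) / 2.0,  # top-bottom difference
                (a - b - d + e) / 2.0,  # diagonal difference
            ))
    return feats

def normalize(vectors):
    """Min-max scale each feature to [0, 1] (the scaling rule is an assumption)."""
    cols = list(zip(*vectors))
    lo = [min(col) for col in cols]
    span = [(max(col) - l) or 1.0 for col, l in zip(cols, lo)]
    return [tuple((v - l) / s for v, l, s in zip(vec, lo, span)) for vec in vectors]

def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def knn_label(p, train_pts, train_labels, k=3):
    """Majority vote among the k nearest labeled points."""
    nearest = sorted(range(len(train_pts)), key=lambda i: dist2(p, train_pts[i]))[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

def first_stage(img, seed=0):
    """Cluster a random 10% of the coefficients, then label the rest by KNN."""
    rng = random.Random(seed)
    feats = normalize(haar_features(img))
    n_sample = max(4, len(feats) // 10)
    idx = rng.sample(range(len(feats)), n_sample)
    sample = [feats[i] for i in idx]
    sample_labels = kmeans(sample, 4, rng)
    labels = [None] * len(feats)
    for i, l in zip(idx, sample_labels):
        labels[i] = l
    for i, f in enumerate(feats):
        if labels[i] is None:
            labels[i] = knn_label(f, sample, sample_labels)
    return labels, n_sample

# Synthetic 40x40 binary image: horizontal stripes plus one vertical bar.
img = [[1 if (r % 8 < 3 and c < 30) or c in (34, 35) else 0 for c in range(40)]
       for r in range(40)]
labels, n_sample = first_stage(img, seed=1)
print(len(labels), n_sample, sorted(set(labels)))
```

Clustering only a sample and propagating labels with KNN keeps the K-means step cheap on large images, which is consistent with the abstract's remark about computational complexity.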