Matches in SemOpenAlex for { <https://semopenalex.org/work/W2006014785> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W2006014785 endingPage "186" @default.
- W2006014785 startingPage "165" @default.
- W2006014785 abstract "A new method for information extraction from document images is proposed in this paper as the basis for a document reader which can extract the required keywords and their logical relationship from various printed documents. The proposed method consists of robust keyword matching, global document matching, and postprocessing for matching errors. First, robust keyword matching between two-dimensional OCR results consisting of a set of possible character candidate lists and a set of keywords defined in the keyword dictionary is carried out. This keyword dictionary includes incorrect words with typical OCR errors and segments of words in order to deal with OCR errors. Next, document matching is invoked between keyword matching results in an input document and document models. Each document model consists of a set of word models with their logical relationship described in terms of a tree structure. This model matching extracts the required keywords and their logical relationship from the input document and determines the most suitable model for the input document. Finally, postprocessing using heuristic rules defined in the model is applied to document matching results to recover keyword matching errors and to modify keyword matching results. This comprehensive approach solves word segmentation problems accurately even if a document has unknown words, compound words, or incorrect words due to OCR errors. Experimental results obtained for 100 documents show that the method is robust and effective for various document structures." @default.
- W2006014785 created "2016-06-24" @default.
- W2006014785 creator A5037103116 @default.
- W2006014785 date "2002-06-01" @default.
- W2006014785 modified "2023-09-24" @default.
- W2006014785 title "Model-based Information Extraction Method Tolerant of OCR Errors for Document Images" @default.
- W2006014785 cites W1974213704 @default.
- W2006014785 cites W1986282506 @default.
- W2006014785 cites W1993324373 @default.
- W2006014785 cites W2017766816 @default.
- W2006014785 cites W2099399408 @default.
- W2006014785 cites W2142069714 @default.
- W2006014785 cites W2165107146 @default.
- W2006014785 doi "https://doi.org/10.1142/s0219427902000583" @default.
- W2006014785 hasPublicationYear "2002" @default.
- W2006014785 type Work @default.
- W2006014785 sameAs 2006014785 @default.
- W2006014785 citedByCount "4" @default.
- W2006014785 countsByYear W20060147852020 @default.
- W2006014785 countsByYear W20060147852021 @default.
- W2006014785 crossrefType "journal-article" @default.
- W2006014785 hasAuthorship W2006014785A5037103116 @default.
- W2006014785 hasConcept C115961682 @default.
- W2006014785 hasConcept C124101348 @default.
- W2006014785 hasConcept C153180895 @default.
- W2006014785 hasConcept C154945302 @default.
- W2006014785 hasConcept C185592680 @default.
- W2006014785 hasConcept C195807954 @default.
- W2006014785 hasConcept C204321447 @default.
- W2006014785 hasConcept C23123220 @default.
- W2006014785 hasConcept C41008148 @default.
- W2006014785 hasConcept C43617362 @default.
- W2006014785 hasConcept C4725764 @default.
- W2006014785 hasConcept C546480517 @default.
- W2006014785 hasConceptScore W2006014785C115961682 @default.
- W2006014785 hasConceptScore W2006014785C124101348 @default.
- W2006014785 hasConceptScore W2006014785C153180895 @default.
- W2006014785 hasConceptScore W2006014785C154945302 @default.
- W2006014785 hasConceptScore W2006014785C185592680 @default.
- W2006014785 hasConceptScore W2006014785C195807954 @default.
- W2006014785 hasConceptScore W2006014785C204321447 @default.
- W2006014785 hasConceptScore W2006014785C23123220 @default.
- W2006014785 hasConceptScore W2006014785C41008148 @default.
- W2006014785 hasConceptScore W2006014785C43617362 @default.
- W2006014785 hasConceptScore W2006014785C4725764 @default.
- W2006014785 hasConceptScore W2006014785C546480517 @default.
- W2006014785 hasIssue "02" @default.
- W2006014785 hasLocation W20060147851 @default.
- W2006014785 hasOpenAccess W2006014785 @default.
- W2006014785 hasPrimaryLocation W20060147851 @default.
- W2006014785 hasRelatedWork W104581431 @default.
- W2006014785 hasRelatedWork W1548492051 @default.
- W2006014785 hasRelatedWork W1561729373 @default.
- W2006014785 hasRelatedWork W1788528807 @default.
- W2006014785 hasRelatedWork W1975174578 @default.
- W2006014785 hasRelatedWork W2368651715 @default.
- W2006014785 hasRelatedWork W2393978999 @default.
- W2006014785 hasRelatedWork W2725657302 @default.
- W2006014785 hasRelatedWork W2747680751 @default.
- W2006014785 hasRelatedWork W3107474891 @default.
- W2006014785 hasVolume "15" @default.
- W2006014785 isParatext "false" @default.
- W2006014785 isRetracted "false" @default.
- W2006014785 magId "2006014785" @default.
- W2006014785 workType "article" @default.