Matches in SemOpenAlex for { <https://semopenalex.org/work/W1874540038> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W1874540038 abstract "Large scale document digitization projects continue to motivate interesting document understanding technologies such as script and language identification, page classification, segmentation and enhancement. Typically, however, solutions are still limited to narrow domains or regular formats such as books, forms, articles or letters and operate best on clean documents scanned in a controlled environment. More general collections of heterogeneous documents challenge the basic assumptions of state-of-the-art technology regarding quality, script, content and layout. Our work explores the use of adaptive algorithms for the automated analysis of noisy and complex document collections. We first propose, implement and evaluate an adaptive clutter detection and removal technique for complex binary documents. Our distance transform based technique aims to remove irregular and independent unwanted foreground content while leaving text content untouched. The novelty of this approach is in its determination of best approximation to clutter-content boundary with text like structures. Second, we describe a page segmentation technique called Voronoi++ for complex layouts which builds upon the state-of-the-art method proposed by Kise [Kise1999]. Our approach does not assume structured text zones and is designed to handle multi-lingual text in both handwritten and printed form. Voronoi++ is a dynamically adaptive and contextually aware approach that considers components' separation features combined with Docstrum [O'Gorman1993] based angular and neighborhood features to form provisional zone hypotheses. These provisional zones are then verified based on the context built from local separation and high-level content features. Finally, our research proposes a generic model to segment and to recognize characters for any complex syllabic or non-syllabic script, using font-models. This concept is based on the fact that font files contain all the information necessary to render text and thus a model for how to decompose them. Instead of script-specific routines, this work is a step towards a generic character and recognition scheme for both Latin and non-Latin scripts." @default.
- W1874540038 created "2016-06-24" @default.
- W1874540038 creator A5003875781 @default.
- W1874540038 creator A5023731595 @default.
- W1874540038 creator A5055264952 @default.
- W1874540038 date "2011-01-01" @default.
- W1874540038 modified "2023-09-26" @default.
- W1874540038 title "Adaptive algorithms for automated processing of document images" @default.
- W1874540038 hasPublicationYear "2011" @default.
- W1874540038 type Work @default.
- W1874540038 sameAs 1874540038 @default.
- W1874540038 citedByCount "0" @default.
- W1874540038 crossrefType "dissertation" @default.
- W1874540038 hasAuthorship W1874540038A5003875781 @default.
- W1874540038 hasAuthorship W1874540038A5023731595 @default.
- W1874540038 hasAuthorship W1874540038A5055264952 @default.
- W1874540038 hasConcept C115961682 @default.
- W1874540038 hasConcept C124101348 @default.
- W1874540038 hasConcept C132094186 @default.
- W1874540038 hasConcept C151730666 @default.
- W1874540038 hasConcept C153180895 @default.
- W1874540038 hasConcept C154945302 @default.
- W1874540038 hasConcept C23123220 @default.
- W1874540038 hasConcept C24881265 @default.
- W1874540038 hasConcept C2524010 @default.
- W1874540038 hasConcept C2779308522 @default.
- W1874540038 hasConcept C2779343474 @default.
- W1874540038 hasConcept C31972630 @default.
- W1874540038 hasConcept C33923547 @default.
- W1874540038 hasConcept C41008148 @default.
- W1874540038 hasConcept C546480517 @default.
- W1874540038 hasConcept C554190296 @default.
- W1874540038 hasConcept C76155785 @default.
- W1874540038 hasConcept C86803240 @default.
- W1874540038 hasConcept C89600930 @default.
- W1874540038 hasConcept C99498987 @default.
- W1874540038 hasConceptScore W1874540038C115961682 @default.
- W1874540038 hasConceptScore W1874540038C124101348 @default.
- W1874540038 hasConceptScore W1874540038C132094186 @default.
- W1874540038 hasConceptScore W1874540038C151730666 @default.
- W1874540038 hasConceptScore W1874540038C153180895 @default.
- W1874540038 hasConceptScore W1874540038C154945302 @default.
- W1874540038 hasConceptScore W1874540038C23123220 @default.
- W1874540038 hasConceptScore W1874540038C24881265 @default.
- W1874540038 hasConceptScore W1874540038C2524010 @default.
- W1874540038 hasConceptScore W1874540038C2779308522 @default.
- W1874540038 hasConceptScore W1874540038C2779343474 @default.
- W1874540038 hasConceptScore W1874540038C31972630 @default.
- W1874540038 hasConceptScore W1874540038C33923547 @default.
- W1874540038 hasConceptScore W1874540038C41008148 @default.
- W1874540038 hasConceptScore W1874540038C546480517 @default.
- W1874540038 hasConceptScore W1874540038C554190296 @default.
- W1874540038 hasConceptScore W1874540038C76155785 @default.
- W1874540038 hasConceptScore W1874540038C86803240 @default.
- W1874540038 hasConceptScore W1874540038C89600930 @default.
- W1874540038 hasConceptScore W1874540038C99498987 @default.
- W1874540038 hasLocation W18745400381 @default.
- W1874540038 hasOpenAccess W1874540038 @default.
- W1874540038 hasPrimaryLocation W18745400381 @default.
- W1874540038 hasRelatedWork W1500590527 @default.
- W1874540038 hasRelatedWork W2002393437 @default.
- W1874540038 hasRelatedWork W2003200069 @default.
- W1874540038 hasRelatedWork W2045018318 @default.
- W1874540038 hasRelatedWork W2062237607 @default.
- W1874540038 hasRelatedWork W2092981524 @default.
- W1874540038 hasRelatedWork W2119260721 @default.
- W1874540038 hasRelatedWork W2137959438 @default.
- W1874540038 hasRelatedWork W2169748977 @default.
- W1874540038 hasRelatedWork W2274953456 @default.
- W1874540038 hasRelatedWork W2409759487 @default.
- W1874540038 hasRelatedWork W2558615893 @default.
- W1874540038 hasRelatedWork W2573700214 @default.
- W1874540038 hasRelatedWork W2902697987 @default.
- W1874540038 hasRelatedWork W2952974774 @default.
- W1874540038 hasRelatedWork W2983383722 @default.
- W1874540038 hasRelatedWork W3020541548 @default.
- W1874540038 hasRelatedWork W35027485 @default.
- W1874540038 hasRelatedWork W44597921 @default.
- W1874540038 hasRelatedWork W2661427188 @default.
- W1874540038 isParatext "false" @default.
- W1874540038 isRetracted "false" @default.
- W1874540038 magId "1874540038" @default.
- W1874540038 workType "dissertation" @default.