Matches in SemOpenAlex for { <https://semopenalex.org/work/W2071214273> ?p ?o ?g. }
Showing items 1 to 84 of 84 with 100 items per page.
- W2071214273 abstract "For a long time, automatic reading has been limited to optical character recognition. One year ago, except for one high end product, all industrial software or hardware products were limited to the reading of mono-column texts without images. This does not correspond to real life needs. In a current company, pages which need to be transformed into electronic form are not only typewritten pages, but also complex pages from professional magazines, technical manuals, financial reports and tables, administrative documents, various directories, lists of spare parts, etc. The real problem of automatic reading is to transform such complex paper pages including columns, images, drawings, titles, footnotes, legends, tables, occasionally in landscape format, into a computer file without the help of an operator. Moreover, the problem is to perform this operation at an economical cost with limited computer resources in terms of processor and memory. A lot of difficulties need to be overcome to solve this problem: 1.1. First of all, the system has to be able to segment the page into homogeneous zones, and to label these zones in terms of text, image, drawing, title, in order to adapt downstream processing to each case, avoiding, for example, searching for lines inside an image. This classification may be very difficult when images or drawings contain letters, words, or legends. This has to be performed in complex cases with, for instance, heading paragraphs covering several columns. 1.2. Adjacent columns have to be separated, that is, recognized as separate blocks. This preliminary separation is needed because facing lines of adjacent blocks generally are not on the same base line due to variable line-spacing from block to block. During the recognition phase, the position of letters or signs with respect to the base line will have to be considered in several cases, like for: p 1 j ' , - etc. So an accurate determination of the base line is a necessity. This separation is sometimes made difficult by narrow margins between blocks, tilt of the page, embedded blocks, etc. 1.3. Inside each block, each line has to be located in spite of misleading perturbations like: - drop caps facing several text lines; - underspacing of lines which causes the bottom of some letters like p, q, j to be below the top of letters on the next line, needing an intelligent separation between lines; - in the preceding case, vertically touching letters like a g above an l, which is a more severe situation; - non-parallelism or inclination of lines by several degrees. 1.4. Inside each line, each letter or sign has to be isolated before recognition, and that is not easy when letters are overlapping like with italic characters, f, T, etc. The difficulty comes from the fact that with diacritic signs like i ; : and European accented letters like i, 6, n, the separation algorithm has to keep the two or three components of the sign grouped together while in other cases it has to separate nested characters. This problem is referred to as the kerning problem." @default.
- W2071214273 created "2016-06-24" @default.
- W2071214273 creator A5022218435 @default.
- W2071214273 date "1989-07-24" @default.
- W2071214273 modified "2023-09-23" @default.
- W2071214273 title "Progress In Automatic Reading Of Complex Typeset Pages" @default.
- W2071214273 doi "https://doi.org/10.1117/12.952585" @default.
- W2071214273 hasPublicationYear "1989" @default.
- W2071214273 type Work @default.
- W2071214273 sameAs 2071214273 @default.
- W2071214273 citedByCount "0" @default.
- W2071214273 crossrefType "proceedings-article" @default.
- W2071214273 hasAuthorship W2071214273A5022218435 @default.
- W2071214273 hasConcept C115961682 @default.
- W2071214273 hasConcept C121684516 @default.
- W2071214273 hasConcept C126042441 @default.
- W2071214273 hasConcept C127413603 @default.
- W2071214273 hasConcept C138885662 @default.
- W2071214273 hasConcept C154945302 @default.
- W2071214273 hasConcept C194648553 @default.
- W2071214273 hasConcept C199360897 @default.
- W2071214273 hasConcept C199639397 @default.
- W2071214273 hasConcept C23123220 @default.
- W2071214273 hasConcept C2524010 @default.
- W2071214273 hasConcept C2777737414 @default.
- W2071214273 hasConcept C2777904410 @default.
- W2071214273 hasConcept C2780551164 @default.
- W2071214273 hasConcept C33923547 @default.
- W2071214273 hasConcept C41008148 @default.
- W2071214273 hasConcept C41895202 @default.
- W2071214273 hasConcept C546480517 @default.
- W2071214273 hasConcept C554936623 @default.
- W2071214273 hasConcept C76155785 @default.
- W2071214273 hasConcept C78519656 @default.
- W2071214273 hasConcept C90673727 @default.
- W2071214273 hasConceptScore W2071214273C115961682 @default.
- W2071214273 hasConceptScore W2071214273C121684516 @default.
- W2071214273 hasConceptScore W2071214273C126042441 @default.
- W2071214273 hasConceptScore W2071214273C127413603 @default.
- W2071214273 hasConceptScore W2071214273C138885662 @default.
- W2071214273 hasConceptScore W2071214273C154945302 @default.
- W2071214273 hasConceptScore W2071214273C194648553 @default.
- W2071214273 hasConceptScore W2071214273C199360897 @default.
- W2071214273 hasConceptScore W2071214273C199639397 @default.
- W2071214273 hasConceptScore W2071214273C23123220 @default.
- W2071214273 hasConceptScore W2071214273C2524010 @default.
- W2071214273 hasConceptScore W2071214273C2777737414 @default.
- W2071214273 hasConceptScore W2071214273C2777904410 @default.
- W2071214273 hasConceptScore W2071214273C2780551164 @default.
- W2071214273 hasConceptScore W2071214273C33923547 @default.
- W2071214273 hasConceptScore W2071214273C41008148 @default.
- W2071214273 hasConceptScore W2071214273C41895202 @default.
- W2071214273 hasConceptScore W2071214273C546480517 @default.
- W2071214273 hasConceptScore W2071214273C554936623 @default.
- W2071214273 hasConceptScore W2071214273C76155785 @default.
- W2071214273 hasConceptScore W2071214273C78519656 @default.
- W2071214273 hasConceptScore W2071214273C90673727 @default.
- W2071214273 hasLocation W20712142731 @default.
- W2071214273 hasOpenAccess W2071214273 @default.
- W2071214273 hasPrimaryLocation W20712142731 @default.
- W2071214273 hasRelatedWork W1967058978 @default.
- W2071214273 hasRelatedWork W2399137940 @default.
- W2071214273 hasRelatedWork W2477383750 @default.
- W2071214273 hasRelatedWork W2483370795 @default.
- W2071214273 hasRelatedWork W3007394890 @default.
- W2071214273 hasRelatedWork W3007479582 @default.
- W2071214273 hasRelatedWork W635412534 @default.
- W2071214273 hasRelatedWork W2556785539 @default.
- W2071214273 hasRelatedWork W2580216647 @default.
- W2071214273 hasRelatedWork W2602957451 @default.
- W2071214273 hasRelatedWork W266618795 @default.
- W2071214273 hasRelatedWork W2778910736 @default.
- W2071214273 hasRelatedWork W2818905296 @default.
- W2071214273 hasRelatedWork W2822340177 @default.
- W2071214273 hasRelatedWork W2836460995 @default.
- W2071214273 hasRelatedWork W2851104414 @default.
- W2071214273 hasRelatedWork W2857708494 @default.
- W2071214273 hasRelatedWork W2859893681 @default.
- W2071214273 hasRelatedWork W3142896680 @default.
- W2071214273 hasRelatedWork W3147918116 @default.
- W2071214273 isParatext "false" @default.
- W2071214273 isRetracted "false" @default.
- W2071214273 magId "2071214273" @default.
- W2071214273 workType "article" @default.