Matches in SemOpenAlex for { <https://semopenalex.org/work/W3081775373> ?p ?o ?g. }
Showing items 1 to 54 of 54 with 100 items per page.
- W3081775373 abstract "In this paper, we study metrics for evaluating OCR performance both in terms of physical segmentation and in terms of textual content recognition. These metrics rely on the OCR output (hypothesis) and the reference (also called ground truth) input format. Two evaluation criteria are considered: the quality of segmentation and the character recognition rate. Three pairs of input formats are selected among two types of inputs: text only (text) and text with spatial information (xml). These reference-to-hypothesis input pairs are: 1) text-to-text, 2) xml-to-xml and 3) text-to-xml. For the text-to-text pair, we selected the RETAS method to perform experiments and show its limits. Regarding text-to-xml, a new method based on unique word anchors is proposed to solve the problem of aligning texts with different information. We define the ZoneMapAltCnt metric for the xml-to-xml approach and show that it offers the most reliable and complete evaluation compared to the other two. Open source OCRs like Tesseract and OCRopus are selected to perform experiments. The datasets used are a collection of documents from the ISTEX document database, from the French newspaper Le Nouvel Observateur as well as invoices and administrative documents gathered from different collaborations." @default.
- W3081775373 created "2020-09-08" @default.
- W3081775373 creator A5002810227 @default.
- W3081775373 creator A5030996783 @default.
- W3081775373 creator A5063621495 @default.
- W3081775373 date "2018-07-30" @default.
- W3081775373 modified "2023-10-07" @default.
- W3081775373 title "Metrics for Complete Evaluation of OCR Performance" @default.
- W3081775373 cites W1966950310 @default.
- W3081775373 cites W2002006695 @default.
- W3081775373 cites W2004165551 @default.
- W3081775373 cites W2013848960 @default.
- W3081775373 cites W2035347361 @default.
- W3081775373 cites W2056518953 @default.
- W3081775373 cites W2106215871 @default.
- W3081775373 cites W2131132193 @default.
- W3081775373 cites W2149551320 @default.
- W3081775373 cites W2283184170 @default.
- W3081775373 hasPublicationYear "2018" @default.
- W3081775373 type Work @default.
- W3081775373 sameAs 3081775373 @default.
- W3081775373 citedByCount "1" @default.
- W3081775373 countsByYear W30817753732021 @default.
- W3081775373 crossrefType "proceedings-article" @default.
- W3081775373 hasAuthorship W3081775373A5002810227 @default.
- W3081775373 hasAuthorship W3081775373A5030996783 @default.
- W3081775373 hasAuthorship W3081775373A5063621495 @default.
- W3081775373 hasBestOaLocation W30817753731 @default.
- W3081775373 hasConcept C115961682 @default.
- W3081775373 hasConcept C154945302 @default.
- W3081775373 hasConcept C41008148 @default.
- W3081775373 hasConcept C546480517 @default.
- W3081775373 hasConceptScore W3081775373C115961682 @default.
- W3081775373 hasConceptScore W3081775373C154945302 @default.
- W3081775373 hasConceptScore W3081775373C41008148 @default.
- W3081775373 hasConceptScore W3081775373C546480517 @default.
- W3081775373 hasLocation W30817753731 @default.
- W3081775373 hasLocation W30817753732 @default.
- W3081775373 hasOpenAccess W3081775373 @default.
- W3081775373 hasPrimaryLocation W30817753731 @default.
- W3081775373 hasRelatedWork W1844973080 @default.
- W3081775373 hasRelatedWork W2022010866 @default.
- W3081775373 hasRelatedWork W2028958034 @default.
- W3081775373 hasRelatedWork W2142932873 @default.
- W3081775373 hasRelatedWork W2405914773 @default.
- W3081775373 hasRelatedWork W2948131761 @default.
- W3081775373 hasRelatedWork W3092287996 @default.
- W3081775373 hasRelatedWork W3107474891 @default.
- W3081775373 hasRelatedWork W3216174593 @default.
- W3081775373 hasRelatedWork W4226285292 @default.
- W3081775373 isParatext "false" @default.
- W3081775373 isRetracted "false" @default.
- W3081775373 magId "3081775373" @default.
- W3081775373 workType "article" @default.
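
The abstract of W3081775373 names the character recognition rate as one of its two evaluation criteria, for the text-to-text pair of reference and hypothesis. The record does not give the paper's exact formula, so the following is only a minimal sketch under an assumption: it takes the rate to be 1 minus the Levenshtein edit distance divided by the reference length, a common convention that may differ from what the authors use. The function names `levenshtein` and `character_accuracy` are hypothetical and chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop to limit row size.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # drop char_a
                current[j - 1] + 1,      # add char_b
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - edit_distance / len(reference), floored at 0.

    This is an illustrative convention, not the metric defined in the paper.
    """
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = levenshtein(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))
```

A score of this kind only compares text content. It says nothing about segmentation quality, which the abstract treats as a separate criterion handled by the xml-to-xml approach.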