Matches in SemOpenAlex for { <https://semopenalex.org/work/W2063828767> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W2063828767 abstract "For noisy, historical documents, a high optical character recognition (OCR) word error rate (WER) can render the OCR text unusable. Since image binarization is often the method used to identify foreground pixels, a body of research seeks to improve image-wide binarization directly. Instead of relying on any one imperfect binarization technique, our method incorporates information from multiple simple thresholding binarizations of the same image to improve text output. Using a new corpus of 19th-century newspaper grayscale images for which the text transcription is known, we observe WERs of 13.8% and higher using current binarization techniques and a state-of-the-art OCR engine. Our novel approach combines the OCR outputs from multiple thresholded images by aligning the text output and producing a lattice of word alternatives from which a lattice word error rate (LWER) is calculated. Our results show a LWER of 7.6% when aligning two threshold images and a LWER of 6.8% when aligning five. From the word lattice we commit to one hypothesis by applying the methods of Lund et al. (2011), achieving an improvement over the original OCR output and an 8.41% WER result on this data set." @default.
- W2063828767 created "2016-06-24" @default.
- W2063828767 creator A5065220846 @default.
- W2063828767 creator A5077722320 @default.
- W2063828767 creator A5090791372 @default.
- W2063828767 date "2013-02-04" @default.
- W2063828767 modified "2023-09-23" @default.
- W2063828767 title "Combining multiple thresholding binarization values to improve OCR output" @default.
- W2063828767 cites W1603541793 @default.
- W2063828767 cites W1972016971 @default.
- W2063828767 cites W1975679321 @default.
- W2063828767 cites W2002638840 @default.
- W2063828767 cites W2015230810 @default.
- W2063828767 cites W2024348326 @default.
- W2063828767 cites W2033339460 @default.
- W2063828767 cites W2045566081 @default.
- W2063828767 cites W2069189382 @default.
- W2063828767 cites W2095581275 @default.
- W2063828767 cites W2100347543 @default.
- W2063828767 cites W2104543051 @default.
- W2063828767 cites W2108414239 @default.
- W2063828767 cites W2114781615 @default.
- W2063828767 cites W2118756867 @default.
- W2063828767 cites W2124990912 @default.
- W2063828767 cites W2127319406 @default.
- W2063828767 cites W2128060444 @default.
- W2063828767 cites W2128613007 @default.
- W2063828767 cites W2132886386 @default.
- W2063828767 cites W2133059825 @default.
- W2063828767 cites W2144872023 @default.
- W2063828767 cites W2158275940 @default.
- W2063828767 cites W2158698381 @default.
- W2063828767 cites W34011602 @default.
- W2063828767 cites W2256616724 @default.
- W2063828767 doi "https://doi.org/10.1117/12.2006228" @default.
- W2063828767 hasPublicationYear "2013" @default.
- W2063828767 type Work @default.
- W2063828767 sameAs 2063828767 @default.
- W2063828767 citedByCount "26" @default.
- W2063828767 countsByYear W20638287672013 @default.
- W2063828767 countsByYear W20638287672015 @default.
- W2063828767 countsByYear W20638287672016 @default.
- W2063828767 countsByYear W20638287672017 @default.
- W2063828767 countsByYear W20638287672018 @default.
- W2063828767 countsByYear W20638287672019 @default.
- W2063828767 countsByYear W20638287672020 @default.
- W2063828767 countsByYear W20638287672021 @default.
- W2063828767 countsByYear W20638287672022 @default.
- W2063828767 countsByYear W20638287672023 @default.
- W2063828767 crossrefType "proceedings-article" @default.
- W2063828767 hasAuthorship W2063828767A5065220846 @default.
- W2063828767 hasAuthorship W2063828767A5077722320 @default.
- W2063828767 hasAuthorship W2063828767A5090791372 @default.
- W2063828767 hasConcept C115961682 @default.
- W2063828767 hasConcept C153180895 @default.
- W2063828767 hasConcept C154945302 @default.
- W2063828767 hasConcept C160633673 @default.
- W2063828767 hasConcept C191178318 @default.
- W2063828767 hasConcept C2524010 @default.
- W2063828767 hasConcept C28490314 @default.
- W2063828767 hasConcept C31972630 @default.
- W2063828767 hasConcept C33923547 @default.
- W2063828767 hasConcept C40969351 @default.
- W2063828767 hasConcept C41008148 @default.
- W2063828767 hasConcept C546480517 @default.
- W2063828767 hasConcept C78201319 @default.
- W2063828767 hasConcept C90805587 @default.
- W2063828767 hasConceptScore W2063828767C115961682 @default.
- W2063828767 hasConceptScore W2063828767C153180895 @default.
- W2063828767 hasConceptScore W2063828767C154945302 @default.
- W2063828767 hasConceptScore W2063828767C160633673 @default.
- W2063828767 hasConceptScore W2063828767C191178318 @default.
- W2063828767 hasConceptScore W2063828767C2524010 @default.
- W2063828767 hasConceptScore W2063828767C28490314 @default.
- W2063828767 hasConceptScore W2063828767C31972630 @default.
- W2063828767 hasConceptScore W2063828767C33923547 @default.
- W2063828767 hasConceptScore W2063828767C40969351 @default.
- W2063828767 hasConceptScore W2063828767C41008148 @default.
- W2063828767 hasConceptScore W2063828767C546480517 @default.
- W2063828767 hasConceptScore W2063828767C78201319 @default.
- W2063828767 hasConceptScore W2063828767C90805587 @default.
- W2063828767 hasLocation W20638287671 @default.
- W2063828767 hasOpenAccess W2063828767 @default.
- W2063828767 hasPrimaryLocation W20638287671 @default.
- W2063828767 hasRelatedWork W2063828767 @default.
- W2063828767 hasRelatedWork W2107877995 @default.
- W2063828767 hasRelatedWork W2157071234 @default.
- W2063828767 hasRelatedWork W2184652563 @default.
- W2063828767 hasRelatedWork W2346354572 @default.
- W2063828767 hasRelatedWork W2533404752 @default.
- W2063828767 hasRelatedWork W2542932817 @default.
- W2063828767 hasRelatedWork W3216174593 @default.
- W2063828767 hasRelatedWork W4226285292 @default.
- W2063828767 hasRelatedWork W981941798 @default.
- W2063828767 isParatext "false" @default.
- W2063828767 isRetracted "false" @default.
- W2063828767 magId "2063828767" @default.
- W2063828767 workType "article" @default.
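
The abstract above reports a word error rate (WER) and a lattice word error rate (LWER) without defining them in this record. The following is a minimal sketch of how the two metrics might be computed, not the authors' implementation. It assumes WER is the word-level edit distance divided by the reference length, and that the lattice is a simple sequence of slots, each holding the set of word alternatives aligned at that position, with a slot counting as correct when any alternative matches the reference word. The function names and the sample strings are hypothetical.

```python
def edit_distance(ref, hyp, match):
    """Levenshtein distance between a list of reference words and a list of
    hypothesis items, using match(ref_word, hyp_item) to decide equality."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if match(r, h) else 1
            curr.append(min(prev[j] + 1,          # reference word missing from hypothesis
                            curr[j - 1] + 1,      # extra item in hypothesis
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate of one OCR output against the known transcription."""
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp, lambda r, h: r == h) / len(ref)


def lattice_wer(reference, lattice):
    """Oracle error rate over a lattice (assumed form: list of sets of words).
    A slot is correct if the reference word is among its alternatives."""
    ref = reference.split()
    return edit_distance(ref, lattice, lambda r, slot: r in slot) / len(ref)


# Hypothetical example: two OCR outputs from two thresholds of the same image.
reference = "the quick brown fox"
ocr_low = "the quick brovvn fox"
ocr_high = "tbe quick brown fox"
# Alternatives aligned by position (alignment itself is assumed done).
lattice = [{"the", "tbe"}, {"quick"}, {"brovvn", "brown"}, {"fox"}]

wer_low = wer(reference, ocr_low)
wer_high = wer(reference, ocr_high)
lwer = lattice_wer(reference, lattice)
```

In this toy case each single output has one wrong word out of four, while the lattice contains the correct word in every slot, which illustrates why the LWER can fall below the WER of any individual thresholded image.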