Matches in SemOpenAlex for { <https://semopenalex.org/work/W3178709966> ?p ?o ?g. }
- W3178709966 endingPage "37" @default.
- W3178709966 startingPage "1" @default.
- W3178709966 abstract "Optical character recognition (OCR) is one of the most popular techniques used for converting printed documents into machine-readable ones. While OCR engines can do well with modern text, their performance is unfortunately significantly reduced on historical materials. Additionally, many texts have already been processed by various out-of-date digitisation techniques. As a consequence, digitised texts are noisy and need to be post-corrected. This article clarifies the importance of enhancing the quality of OCR results by studying their effects on information retrieval and natural language processing applications. We then define the post-OCR processing problem, illustrate its typical pipeline, and review the state-of-the-art post-OCR processing approaches. Evaluation metrics, accessible datasets, language resources, and useful toolkits are also reported. Furthermore, the work identifies the current trend and outlines some research directions of this field." @default.
- W3178709966 created "2021-07-19" @default.
- W3178709966 creator A5033491986 @default.
- W3178709966 creator A5061375391 @default.
- W3178709966 creator A5063992777 @default.
- W3178709966 creator A5079733597 @default.
- W3178709966 date "2021-07-13" @default.
- W3178709966 modified "2023-10-18" @default.
- W3178709966 title "Survey of Post-OCR Processing Approaches" @default.
- W3178709966 cites W1482504747 @default.
- W3178709966 cites W1573345323 @default.
- W3178709966 cites W1612759903 @default.
- W3178709966 cites W1699166917 @default.
- W3178709966 cites W1771572948 @default.
- W3178709966 cites W1976951044 @default.
- W3178709966 cites W1987932675 @default.
- W3178709966 cites W1990871427 @default.
- W3178709966 cites W2002006695 @default.
- W3178709966 cites W2004920872 @default.
- W3178709966 cites W2011084371 @default.
- W3178709966 cites W2015991179 @default.
- W3178709966 cites W2016909568 @default.
- W3178709966 cites W2018616927 @default.
- W3178709966 cites W2019096529 @default.
- W3178709966 cites W2023291240 @default.
- W3178709966 cites W2036430923 @default.
- W3178709966 cites W2040062114 @default.
- W3178709966 cites W2045566081 @default.
- W3178709966 cites W2047479497 @default.
- W3178709966 cites W2052227241 @default.
- W3178709966 cites W2053276657 @default.
- W3178709966 cites W2056471870 @default.
- W3178709966 cites W2059513841 @default.
- W3178709966 cites W2060380454 @default.
- W3178709966 cites W2064675550 @default.
- W3178709966 cites W2066102695 @default.
- W3178709966 cites W2066792529 @default.
- W3178709966 cites W2069172670 @default.
- W3178709966 cites W2069189382 @default.
- W3178709966 cites W2071003614 @default.
- W3178709966 cites W2080627698 @default.
- W3178709966 cites W2090755665 @default.
- W3178709966 cites W2093931624 @default.
- W3178709966 cites W2110783993 @default.
- W3178709966 cites W2124807415 @default.
- W3178709966 cites W2126888240 @default.
- W3178709966 cites W2142268730 @default.
- W3178709966 cites W2150429754 @default.
- W3178709966 cites W2157224915 @default.
- W3178709966 cites W2158874261 @default.
- W3178709966 cites W2171950509 @default.
- W3178709966 cites W2250539671 @default.
- W3178709966 cites W2251752871 @default.
- W3178709966 cites W2337045451 @default.
- W3178709966 cites W2493916176 @default.
- W3178709966 cites W2510006552 @default.
- W3178709966 cites W2559950387 @default.
- W3178709966 cites W2564757993 @default.
- W3178709966 cites W2582406784 @default.
- W3178709966 cites W2594229957 @default.
- W3178709966 cites W2737639484 @default.
- W3178709966 cites W2753240039 @default.
- W3178709966 cites W2756610685 @default.
- W3178709966 cites W2768817781 @default.
- W3178709966 cites W2786850497 @default.
- W3178709966 cites W2787677190 @default.
- W3178709966 cites W2787747535 @default.
- W3178709966 cites W27881537 @default.
- W3178709966 cites W2793452059 @default.
- W3178709966 cites W2798485145 @default.
- W3178709966 cites W2805313330 @default.
- W3178709966 cites W2809468489 @default.
- W3178709966 cites W2809806860 @default.
- W3178709966 cites W2891810409 @default.
- W3178709966 cites W2900753330 @default.
- W3178709966 cites W2901878905 @default.
- W3178709966 cites W2954289647 @default.
- W3178709966 cites W2963212250 @default.
- W3178709966 cites W2964322605 @default.
- W3178709966 cites W2967691513 @default.
- W3178709966 cites W2968567788 @default.
- W3178709966 cites W2982862540 @default.
- W3178709966 cites W2997591727 @default.
- W3178709966 cites W3004106466 @default.
- W3178709966 cites W3012289950 @default.
- W3178709966 cites W3015310959 @default.
- W3178709966 cites W3039590646 @default.
- W3178709966 cites W3046591412 @default.
- W3178709966 cites W3081407480 @default.
- W3178709966 cites W3110596726 @default.
- W3178709966 cites W3116485719 @default.
- W3178709966 cites W4230543443 @default.
- W3178709966 cites W4256120941 @default.
- W3178709966 cites W4312290170 @default.
- W3178709966 cites W792063712 @default.
- W3178709966 doi "https://doi.org/10.1145/3453476" @default.
- W3178709966 hasPublicationYear "2021" @default.
- W3178709966 type Work @default.