Matches in SemOpenAlex for { <https://semopenalex.org/work/W3111983786> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W3111983786 abstract "Traditional Optical Character Recognition (OCR) systems that generate text of highly inflectional Indic languages like Hindi tend to suffer from poor accuracy due to a wide alphabet set, compound characters and difficulty in segmenting characters in a word. Automatic spelling error detection and context-sensitive error correction can be used to improve accuracy by post-processing the text generated by these OCR systems. A majority of previously developed language models for error correction of Hindi spelling have been context-free. In this paper, we present Vartani Spellcheck - a context-sensitive approach for spelling correction of Hindi text using a state-of-the-art transformer - BERT in conjunction with the Levenshtein distance algorithm, popularly known as Edit Distance. We use a lookup dictionary and context-based named entity recognition (NER) for detection of possible spelling errors in the text. Our proposed technique has been tested on a large corpus of text generated by the widely used Tesseract OCR on the Hindi epic Ramayana. With an accuracy of 81%, the results show a significant improvement over some of the previously established context-sensitive error correction mechanisms for Hindi. We also explain how Vartani Spellcheck may be used for on-the-fly autocorrect suggestion during continuous typing in a text editor environment." @default.
- W3111983786 created "2020-12-21" @default.
- W3111983786 creator A5000613715 @default.
- W3111983786 creator A5045293768 @default.
- W3111983786 date "2020-12-14" @default.
- W3111983786 modified "2023-09-27" @default.
- W3111983786 title "Vartani Spellcheck - Automatic Context-Sensitive Spelling Correction of OCR-generated Hindi Text Using BERT and Levenshtein Distance." @default.
- W3111983786 cites W1990871427 @default.
- W3111983786 cites W2155591437 @default.
- W3111983786 cites W2197489394 @default.
- W3111983786 cites W2618478662 @default.
- W3111983786 cites W2963341956 @default.
- W3111983786 cites W2982421656 @default.
- W3111983786 cites W3098824823 @default.
- W3111983786 cites W2781518928 @default.
- W3111983786 hasPublicationYear "2020" @default.
- W3111983786 type Work @default.
- W3111983786 sameAs 3111983786 @default.
- W3111983786 citedByCount "0" @default.
- W3111983786 crossrefType "posted-content" @default.
- W3111983786 hasAuthorship W3111983786A5000613715 @default.
- W3111983786 hasAuthorship W3111983786A5045293768 @default.
- W3111983786 hasConcept C103088060 @default.
- W3111983786 hasConcept C11413529 @default.
- W3111983786 hasConcept C115961682 @default.
- W3111983786 hasConcept C138885662 @default.
- W3111983786 hasConcept C151730666 @default.
- W3111983786 hasConcept C154945302 @default.
- W3111983786 hasConcept C204321447 @default.
- W3111983786 hasConcept C2777515626 @default.
- W3111983786 hasConcept C2777801307 @default.
- W3111983786 hasConcept C2779343474 @default.
- W3111983786 hasConcept C28490314 @default.
- W3111983786 hasConcept C41008148 @default.
- W3111983786 hasConcept C41895202 @default.
- W3111983786 hasConcept C44359876 @default.
- W3111983786 hasConcept C519982507 @default.
- W3111983786 hasConcept C546480517 @default.
- W3111983786 hasConcept C86803240 @default.
- W3111983786 hasConceptScore W3111983786C103088060 @default.
- W3111983786 hasConceptScore W3111983786C11413529 @default.
- W3111983786 hasConceptScore W3111983786C115961682 @default.
- W3111983786 hasConceptScore W3111983786C138885662 @default.
- W3111983786 hasConceptScore W3111983786C151730666 @default.
- W3111983786 hasConceptScore W3111983786C154945302 @default.
- W3111983786 hasConceptScore W3111983786C204321447 @default.
- W3111983786 hasConceptScore W3111983786C2777515626 @default.
- W3111983786 hasConceptScore W3111983786C2777801307 @default.
- W3111983786 hasConceptScore W3111983786C2779343474 @default.
- W3111983786 hasConceptScore W3111983786C28490314 @default.
- W3111983786 hasConceptScore W3111983786C41008148 @default.
- W3111983786 hasConceptScore W3111983786C41895202 @default.
- W3111983786 hasConceptScore W3111983786C44359876 @default.
- W3111983786 hasConceptScore W3111983786C519982507 @default.
- W3111983786 hasConceptScore W3111983786C546480517 @default.
- W3111983786 hasConceptScore W3111983786C86803240 @default.
- W3111983786 hasLocation W31119837861 @default.
- W3111983786 hasOpenAccess W3111983786 @default.
- W3111983786 hasPrimaryLocation W31119837861 @default.
- W3111983786 hasRelatedWork W2048782995 @default.
- W3111983786 hasRelatedWork W2083948915 @default.
- W3111983786 hasRelatedWork W2089316759 @default.
- W3111983786 hasRelatedWork W2101911356 @default.
- W3111983786 hasRelatedWork W2121686674 @default.
- W3111983786 hasRelatedWork W2177957627 @default.
- W3111983786 hasRelatedWork W2313713951 @default.
- W3111983786 hasRelatedWork W2391408780 @default.
- W3111983786 hasRelatedWork W2616171564 @default.
- W3111983786 hasRelatedWork W2754274582 @default.
- W3111983786 hasRelatedWork W2903354474 @default.
- W3111983786 hasRelatedWork W2973422500 @default.
- W3111983786 hasRelatedWork W3041181542 @default.
- W3111983786 hasRelatedWork W3092282134 @default.
- W3111983786 hasRelatedWork W3124781663 @default.
- W3111983786 hasRelatedWork W3188418914 @default.
- W3111983786 hasRelatedWork W3192188649 @default.
- W3111983786 hasRelatedWork W71071520 @default.
- W3111983786 hasRelatedWork W2957039981 @default.
- W3111983786 hasRelatedWork W3120861875 @default.
- W3111983786 isParatext "false" @default.
- W3111983786 isRetracted "false" @default.
- W3111983786 magId "3111983786" @default.
- W3111983786 workType "article" @default.