Matches in SemOpenAlex for { <https://semopenalex.org/work/W4310335763> ?p ?o ?g. }
- W4310335763 endingPage "7141" @default.
- W4310335763 startingPage "7123" @default.
- W4310335763 abstract "Scene text spotting is of great importance to the computer vision community due to its wide variety of applications. Recent methods attempt to introduce linguistic knowledge for challenging recognition rather than pure visual classification. However, how to effectively model the linguistic rules in end-to-end deep networks remains a research challenge. In this paper, we argue that the limited capacity of language models comes from 1) implicit language modeling; 2) unidirectional feature representation; and 3) language model with noise input. Correspondingly, we propose an autonomous, bidirectional and iterative ABINet++ for scene text spotting. First, the autonomous suggests enforcing explicitly language modeling by decoupling the recognizer into vision model and language model and blocking gradient flow between both models. Second, a novel bidirectional cloze network (BCN) as the language model is proposed based on bidirectional feature representation. Third, we propose an execution manner of iterative correction for the language model which can effectively alleviate the impact of noise input. Additionally, based on an ensemble of the iterative predictions, a self-training method is developed which can learn from unlabeled images effectively. Finally, to polish ABINet++ in long text recognition, we propose to aggregate horizontal features by embedding Transformer units inside a U-Net, and design a position and content attention module which integrates character order and content to attend to character features precisely. ABINet++ achieves state-of-the-art performance on both scene text recognition and scene text spotting benchmarks, which consistently demonstrates the superiority of our method in various environments especially on low-quality images. 
Besides, extensive experiments including in English and Chinese also prove that, a text spotter that incorporates our language modeling method can significantly improve its performance both in accuracy and speed compared with commonly used attention-based recognizers. Code is available at https://github.com/FangShancheng/ABINet-PP." @default.
- W4310335763 created "2022-12-08" @default.
- W4310335763 creator A5008604905 @default.
- W4310335763 creator A5023341829 @default.
- W4310335763 creator A5046305086 @default.
- W4310335763 creator A5054311881 @default.
- W4310335763 creator A5067500703 @default.
- W4310335763 creator A5078162380 @default.
- W4310335763 date "2023-06-01" @default.
- W4310335763 modified "2023-10-18" @default.
- W4310335763 title "ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting" @default.
- W4310335763 cites W1922126009 @default.
- W4310335763 cites W1971822075 @default.
- W4310335763 cites W1981283549 @default.
- W4310335763 cites W2008806374 @default.
- W4310335763 cites W2122585011 @default.
- W4310335763 cites W2127141656 @default.
- W4310335763 cites W2135231474 @default.
- W4310335763 cites W2144554289 @default.
- W4310335763 cites W2146835493 @default.
- W4310335763 cites W2194187530 @default.
- W4310335763 cites W2194775991 @default.
- W4310335763 cites W2294053032 @default.
- W4310335763 cites W2343052201 @default.
- W4310335763 cites W2402268235 @default.
- W4310335763 cites W2532759528 @default.
- W4310335763 cites W2555182955 @default.
- W4310335763 cites W2740767790 @default.
- W4310335763 cites W2750938222 @default.
- W4310335763 cites W2777652944 @default.
- W4310335763 cites W2785383245 @default.
- W4310335763 cites W2795619303 @default.
- W4310335763 cites W2810983211 @default.
- W4310335763 cites W2875814315 @default.
- W4310335763 cites W2896034938 @default.
- W4310335763 cites W2914492226 @default.
- W4310335763 cites W2962739339 @default.
- W4310335763 cites W2962790387 @default.
- W4310335763 cites W2962986948 @default.
- W4310335763 cites W2963150697 @default.
- W4310335763 cites W2963233387 @default.
- W4310335763 cites W2963517393 @default.
- W4310335763 cites W2963712589 @default.
- W4310335763 cites W2964018263 @default.
- W4310335763 cites W2964296749 @default.
- W4310335763 cites W2964312704 @default.
- W4310335763 cites W2965066169 @default.
- W4310335763 cites W2965463054 @default.
- W4310335763 cites W2981969038 @default.
- W4310335763 cites W2982770724 @default.
- W4310335763 cites W2983626510 @default.
- W4310335763 cites W2988098900 @default.
- W4310335763 cites W2996956254 @default.
- W4310335763 cites W2997371611 @default.
- W4310335763 cites W2997749585 @default.
- W4310335763 cites W2997864923 @default.
- W4310335763 cites W2998382406 @default.
- W4310335763 cites W2999874489 @default.
- W4310335763 cites W3003218881 @default.
- W4310335763 cites W3003642782 @default.
- W4310335763 cites W3003711889 @default.
- W4310335763 cites W3003921261 @default.
- W4310335763 cites W3004846386 @default.
- W4310335763 cites W3005436539 @default.
- W4310335763 cites W3034447740 @default.
- W4310335763 cites W3034792612 @default.
- W4310335763 cites W3034971973 @default.
- W4310335763 cites W3035106683 @default.
- W4310335763 cites W3035160371 @default.
- W4310335763 cites W3035449864 @default.
- W4310335763 cites W3082397598 @default.
- W4310335763 cites W3093587902 @default.
- W4310335763 cites W3097932944 @default.
- W4310335763 cites W3110267192 @default.
- W4310335763 cites W3110398855 @default.
- W4310335763 cites W3111172959 @default.
- W4310335763 cites W3152831436 @default.
- W4310335763 cites W3177318507 @default.
- W4310335763 cites W3179426054 @default.
- W4310335763 cites W3181186176 @default.
- W4310335763 cites W3202415716 @default.
- W4310335763 cites W3202912918 @default.
- W4310335763 cites W3206651063 @default.
- W4310335763 cites W70975097 @default.
- W4310335763 doi "https://doi.org/10.1109/tpami.2022.3223908" @default.
- W4310335763 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/36417745" @default.
- W4310335763 hasPublicationYear "2023" @default.
- W4310335763 type Work @default.
- W4310335763 citedByCount "2" @default.
- W4310335763 countsByYear W43103357632023 @default.
- W4310335763 crossrefType "journal-article" @default.
- W4310335763 hasAuthorship W4310335763A5008604905 @default.
- W4310335763 hasAuthorship W4310335763A5023341829 @default.
- W4310335763 hasAuthorship W4310335763A5046305086 @default.
- W4310335763 hasAuthorship W4310335763A5054311881 @default.
- W4310335763 hasAuthorship W4310335763A5067500703 @default.
- W4310335763 hasAuthorship W4310335763A5078162380 @default.
- W4310335763 hasBestOaLocation W43103357632 @default.