Matches in SemOpenAlex for { <https://semopenalex.org/work/W3216857888> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W3216857888 endingPage "108455" @default.
- W3216857888 startingPage "108455" @default.
- W3216857888 abstract "It is time to stop neglecting the text around your world. In VQA, the surrounding text helps humans to understand complete visual scenes and reason question semantics efficiently. Here, we address the challenging Text-based Visual Question Answering (TextVQA) problem, which requires a model to answer the VQA questions with text reading ability. Existing TextVQA methods mainly focus on the latent relationships between detected object instances and scene texts with the given question, but ignore spatial location relationships and complex relational semantics between visual object instances and OCR texts (e.g. the A of B on C). To deal with these challenges, we propose a novel Text-Instance Graph (TIG) network for TextVQA. The TIG builds an OCR-OBJ graph for overlapping relationships modeling, where each node of graph is updated by utilizing relative objects or OCR texts. To deal with the question with complex logic, we propose a dynamic OCR-OBJ graph network to extend the perception space of graph nodes, which grasps the information of non-directly adjacent node features. Considering a scene about “the brand of the computer on the table”, the model would build correlations between “brand” and “table” using “the computer” node as the intermediate node. Extensive experiments on three benchmarks demonstrate the effectiveness and superiority of the proposed method. In addition, our TIG achieves 0.505 ANLS on ST-VQA challenge leaderboard and sets a new state-of-the-art." @default.
- W3216857888 created "2021-12-06" @default.
- W3216857888 creator A5029423014 @default.
- W3216857888 creator A5036987388 @default.
- W3216857888 creator A5040877128 @default.
- W3216857888 creator A5059368105 @default.
- W3216857888 creator A5066645546 @default.
- W3216857888 creator A5087623065 @default.
- W3216857888 date "2022-04-01" @default.
- W3216857888 modified "2023-10-01" @default.
- W3216857888 title "Text-instance graph: Exploring the relational semantics for text-based visual question answering" @default.
- W3216857888 cites W2747623286 @default.
- W3216857888 cites W2751445731 @default.
- W3216857888 cites W2913618459 @default.
- W3216857888 cites W3044175177 @default.
- W3216857888 cites W3070078896 @default.
- W3216857888 cites W3118856670 @default.
- W3216857888 cites W3124559592 @default.
- W3216857888 doi "https://doi.org/10.1016/j.patcog.2021.108455" @default.
- W3216857888 hasPublicationYear "2022" @default.
- W3216857888 type Work @default.
- W3216857888 sameAs 3216857888 @default.
- W3216857888 citedByCount "13" @default.
- W3216857888 countsByYear W32168578882022 @default.
- W3216857888 countsByYear W32168578882023 @default.
- W3216857888 crossrefType "journal-article" @default.
- W3216857888 hasAuthorship W3216857888A5029423014 @default.
- W3216857888 hasAuthorship W3216857888A5036987388 @default.
- W3216857888 hasAuthorship W3216857888A5040877128 @default.
- W3216857888 hasAuthorship W3216857888A5059368105 @default.
- W3216857888 hasAuthorship W3216857888A5066645546 @default.
- W3216857888 hasAuthorship W3216857888A5087623065 @default.
- W3216857888 hasConcept C132525143 @default.
- W3216857888 hasConcept C154945302 @default.
- W3216857888 hasConcept C176225458 @default.
- W3216857888 hasConcept C184337299 @default.
- W3216857888 hasConcept C199360897 @default.
- W3216857888 hasConcept C204321447 @default.
- W3216857888 hasConcept C23123220 @default.
- W3216857888 hasConcept C36464697 @default.
- W3216857888 hasConcept C41008148 @default.
- W3216857888 hasConcept C44291984 @default.
- W3216857888 hasConcept C80444323 @default.
- W3216857888 hasConceptScore W3216857888C132525143 @default.
- W3216857888 hasConceptScore W3216857888C154945302 @default.
- W3216857888 hasConceptScore W3216857888C176225458 @default.
- W3216857888 hasConceptScore W3216857888C184337299 @default.
- W3216857888 hasConceptScore W3216857888C199360897 @default.
- W3216857888 hasConceptScore W3216857888C204321447 @default.
- W3216857888 hasConceptScore W3216857888C23123220 @default.
- W3216857888 hasConceptScore W3216857888C36464697 @default.
- W3216857888 hasConceptScore W3216857888C41008148 @default.
- W3216857888 hasConceptScore W3216857888C44291984 @default.
- W3216857888 hasConceptScore W3216857888C80444323 @default.
- W3216857888 hasLocation W32168578881 @default.
- W3216857888 hasOpenAccess W3216857888 @default.
- W3216857888 hasPrimaryLocation W32168578881 @default.
- W3216857888 hasRelatedWork W128392744 @default.
- W3216857888 hasRelatedWork W1568866260 @default.
- W3216857888 hasRelatedWork W1846541313 @default.
- W3216857888 hasRelatedWork W207304934 @default.
- W3216857888 hasRelatedWork W2233955765 @default.
- W3216857888 hasRelatedWork W2747680751 @default.
- W3216857888 hasRelatedWork W3097853387 @default.
- W3216857888 hasRelatedWork W3107474891 @default.
- W3216857888 hasRelatedWork W3214915308 @default.
- W3216857888 hasRelatedWork W4293657183 @default.
- W3216857888 hasVolume "124" @default.
- W3216857888 isParatext "false" @default.
- W3216857888 isRetracted "false" @default.
- W3216857888 magId "3216857888" @default.
- W3216857888 workType "article" @default.