Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385804984> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W4385804984 abstract "Text-based VQA aims to answer questions by reading the text present in images. It requires far more understanding of scene-text relationships than the VQA task does. Recent studies have shown that the question-answer pairs in the dataset focus mainly on the text present in the image, that less importance is given to visual features, and that some questions do not require understanding the image at all. Models trained on this dataset predict biased answers because they lack an understanding of visual context. For example, for questions like 'What is written on the signboard?', the answer predicted by the model is always 'STOP', which leads the model to ignore the image. To address these issues, we propose a method to learn visual features (making V matter in TextVQA) along with the OCR features and question features, using the VQA dataset as external knowledge for Text-based VQA. Specifically, we combine the TextVQA dataset and the VQA dataset and train the model on this combined dataset. This simple yet effective approach increases the understanding of, and correlation between, the image features and the text present in the image, which helps the model answer questions better. We further test the model on different datasets and compare their qualitative and quantitative results." @default.
- W4385804984 created "2023-08-15" @default.
- W4385804984 creator A5004782135 @default.
- W4385804984 creator A5059164512 @default.
- W4385804984 creator A5082105653 @default.
- W4385804984 date "2023-06-01" @default.
- W4385804984 modified "2023-09-26" @default.
- W4385804984 title "Making the V in Text-VQA Matter" @default.
- W4385804984 cites W1933349210 @default.
- W4385804984 cites W2809273748 @default.
- W4385804984 cites W2963115613 @default.
- W4385804984 cites W2979382951 @default.
- W4385804984 cites W2983256121 @default.
- W4385804984 cites W2988326850 @default.
- W4385804984 cites W3004268082 @default.
- W4385804984 cites W3034287395 @default.
- W4385804984 cites W3034336960 @default.
- W4385804984 cites W3035145964 @default.
- W4385804984 cites W3099884329 @default.
- W4385804984 cites W3110575265 @default.
- W4385804984 cites W3120043490 @default.
- W4385804984 cites W3177445028 @default.
- W4385804984 cites W3177934633 @default.
- W4385804984 cites W3181159501 @default.
- W4385804984 cites W3196405434 @default.
- W4385804984 cites W3204222276 @default.
- W4385804984 cites W3215633354 @default.
- W4385804984 cites W4213057961 @default.
- W4385804984 cites W4319300063 @default.
- W4385804984 doi "https://doi.org/10.1109/cvprw59228.2023.00590" @default.
- W4385804984 hasPublicationYear "2023" @default.
- W4385804984 type Work @default.
- W4385804984 citedByCount "0" @default.
- W4385804984 crossrefType "proceedings-article" @default.
- W4385804984 hasAuthorship W4385804984A5004782135 @default.
- W4385804984 hasAuthorship W4385804984A5059164512 @default.
- W4385804984 hasAuthorship W4385804984A5082105653 @default.
- W4385804984 hasConcept C115961682 @default.
- W4385804984 hasConcept C119857082 @default.
- W4385804984 hasConcept C151730666 @default.
- W4385804984 hasConcept C153180895 @default.
- W4385804984 hasConcept C154945302 @default.
- W4385804984 hasConcept C162324750 @default.
- W4385804984 hasConcept C17744445 @default.
- W4385804984 hasConcept C187736073 @default.
- W4385804984 hasConcept C199539241 @default.
- W4385804984 hasConcept C204321447 @default.
- W4385804984 hasConcept C23123220 @default.
- W4385804984 hasConcept C2779343474 @default.
- W4385804984 hasConcept C2780451532 @default.
- W4385804984 hasConcept C41008148 @default.
- W4385804984 hasConcept C44291984 @default.
- W4385804984 hasConcept C554936623 @default.
- W4385804984 hasConcept C86803240 @default.
- W4385804984 hasConceptScore W4385804984C115961682 @default.
- W4385804984 hasConceptScore W4385804984C119857082 @default.
- W4385804984 hasConceptScore W4385804984C151730666 @default.
- W4385804984 hasConceptScore W4385804984C153180895 @default.
- W4385804984 hasConceptScore W4385804984C154945302 @default.
- W4385804984 hasConceptScore W4385804984C162324750 @default.
- W4385804984 hasConceptScore W4385804984C17744445 @default.
- W4385804984 hasConceptScore W4385804984C187736073 @default.
- W4385804984 hasConceptScore W4385804984C199539241 @default.
- W4385804984 hasConceptScore W4385804984C204321447 @default.
- W4385804984 hasConceptScore W4385804984C23123220 @default.
- W4385804984 hasConceptScore W4385804984C2779343474 @default.
- W4385804984 hasConceptScore W4385804984C2780451532 @default.
- W4385804984 hasConceptScore W4385804984C41008148 @default.
- W4385804984 hasConceptScore W4385804984C44291984 @default.
- W4385804984 hasConceptScore W4385804984C554936623 @default.
- W4385804984 hasConceptScore W4385804984C86803240 @default.
- W4385804984 hasLocation W43858049841 @default.
- W4385804984 hasOpenAccess W4385804984 @default.
- W4385804984 hasPrimaryLocation W43858049841 @default.
- W4385804984 hasRelatedWork W15319282 @default.
- W4385804984 hasRelatedWork W1560657467 @default.
- W4385804984 hasRelatedWork W2051167396 @default.
- W4385804984 hasRelatedWork W207304934 @default.
- W4385804984 hasRelatedWork W2151407063 @default.
- W4385804984 hasRelatedWork W2153711059 @default.
- W4385804984 hasRelatedWork W219090214 @default.
- W4385804984 hasRelatedWork W2915781047 @default.
- W4385804984 hasRelatedWork W2970044932 @default.
- W4385804984 hasRelatedWork W4377703168 @default.
- W4385804984 isParatext "false" @default.
- W4385804984 isRetracted "false" @default.
- W4385804984 workType "article" @default.