Matches in SemOpenAlex for { <https://semopenalex.org/work/W2952372688> ?p ?o ?g. }
Showing items 1 to 80 of 80 with 100 items per page.
- W2952372688 abstract "Paragraph-style image captions describe diverse aspects of an image as opposed to the more common single-sentence captions that only provide an abstract description of the image. These paragraph captions can hence contain substantial information of the image for tasks such as visual question answering. Moreover, this textual information is complementary with visual information present in the image because it can discuss both more abstract concepts and more explicit, intermediate symbolic information about objects, events, and scenes that can directly be matched with the textual question and copied into the textual answer (i.e., via easier modality match). Hence, we propose a combined Visual and Textual Question Answering (VTQA) model which takes as input a paragraph caption as well as the corresponding image, and answers the given question based on both inputs. In our model, the inputs are fused to extract related information by cross-attention (early fusion), then fused again in the form of consensus (late fusion), and finally expected answers are given an extra score to enhance the chance of selection (later fusion). Empirical results show that paragraph captions, even when automatically generated (via an RL-based encoder-decoder model), help correctly answer more visual questions. Overall, our joint model, when trained on the Visual Genome dataset, significantly improves the VQA performance over a strong baseline model." @default.
- W2952372688 created "2019-06-27" @default.
- W2952372688 creator A5001987532 @default.
- W2952372688 creator A5041514597 @default.
- W2952372688 date "2019-01-01" @default.
- W2952372688 modified "2023-10-12" @default.
- W2952372688 title "Improving Visual Question Answering by Referring to Generated Paragraph Captions" @default.
- W2952372688 cites W1895641373 @default.
- W2952372688 cites W1933349210 @default.
- W2952372688 cites W1956340063 @default.
- W2952372688 cites W2157331557 @default.
- W2952372688 cites W2250539671 @default.
- W2952372688 cites W2277195237 @default.
- W2952372688 cites W2412400526 @default.
- W2952372688 cites W2463565445 @default.
- W2952372688 cites W2507009361 @default.
- W2952372688 cites W2549599535 @default.
- W2952372688 cites W2551396370 @default.
- W2952372688 cites W2560730294 @default.
- W2952372688 cites W2613718673 @default.
- W2952372688 cites W2745461083 @default.
- W2952372688 cites W2890781596 @default.
- W2952372688 cites W2951684117 @default.
- W2952372688 cites W2962749469 @default.
- W2952372688 cites W2963084599 @default.
- W2952372688 cites W2963150162 @default.
- W2952372688 cites W2963656855 @default.
- W2952372688 cites W2963758027 @default.
- W2952372688 cites W2963954913 @default.
- W2952372688 cites W2964121744 @default.
- W2952372688 doi "https://doi.org/10.18653/v1/p19-1351" @default.
- W2952372688 hasPublicationYear "2019" @default.
- W2952372688 type Work @default.
- W2952372688 sameAs 2952372688 @default.
- W2952372688 citedByCount "15" @default.
- W2952372688 countsByYear W29523726882018 @default.
- W2952372688 countsByYear W29523726882020 @default.
- W2952372688 countsByYear W29523726882021 @default.
- W2952372688 countsByYear W29523726882022 @default.
- W2952372688 countsByYear W29523726882023 @default.
- W2952372688 crossrefType "proceedings-article" @default.
- W2952372688 hasAuthorship W2952372688A5001987532 @default.
- W2952372688 hasAuthorship W2952372688A5041514597 @default.
- W2952372688 hasBestOaLocation W29523726881 @default.
- W2952372688 hasConcept C136764020 @default.
- W2952372688 hasConcept C138885662 @default.
- W2952372688 hasConcept C154945302 @default.
- W2952372688 hasConcept C204321447 @default.
- W2952372688 hasConcept C23123220 @default.
- W2952372688 hasConcept C2777206241 @default.
- W2952372688 hasConcept C41008148 @default.
- W2952372688 hasConcept C41895202 @default.
- W2952372688 hasConcept C44291984 @default.
- W2952372688 hasConceptScore W2952372688C136764020 @default.
- W2952372688 hasConceptScore W2952372688C138885662 @default.
- W2952372688 hasConceptScore W2952372688C154945302 @default.
- W2952372688 hasConceptScore W2952372688C204321447 @default.
- W2952372688 hasConceptScore W2952372688C23123220 @default.
- W2952372688 hasConceptScore W2952372688C2777206241 @default.
- W2952372688 hasConceptScore W2952372688C41008148 @default.
- W2952372688 hasConceptScore W2952372688C41895202 @default.
- W2952372688 hasConceptScore W2952372688C44291984 @default.
- W2952372688 hasLocation W29523726881 @default.
- W2952372688 hasLocation W29523726882 @default.
- W2952372688 hasOpenAccess W2952372688 @default.
- W2952372688 hasPrimaryLocation W29523726881 @default.
- W2952372688 hasRelatedWork W128392744 @default.
- W2952372688 hasRelatedWork W1584662895 @default.
- W2952372688 hasRelatedWork W207304934 @default.
- W2952372688 hasRelatedWork W2233955765 @default.
- W2952372688 hasRelatedWork W2429147410 @default.
- W2952372688 hasRelatedWork W2747680751 @default.
- W2952372688 hasRelatedWork W3107474891 @default.
- W2952372688 hasRelatedWork W3158716790 @default.
- W2952372688 hasRelatedWork W3160526049 @default.
- W2952372688 hasRelatedWork W2952565226 @default.
- W2952372688 isParatext "false" @default.
- W2952372688 isRetracted "false" @default.
- W2952372688 magId "2952372688" @default.
- W2952372688 workType "article" @default.