Matches in SemOpenAlex for { <https://semopenalex.org/work/W3115476810> ?p ?o ?g. }
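The listing below can be reproduced programmatically. The following is a minimal sketch that sends the same basic graph pattern to a SPARQL endpoint and prints the matched predicate/object pairs. The endpoint URL (`https://semopenalex.org/sparql`) and the `GRAPH ?g` wrapping of the `?g` variable are assumptions for illustration, not taken from the listing itself; verify them against the SemOpenAlex documentation before relying on this.

```python
import requests

# Assumed public SemOpenAlex SPARQL endpoint; confirm the exact address before use.
ENDPOINT = "https://semopenalex.org/sparql"

# Same basic graph pattern as in the header line, with ?g interpreted as the named graph.
QUERY = """
SELECT ?p ?o ?g WHERE {
  GRAPH ?g {
    <https://semopenalex.org/work/W3115476810> ?p ?o .
  }
}
"""

def fetch_matches():
    # Request SPARQL JSON results; most triple stores accept this media type.
    resp = requests.get(
        ENDPOINT,
        params={"query": QUERY},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    resp.raise_for_status()
    # Print each predicate/object pair, mirroring the "- W3115476810 <p> <o>" lines below.
    for binding in resp.json()["results"]["bindings"]:
        print(binding["p"]["value"], binding["o"]["value"])

if __name__ == "__main__":
    fetch_matches()
```

The matches returned for this work are listed below.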
- W3115476810 endingPage "2767" @default.
- W3115476810 startingPage "2758" @default.
- W3115476810 abstract "Though beneficial for encouraging visual question answering (VQA) models to discover underlying knowledge by exploiting input-output correlations beyond the image and text contexts, the existing knowledge VQA data sets are mostly annotated in a crowdsourced way, e.g., by collecting questions and external reasons from different users via the Internet. In addition to the challenge of knowledge reasoning, how to deal with annotator bias also remains unsolved; such bias often leads to superficial, overfitted correlations between questions and answers. To address this issue, we propose a novel data set named knowledge-routed visual question reasoning for VQA model evaluation. Considering that a desirable VQA model should correctly perceive the image context, understand the question, and incorporate its learned knowledge, our proposed data set aims to cut off the shortcut learning exploited by current deep embedding models and push the research boundary of knowledge-based visual question reasoning. Specifically, we generate each question-answer pair from both the Visual Genome scene graph and an external knowledge base with controlled programs, so as to disentangle the knowledge from other biases. The programs select one or two triplets from the scene graph or knowledge base to require multistep reasoning, avoid answer ambiguity, and balance the answer distribution. In contrast to the existing VQA data sets, we further impose the following two major constraints on the programs to incorporate knowledge reasoning. First, multiple knowledge triplets can be related to the question, but only one of them relates to the image object. This forces the VQA model to correctly perceive the image instead of guessing the knowledge from the question alone. Second, all questions are based on different knowledge, but the candidate answers are the same for both the training and test sets. We keep the test knowledge unused during training to evaluate whether a model can understand the question words and handle unseen combinations. Extensive experiments with various baselines and state-of-the-art VQA models demonstrate that there still exists a large gap between models with and without ground-truth supporting triplets when given the embedded knowledge base. This reveals the weakness of current deep embedding models on the knowledge reasoning problem." @default.
- W3115476810 created "2021-01-05" @default.
- W3115476810 creator A5053115593 @default.
- W3115476810 creator A5059253391 @default.
- W3115476810 creator A5079046860 @default.
- W3115476810 creator A5088124671 @default.
- W3115476810 date "2022-07-01" @default.
- W3115476810 modified "2023-10-17" @default.
- W3115476810 title "Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding" @default.
- W3115476810 cites W1933349210 @default.
- W3115476810 cites W2083897630 @default.
- W3115476810 cites W2277195237 @default.
- W3115476810 cites W2471094925 @default.
- W3115476810 cites W2560730294 @default.
- W3115476810 cites W2561529111 @default.
- W3115476810 cites W2561715562 @default.
- W3115476810 cites W2735608653 @default.
- W3115476810 cites W2745461083 @default.
- W3115476810 cites W2747623286 @default.
- W3115476810 cites W2760103357 @default.
- W3115476810 cites W2891394954 @default.
- W3115476810 cites W2947312908 @default.
- W3115476810 cites W2950339735 @default.
- W3115476810 cites W2962684798 @default.
- W3115476810 cites W2962749469 @default.
- W3115476810 cites W2963012286 @default.
- W3115476810 cites W2963115613 @default.
- W3115476810 cites W2963143606 @default.
- W3115476810 cites W2963150162 @default.
- W3115476810 cites W2963191264 @default.
- W3115476810 cites W2963224792 @default.
- W3115476810 cites W2963383024 @default.
- W3115476810 cites W2963420691 @default.
- W3115476810 cites W2963518342 @default.
- W3115476810 cites W2963656855 @default.
- W3115476810 cites W2963717374 @default.
- W3115476810 cites W2963954913 @default.
- W3115476810 cites W2963976294 @default.
- W3115476810 cites W2964118342 @default.
- W3115476810 cites W2964245290 @default.
- W3115476810 cites W2964303913 @default.
- W3115476810 cites W2964345214 @default.
- W3115476810 cites W2966683369 @default.
- W3115476810 cites W2970231061 @default.
- W3115476810 cites W2998385486 @default.
- W3115476810 doi "https://doi.org/10.1109/tnnls.2020.3045034" @default.
- W3115476810 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/33385313" @default.
- W3115476810 hasPublicationYear "2022" @default.
- W3115476810 type Work @default.
- W3115476810 sameAs 3115476810 @default.
- W3115476810 citedByCount "8" @default.
- W3115476810 countsByYear W31154768102019 @default.
- W3115476810 countsByYear W31154768102021 @default.
- W3115476810 countsByYear W31154768102022 @default.
- W3115476810 countsByYear W31154768102023 @default.
- W3115476810 crossrefType "journal-article" @default.
- W3115476810 hasAuthorship W3115476810A5053115593 @default.
- W3115476810 hasAuthorship W3115476810A5059253391 @default.
- W3115476810 hasAuthorship W3115476810A5079046860 @default.
- W3115476810 hasAuthorship W3115476810A5088124671 @default.
- W3115476810 hasBestOaLocation W31154768102 @default.
- W3115476810 hasConcept C119857082 @default.
- W3115476810 hasConcept C120567893 @default.
- W3115476810 hasConcept C151730666 @default.
- W3115476810 hasConcept C154945302 @default.
- W3115476810 hasConcept C161301231 @default.
- W3115476810 hasConcept C177264268 @default.
- W3115476810 hasConcept C199360897 @default.
- W3115476810 hasConcept C23123220 @default.
- W3115476810 hasConcept C2777508537 @default.
- W3115476810 hasConcept C2779343474 @default.
- W3115476810 hasConcept C2780522230 @default.
- W3115476810 hasConcept C30542707 @default.
- W3115476810 hasConcept C41008148 @default.
- W3115476810 hasConcept C41608201 @default.
- W3115476810 hasConcept C44291984 @default.
- W3115476810 hasConcept C4554734 @default.
- W3115476810 hasConcept C86803240 @default.
- W3115476810 hasConceptScore W3115476810C119857082 @default.
- W3115476810 hasConceptScore W3115476810C120567893 @default.
- W3115476810 hasConceptScore W3115476810C151730666 @default.
- W3115476810 hasConceptScore W3115476810C154945302 @default.
- W3115476810 hasConceptScore W3115476810C161301231 @default.
- W3115476810 hasConceptScore W3115476810C177264268 @default.
- W3115476810 hasConceptScore W3115476810C199360897 @default.
- W3115476810 hasConceptScore W3115476810C23123220 @default.
- W3115476810 hasConceptScore W3115476810C2777508537 @default.
- W3115476810 hasConceptScore W3115476810C2779343474 @default.
- W3115476810 hasConceptScore W3115476810C2780522230 @default.
- W3115476810 hasConceptScore W3115476810C30542707 @default.
- W3115476810 hasConceptScore W3115476810C41008148 @default.
- W3115476810 hasConceptScore W3115476810C41608201 @default.
- W3115476810 hasConceptScore W3115476810C44291984 @default.
- W3115476810 hasConceptScore W3115476810C4554734 @default.
- W3115476810 hasConceptScore W3115476810C86803240 @default.
- W3115476810 hasFunder F4320321001 @default.
- W3115476810 hasFunder F4320321921 @default.
- W3115476810 hasIssue "7" @default.