Matches in SemOpenAlex for { <https://semopenalex.org/work/W4286751465> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W4286751465 endingPage "108898" @default.
- W4286751465 startingPage "108898" @default.
- W4286751465 abstract "In this paper, we aim to obtain improved attention for a visual question answering (VQA) task. It is challenging to provide supervision for attention. An observation we make is that visual explanations as obtained through class activation mappings (specifically Grad-CAM) that are meant to explain the performance of various networks could form a means of supervision. However, as the distributions of attention maps and that of Grad-CAMs differ, it would not be suitable to directly use these as a form of supervision. Rather, we propose the use of a discriminator that aims to distinguish samples of visual explanation and attention maps. The use of adversarial training of the attention regions as a two-player game between attention and explanation serves to bring the distributions of attention maps and visual explanations closer. Significantly, we observe that providing such a means of supervision also results in attention maps that are more closely related to human attention resulting in a substantial improvement over baseline stacked attention network (SAN) models. It also results in a good improvement in rank correlation metric on the VQA task. This method can also be combined with recent MCB based methods and results in consistent improvement. We also provide comparisons with other means for learning distributions such as based on Correlation Alignment (Coral), Maximum Mean Discrepancy (MMD) and Mean Square Error (MSE) losses and observe that the adversarial loss outperforms the other forms of learning the attention maps. A generalization of the work is also provided by extending our approach to the task of ‘Visual Dialog’ where the attention is more contextual. Thorough evaluation for this task is also provided. Visualization of the results confirms our hypothesis that attention maps improve using the proposed form of supervision." @default.
- W4286751465 created "2022-07-23" @default.
- W4286751465 creator A5007109424 @default.
- W4286751465 creator A5028945877 @default.
- W4286751465 creator A5035714586 @default.
- W4286751465 date "2022-12-01" @default.
- W4286751465 modified "2023-10-02" @default.
- W4286751465 title "Explanation vs. attention: A two-player game to obtain attention for VQA and visual dialog" @default.
- W4286751465 cites W1965555277 @default.
- W4286751465 cites W1983927101 @default.
- W4286751465 cites W2056046632 @default.
- W4286751465 cites W2067050450 @default.
- W4286751465 cites W2149557440 @default.
- W4286751465 cites W2560920409 @default.
- W4286751465 cites W2913943056 @default.
- W4286751465 cites W2964065333 @default.
- W4286751465 cites W2964303913 @default.
- W4286751465 cites W3041222599 @default.
- W4286751465 cites W3044175177 @default.
- W4286751465 cites W3049582337 @default.
- W4286751465 cites W3118856670 @default.
- W4286751465 cites W3152635550 @default.
- W4286751465 cites W3166099811 @default.
- W4286751465 cites W3201068857 @default.
- W4286751465 cites W3216857888 @default.
- W4286751465 doi "https://doi.org/10.1016/j.patcog.2022.108898" @default.
- W4286751465 hasPublicationYear "2022" @default.
- W4286751465 type Work @default.
- W4286751465 citedByCount "2" @default.
- W4286751465 countsByYear W42867514652023 @default.
- W4286751465 crossrefType "journal-article" @default.
- W4286751465 hasAuthorship W4286751465A5007109424 @default.
- W4286751465 hasAuthorship W4286751465A5028945877 @default.
- W4286751465 hasAuthorship W4286751465A5035714586 @default.
- W4286751465 hasConcept C111368507 @default.
- W4286751465 hasConcept C114614502 @default.
- W4286751465 hasConcept C119857082 @default.
- W4286751465 hasConcept C12725497 @default.
- W4286751465 hasConcept C127313418 @default.
- W4286751465 hasConcept C134306372 @default.
- W4286751465 hasConcept C154945302 @default.
- W4286751465 hasConcept C162324750 @default.
- W4286751465 hasConcept C164226766 @default.
- W4286751465 hasConcept C176217482 @default.
- W4286751465 hasConcept C177148314 @default.
- W4286751465 hasConcept C187736073 @default.
- W4286751465 hasConcept C21547014 @default.
- W4286751465 hasConcept C2777212361 @default.
- W4286751465 hasConcept C2779803651 @default.
- W4286751465 hasConcept C2780451532 @default.
- W4286751465 hasConcept C33923547 @default.
- W4286751465 hasConcept C41008148 @default.
- W4286751465 hasConcept C76155785 @default.
- W4286751465 hasConcept C94915269 @default.
- W4286751465 hasConceptScore W4286751465C111368507 @default.
- W4286751465 hasConceptScore W4286751465C114614502 @default.
- W4286751465 hasConceptScore W4286751465C119857082 @default.
- W4286751465 hasConceptScore W4286751465C12725497 @default.
- W4286751465 hasConceptScore W4286751465C127313418 @default.
- W4286751465 hasConceptScore W4286751465C134306372 @default.
- W4286751465 hasConceptScore W4286751465C154945302 @default.
- W4286751465 hasConceptScore W4286751465C162324750 @default.
- W4286751465 hasConceptScore W4286751465C164226766 @default.
- W4286751465 hasConceptScore W4286751465C176217482 @default.
- W4286751465 hasConceptScore W4286751465C177148314 @default.
- W4286751465 hasConceptScore W4286751465C187736073 @default.
- W4286751465 hasConceptScore W4286751465C21547014 @default.
- W4286751465 hasConceptScore W4286751465C2777212361 @default.
- W4286751465 hasConceptScore W4286751465C2779803651 @default.
- W4286751465 hasConceptScore W4286751465C2780451532 @default.
- W4286751465 hasConceptScore W4286751465C33923547 @default.
- W4286751465 hasConceptScore W4286751465C41008148 @default.
- W4286751465 hasConceptScore W4286751465C76155785 @default.
- W4286751465 hasConceptScore W4286751465C94915269 @default.
- W4286751465 hasLocation W42867514651 @default.
- W4286751465 hasOpenAccess W4286751465 @default.
- W4286751465 hasPrimaryLocation W42867514651 @default.
- W4286751465 hasRelatedWork W2906643110 @default.
- W4286751465 hasRelatedWork W2933063649 @default.
- W4286751465 hasRelatedWork W2951337574 @default.
- W4286751465 hasRelatedWork W2961085424 @default.
- W4286751465 hasRelatedWork W2975513049 @default.
- W4286751465 hasRelatedWork W4280544492 @default.
- W4286751465 hasRelatedWork W4286629047 @default.
- W4286751465 hasRelatedWork W4306674287 @default.
- W4286751465 hasRelatedWork W4319453497 @default.
- W4286751465 hasRelatedWork W4224009465 @default.
- W4286751465 hasVolume "132" @default.
- W4286751465 isParatext "false" @default.
- W4286751465 isRetracted "false" @default.
- W4286751465 workType "article" @default.