Matches in SemOpenAlex for { <https://semopenalex.org/work/W2968388725> ?p ?o ?g. }
- W2968388725 abstract "Exploiting relationships between visual regions and question words has achieved great success in learning multi-modality features for Visual Question Answering (VQA). However, we argue that existing methods mostly model relations between individual visual regions and words, which are not enough to correctly answer the question. From a human's perspective, answering a visual question requires understanding the summarizations of visual and language information. In this paper, we propose the Multi-modality Latent Interaction module (MLI) to tackle this problem. The proposed module learns the cross-modality relationships between latent visual and language summarizations, which summarize the visual regions and the question into a small number of latent representations to avoid modeling uninformative individual region-word relations. The cross-modality information between the latent summarizations is propagated to fuse valuable information from both modalities and is used to update the visual and word features. Such MLI modules can be stacked for several stages to model complex and latent relations between the two modalities, and achieve highly competitive performance on the public VQA benchmarks VQA v2.0 and TDIUC. In addition, we show that the performance of our methods can be significantly improved by combining them with the pre-trained language model BERT." @default.
- W2968388725 created "2019-08-22" @default.
- W2968388725 creator A5028272399 @default.
- W2968388725 creator A5050749894 @default.
- W2968388725 creator A5065073978 @default.
- W2968388725 creator A5079410647 @default.
- W2968388725 creator A5084753767 @default.
- W2968388725 date "2019-08-10" @default.
- W2968388725 modified "2023-10-16" @default.
- W2968388725 title "Multi-modality Latent Interaction Network for Visual Question Answering" @default.
- W2968388725 cites W1486649854 @default.
- W2968388725 cites W1514535095 @default.
- W2968388725 cites W1522301498 @default.
- W2968388725 cites W1686810756 @default.
- W2968388725 cites W1861492603 @default.
- W2968388725 cites W1933349210 @default.
- W2968388725 cites W2108598243 @default.
- W2968388725 cites W2153579005 @default.
- W2968388725 cites W2163605009 @default.
- W2968388725 cites W2194775991 @default.
- W2968388725 cites W2250539671 @default.
- W2968388725 cites W2273038706 @default.
- W2968388725 cites W2412400526 @default.
- W2968388725 cites W2439787475 @default.
- W2968388725 cites W2550553598 @default.
- W2968388725 cites W2560730294 @default.
- W2968388725 cites W2561715562 @default.
- W2968388725 cites W2597425697 @default.
- W2968388725 cites W2745461083 @default.
- W2968388725 cites W2747623286 @default.
- W2968388725 cites W2787119853 @default.
- W2968388725 cites W2808877322 @default.
- W2968388725 cites W2885156408 @default.
- W2968388725 cites W2890531016 @default.
- W2968388725 cites W2899771611 @default.
- W2968388725 cites W2903867219 @default.
- W2968388725 cites W2904617485 @default.
- W2968388725 cites W2906305374 @default.
- W2968388725 cites W2908336025 @default.
- W2968388725 cites W2949197413 @default.
- W2968388725 cites W2949269099 @default.
- W2968388725 cites W2949402865 @default.
- W2968388725 cites W2952524542 @default.
- W2968388725 cites W2953106684 @default.
- W2968388725 cites W2956354060 @default.
- W2968388725 cites W2962739339 @default.
- W2968388725 cites W2962933067 @default.
- W2968388725 cites W2963066927 @default.
- W2968388725 cites W2963091558 @default.
- W2968388725 cites W2963176022 @default.
- W2968388725 cites W2963341956 @default.
- W2968388725 cites W2963403868 @default.
- W2968388725 cites W2963446712 @default.
- W2968388725 cites W2963521239 @default.
- W2968388725 cites W2963532541 @default.
- W2968388725 cites W2963668159 @default.
- W2968388725 cites W2963717374 @default.
- W2968388725 cites W2963770662 @default.
- W2968388725 cites W2963907629 @default.
- W2968388725 cites W2963921132 @default.
- W2968388725 cites W2963921921 @default.
- W2968388725 cites W2963954913 @default.
- W2968388725 cites W2964067226 @default.
- W2968388725 cites W2964080601 @default.
- W2968388725 cites W2964138017 @default.
- W2968388725 cites W2969541233 @default.
- W2968388725 cites W2970608575 @default.
- W2968388725 doi "https://doi.org/10.48550/arxiv.1908.04289" @default.
- W2968388725 hasPublicationYear "2019" @default.
- W2968388725 type Work @default.
- W2968388725 sameAs 2968388725 @default.
- W2968388725 citedByCount "19" @default.
- W2968388725 countsByYear W29683887252019 @default.
- W2968388725 countsByYear W29683887252020 @default.
- W2968388725 countsByYear W29683887252021 @default.
- W2968388725 countsByYear W29683887252023 @default.
- W2968388725 crossrefType "posted-content" @default.
- W2968388725 hasAuthorship W2968388725A5028272399 @default.
- W2968388725 hasAuthorship W2968388725A5050749894 @default.
- W2968388725 hasAuthorship W2968388725A5065073978 @default.
- W2968388725 hasAuthorship W2968388725A5079410647 @default.
- W2968388725 hasAuthorship W2968388725A5084753767 @default.
- W2968388725 hasBestOaLocation W29683887251 @default.
- W2968388725 hasConcept C12713177 @default.
- W2968388725 hasConcept C138885662 @default.
- W2968388725 hasConcept C144024400 @default.
- W2968388725 hasConcept C154945302 @default.
- W2968388725 hasConcept C204321447 @default.
- W2968388725 hasConcept C2779903281 @default.
- W2968388725 hasConcept C2780226545 @default.
- W2968388725 hasConcept C36289849 @default.
- W2968388725 hasConcept C41008148 @default.
- W2968388725 hasConcept C41895202 @default.
- W2968388725 hasConcept C44291984 @default.
- W2968388725 hasConcept C59650362 @default.
- W2968388725 hasConcept C90805587 @default.
- W2968388725 hasConceptScore W2968388725C12713177 @default.
- W2968388725 hasConceptScore W2968388725C138885662 @default.
- W2968388725 hasConceptScore W2968388725C144024400 @default.
- W2968388725 hasConceptScore W2968388725C154945302 @default.