Matches in SemOpenAlex for { <https://semopenalex.org/work/W2904291752> ?p ?o ?g. }
- W2904291752 endingPage "6398" @default.
- W2904291752 startingPage "6391" @default.
- W2904291752 abstract "To date, visual question answering (VQA) (i.e., image QA and video QA) is still a holy grail in vision and language understanding, especially for video QA. Compared with image QA, which focuses primarily on understanding the associations between image region-level details and corresponding questions, video QA requires a model to jointly reason across both the spatial and long-range temporal structures of a video, as well as text, to provide an accurate answer. In this paper, we specifically tackle the problem of video QA by proposing a Structured Two-stream Attention network, namely STA, to answer a free-form or open-ended natural language question about the content of a given video. First, we infer rich long-range temporal structures in videos using our structured segment component and encode text features. Then, our structured two-stream attention component simultaneously localizes important visual instances, reduces the influence of background video, and focuses on the relevant text. Finally, the structured two-stream fusion component incorporates different segments of query- and video-aware context representations and infers the answers. Experiments on the large-scale video QA dataset TGIF-QA show that our proposed method significantly surpasses the best counterpart (i.e., with one representation for the video input) by 13.0%, 13.5%, 11.0% and 0.3 for the Action, Trans., FrameQA and Count tasks, respectively. It also outperforms the best competitor (i.e., with two representations) on the Action, Trans., and FrameQA tasks by 4.1%, 4.7%, and 5.1%, respectively." @default.
- W2904291752 created "2018-12-22" @default.
- W2904291752 creator A5017597537 @default.
- W2904291752 creator A5017943466 @default.
- W2904291752 creator A5036987388 @default.
- W2904291752 creator A5052993469 @default.
- W2904291752 creator A5066645546 @default.
- W2904291752 creator A5068917997 @default.
- W2904291752 creator A5087623065 @default.
- W2904291752 date "2019-07-17" @default.
- W2904291752 modified "2023-09-29" @default.
- W2904291752 title "Structured Two-Stream Attention Network for Video Question Answering" @default.
- W2904291752 cites W1488163396 @default.
- W2904291752 cites W1933349210 @default.
- W2904291752 cites W2117539524 @default.
- W2904291752 cites W2194775991 @default.
- W2904291752 cites W2250539671 @default.
- W2904291752 cites W2396147015 @default.
- W2904291752 cites W2463565445 @default.
- W2904291752 cites W2546696630 @default.
- W2904291752 cites W2565656701 @default.
- W2904291752 cites W2621571501 @default.
- W2904291752 cites W2788527657 @default.
- W2904291752 cites W2807941860 @default.
- W2904291752 cites W2895902247 @default.
- W2904291752 cites W2914699769 @default.
- W2904291752 cites W2949197413 @default.
- W2904291752 cites W2949218037 @default.
- W2904291752 cites W2962949233 @default.
- W2904291752 cites W2962958773 @default.
- W2904291752 cites W2963066927 @default.
- W2904291752 cites W2963150162 @default.
- W2904291752 cites W2963176022 @default.
- W2904291752 cites W2963293463 @default.
- W2904291752 cites W2963383024 @default.
- W2904291752 cites W2963466731 @default.
- W2904291752 cites W2963656855 @default.
- W2904291752 cites W2963890755 @default.
- W2904291752 cites W2963954913 @default.
- W2904291752 cites W2964345214 @default.
- W2904291752 cites W3098019619 @default.
- W2904291752 doi "https://doi.org/10.1609/aaai.v33i01.33016391" @default.
- W2904291752 hasPublicationYear "2019" @default.
- W2904291752 type Work @default.
- W2904291752 sameAs 2904291752 @default.
- W2904291752 citedByCount "33" @default.
- W2904291752 countsByYear W29042917522020 @default.
- W2904291752 countsByYear W29042917522021 @default.
- W2904291752 countsByYear W29042917522022 @default.
- W2904291752 countsByYear W29042917522023 @default.
- W2904291752 crossrefType "journal-article" @default.
- W2904291752 hasAuthorship W2904291752A5017597537 @default.
- W2904291752 hasAuthorship W2904291752A5017943466 @default.
- W2904291752 hasAuthorship W2904291752A5036987388 @default.
- W2904291752 hasAuthorship W2904291752A5052993469 @default.
- W2904291752 hasAuthorship W2904291752A5066645546 @default.
- W2904291752 hasAuthorship W2904291752A5068917997 @default.
- W2904291752 hasAuthorship W2904291752A5087623065 @default.
- W2904291752 hasBestOaLocation W29042917521 @default.
- W2904291752 hasConcept C104317684 @default.
- W2904291752 hasConcept C121332964 @default.
- W2904291752 hasConcept C151730666 @default.
- W2904291752 hasConcept C154945302 @default.
- W2904291752 hasConcept C168167062 @default.
- W2904291752 hasConcept C17744445 @default.
- W2904291752 hasConcept C185592680 @default.
- W2904291752 hasConcept C199539241 @default.
- W2904291752 hasConcept C204321447 @default.
- W2904291752 hasConcept C23123220 @default.
- W2904291752 hasConcept C2776359362 @default.
- W2904291752 hasConcept C2779343474 @default.
- W2904291752 hasConcept C2780791683 @default.
- W2904291752 hasConcept C41008148 @default.
- W2904291752 hasConcept C44291984 @default.
- W2904291752 hasConcept C55493867 @default.
- W2904291752 hasConcept C62520636 @default.
- W2904291752 hasConcept C66746571 @default.
- W2904291752 hasConcept C86803240 @default.
- W2904291752 hasConcept C94625758 @default.
- W2904291752 hasConcept C97355855 @default.
- W2904291752 hasConceptScore W2904291752C104317684 @default.
- W2904291752 hasConceptScore W2904291752C121332964 @default.
- W2904291752 hasConceptScore W2904291752C151730666 @default.
- W2904291752 hasConceptScore W2904291752C154945302 @default.
- W2904291752 hasConceptScore W2904291752C168167062 @default.
- W2904291752 hasConceptScore W2904291752C17744445 @default.
- W2904291752 hasConceptScore W2904291752C185592680 @default.
- W2904291752 hasConceptScore W2904291752C199539241 @default.
- W2904291752 hasConceptScore W2904291752C204321447 @default.
- W2904291752 hasConceptScore W2904291752C23123220 @default.
- W2904291752 hasConceptScore W2904291752C2776359362 @default.
- W2904291752 hasConceptScore W2904291752C2779343474 @default.
- W2904291752 hasConceptScore W2904291752C2780791683 @default.
- W2904291752 hasConceptScore W2904291752C41008148 @default.
- W2904291752 hasConceptScore W2904291752C44291984 @default.
- W2904291752 hasConceptScore W2904291752C55493867 @default.
- W2904291752 hasConceptScore W2904291752C62520636 @default.
- W2904291752 hasConceptScore W2904291752C66746571 @default.