Matches in SemOpenAlex for { <https://semopenalex.org/work/W3000042222> ?p ?o ?g. }
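Each match listed for this pattern appears to follow the shape `- subject predicate object @graph.`, with literal objects in double quotes. As a minimal sketch under that assumption, the Python below splits such lines into their parts and groups the objects by predicate. The names `parse_match` and `group_by_predicate` and the regular expression are illustrative only and are not part of SemOpenAlex.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the listing, not from an official spec):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.?\s*$', re.DOTALL)


def parse_match(line):
    """Return (subject, predicate, object, graph), or None if the line does not fit."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Strip the surrounding quotes from literal values such as dates and titles.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect the objects of all parseable lines under their predicate."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_match(line)
        if parsed is not None:
            grouped[parsed[1]].append(parsed[2])
    return dict(grouped)


# A few lines copied from the listing serve as sample input.
sample = [
    '- W3000042222 created "2020-01-23" @default.',
    '- W3000042222 cites W1586939924 @default.',
    '- W3000042222 cites W1601567445 @default.',
    '- W3000042222 type Work @default.',
]
grouped = group_by_predicate(sample)
```

Grouping by predicate makes it easy to pull out, for example, every `cites` target or every `hasConcept` identifier from the listing in a single lookup.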
- W3000042222 abstract "Generating video descriptions automatically is a challenging task that involves a complex interplay between spatio-temporal visual features and language models. Given that videos consist of spatial (frame-level) features and their temporal evolutions, an effective captioning model should be able to attend to these different cues selectively. To this end, we propose a Spatio-Temporal and Temporo-Spatial (STaTS) attention model which, conditioned on the language state, hierarchically combines spatial and temporal attention to videos in two different orders: (i) a spatio-temporal (ST) sub-model, which first attends to regions that have temporal evolution, then temporally pools the features from these regions; and (ii) a temporo-spatial (TS) sub-model, which first decides a single frame to attend to, then applies spatial attention within that frame. We propose a novel LSTM-based temporal ranking function, which we call ranked attention, for the ST model to capture action dynamics. Our entire framework is trained end-to-end. We provide experiments on two benchmark datasets: MSVD and MSR-VTT. Our results demonstrate the synergy between the ST and TS modules, outperforming recent state-of-the-art methods." @default.
- W3000042222 created "2020-01-23" @default.
- W3000042222 creator A5001601327 @default.
- W3000042222 creator A5008369672 @default.
- W3000042222 creator A5024613828 @default.
- W3000042222 creator A5076036593 @default.
- W3000042222 date "2020-01-17" @default.
- W3000042222 modified "2023-09-24" @default.
- W3000042222 title "Spatio-Temporal Ranked-Attention Networks for Video Captioning" @default.
- W3000042222 cites W1586939924 @default.
- W3000042222 cites W1601567445 @default.
- W3000042222 cites W1889081078 @default.
- W3000042222 cites W1905882502 @default.
- W3000042222 cites W1923404803 @default.
- W3000042222 cites W1926645898 @default.
- W3000042222 cites W1945129080 @default.
- W3000042222 cites W1956340063 @default.
- W3000042222 cites W1995820507 @default.
- W3000042222 cites W2064675550 @default.
- W3000042222 cites W2072160811 @default.
- W3000042222 cites W2101105183 @default.
- W3000042222 cites W2110933980 @default.
- W3000042222 cites W2130942839 @default.
- W3000042222 cites W2133459682 @default.
- W3000042222 cites W2136985729 @default.
- W3000042222 cites W2139501017 @default.
- W3000042222 cites W2142900973 @default.
- W3000042222 cites W2152984213 @default.
- W3000042222 cites W2154652894 @default.
- W3000042222 cites W2164290393 @default.
- W3000042222 cites W2172226303 @default.
- W3000042222 cites W2176263492 @default.
- W3000042222 cites W2194775991 @default.
- W3000042222 cites W2250539671 @default.
- W3000042222 cites W2251353663 @default.
- W3000042222 cites W2425121537 @default.
- W3000042222 cites W2462996230 @default.
- W3000042222 cites W2471775118 @default.
- W3000042222 cites W2505728881 @default.
- W3000042222 cites W2539222059 @default.
- W3000042222 cites W2553594924 @default.
- W3000042222 cites W2554906389 @default.
- W3000042222 cites W2575842049 @default.
- W3000042222 cites W2584992898 @default.
- W3000042222 cites W2593116425 @default.
- W3000042222 cites W2604141702 @default.
- W3000042222 cites W2606212668 @default.
- W3000042222 cites W2607119937 @default.
- W3000042222 cites W2608022654 @default.
- W3000042222 cites W2613718673 @default.
- W3000042222 cites W2618799552 @default.
- W3000042222 cites W2619947201 @default.
- W3000042222 cites W2737766105 @default.
- W3000042222 cites W2740388348 @default.
- W3000042222 cites W2746726611 @default.
- W3000042222 cites W2755426660 @default.
- W3000042222 cites W2765658575 @default.
- W3000042222 cites W2766375149 @default.
- W3000042222 cites W2766520430 @default.
- W3000042222 cites W2767863271 @default.
- W3000042222 cites W2799261915 @default.
- W3000042222 cites W2948358897 @default.
- W3000042222 cites W2951390634 @default.
- W3000042222 cites W2962756039 @default.
- W3000042222 cites W2962865004 @default.
- W3000042222 cites W2962994439 @default.
- W3000042222 cites W2963084599 @default.
- W3000042222 cites W2963177403 @default.
- W3000042222 cites W2963403868 @default.
- W3000042222 cites W2963465031 @default.
- W3000042222 cites W2963524571 @default.
- W3000042222 cites W2963576560 @default.
- W3000042222 cites W2963758027 @default.
- W3000042222 cites W2963916161 @default.
- W3000042222 cites W2963971014 @default.
- W3000042222 cites W2964241990 @default.
- W3000042222 cites W2964308564 @default.
- W3000042222 cites W3123318516 @default.
- W3000042222 cites W648786980 @default.
- W3000042222 hasPublicationYear "2020" @default.
- W3000042222 type Work @default.
- W3000042222 sameAs 3000042222 @default.
- W3000042222 citedByCount "0" @default.
- W3000042222 crossrefType "posted-content" @default.
- W3000042222 hasAuthorship W3000042222A5001601327 @default.
- W3000042222 hasAuthorship W3000042222A5008369672 @default.
- W3000042222 hasAuthorship W3000042222A5024613828 @default.
- W3000042222 hasAuthorship W3000042222A5076036593 @default.
- W3000042222 hasConcept C115961682 @default.
- W3000042222 hasConcept C126042441 @default.
- W3000042222 hasConcept C137293760 @default.
- W3000042222 hasConcept C154945302 @default.
- W3000042222 hasConcept C157657479 @default.
- W3000042222 hasConcept C162324750 @default.
- W3000042222 hasConcept C185798385 @default.
- W3000042222 hasConcept C187736073 @default.
- W3000042222 hasConcept C189430467 @default.
- W3000042222 hasConcept C205649164 @default.
- W3000042222 hasConcept C2780451532 @default.
- W3000042222 hasConcept C28490314 @default.