Matches in SemOpenAlex for { <https://semopenalex.org/work/W3197026988> ?p ?o ?g. }
- W3197026988 abstract "Video captioning is an essential technology to understand scenes and describe events in natural language. To apply it to real-time monitoring, a system needs not only to describe events accurately but also to produce the captions as soon as possible. Low-latency captioning is needed to realize such functionality, but this research area for online video captioning has not been pursued yet. This paper proposes a novel approach to optimize each caption's output timing based on a trade-off between latency and caption quality. An audio-visual Transformer is trained to generate ground-truth captions using only a small portion of all video frames, and to mimic outputs of a pre-trained Transformer to which all the frames are given. A CNN-based timing detector is also trained to detect a proper output timing, where the captions generated by the two Transformers become sufficiently close to each other. With the jointly trained Transformer and timing detector, a caption can be generated in the early stages of an event-triggered video clip, as soon as an event happens or when it can be forecasted. Experiments with the ActivityNet Captions dataset show that our approach achieves 94% of the caption quality of the upper bound given by the pre-trained Transformer using the entire video clips, using only 28% of frames from the beginning." @default.
- W3197026988 created "2021-09-13" @default.
- W3197026988 creator A5001601327 @default.
- W3197026988 creator A5076453358 @default.
- W3197026988 creator A5087554069 @default.
- W3197026988 date "2021-08-30" @default.
- W3197026988 modified "2023-10-18" @default.
- W3197026988 title "Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers" @default.
- W3197026988 cites W1573040851 @default.
- W3197026988 cites W1586939924 @default.
- W3197026988 cites W1893116441 @default.
- W3197026988 cites W2139501017 @default.
- W3197026988 cites W2142900973 @default.
- W3197026988 cites W2419292002 @default.
- W3197026988 cites W2425121537 @default.
- W3197026988 cites W2490414731 @default.
- W3197026988 cites W2529548870 @default.
- W3197026988 cites W2584992898 @default.
- W3197026988 cites W2810643877 @default.
- W3197026988 cites W2883910824 @default.
- W3197026988 cites W2887005286 @default.
- W3197026988 cites W2892009249 @default.
- W3197026988 cites W2963403868 @default.
- W3197026988 cites W2963524571 @default.
- W3197026988 cites W2963576560 @default.
- W3197026988 cites W2963920996 @default.
- W3197026988 cites W2963971014 @default.
- W3197026988 cites W2964078338 @default.
- W3197026988 cites W2964213933 @default.
- W3197026988 cites W2964241990 @default.
- W3197026988 cites W2964308564 @default.
- W3197026988 cites W2972818416 @default.
- W3197026988 cites W3015583403 @default.
- W3197026988 cites W3015927303 @default.
- W3197026988 cites W3025796084 @default.
- W3197026988 cites W3096524176 @default.
- W3197026988 cites W3119777426 @default.
- W3197026988 doi "https://doi.org/10.21437/interspeech.2021-1975" @default.
- W3197026988 hasPublicationYear "2021" @default.
- W3197026988 type Work @default.
- W3197026988 sameAs 3197026988 @default.
- W3197026988 citedByCount "0" @default.
- W3197026988 crossrefType "proceedings-article" @default.
- W3197026988 hasAuthorship W3197026988A5001601327 @default.
- W3197026988 hasAuthorship W3197026988A5076453358 @default.
- W3197026988 hasAuthorship W3197026988A5087554069 @default.
- W3197026988 hasBestOaLocation W31970269882 @default.
- W3197026988 hasConcept C115961682 @default.
- W3197026988 hasConcept C119599485 @default.
- W3197026988 hasConcept C121332964 @default.
- W3197026988 hasConcept C127413603 @default.
- W3197026988 hasConcept C154945302 @default.
- W3197026988 hasConcept C157657479 @default.
- W3197026988 hasConcept C165801399 @default.
- W3197026988 hasConcept C2779662365 @default.
- W3197026988 hasConcept C28490314 @default.
- W3197026988 hasConcept C31972630 @default.
- W3197026988 hasConcept C41008148 @default.
- W3197026988 hasConcept C62520636 @default.
- W3197026988 hasConcept C66322947 @default.
- W3197026988 hasConcept C76155785 @default.
- W3197026988 hasConcept C79403827 @default.
- W3197026988 hasConcept C82876162 @default.
- W3197026988 hasConcept C94915269 @default.
- W3197026988 hasConceptScore W3197026988C115961682 @default.
- W3197026988 hasConceptScore W3197026988C119599485 @default.
- W3197026988 hasConceptScore W3197026988C121332964 @default.
- W3197026988 hasConceptScore W3197026988C127413603 @default.
- W3197026988 hasConceptScore W3197026988C154945302 @default.
- W3197026988 hasConceptScore W3197026988C157657479 @default.
- W3197026988 hasConceptScore W3197026988C165801399 @default.
- W3197026988 hasConceptScore W3197026988C2779662365 @default.
- W3197026988 hasConceptScore W3197026988C28490314 @default.
- W3197026988 hasConceptScore W3197026988C31972630 @default.
- W3197026988 hasConceptScore W3197026988C41008148 @default.
- W3197026988 hasConceptScore W3197026988C62520636 @default.
- W3197026988 hasConceptScore W3197026988C66322947 @default.
- W3197026988 hasConceptScore W3197026988C76155785 @default.
- W3197026988 hasConceptScore W3197026988C79403827 @default.
- W3197026988 hasConceptScore W3197026988C82876162 @default.
- W3197026988 hasConceptScore W3197026988C94915269 @default.
- W3197026988 hasLocation W31970269881 @default.
- W3197026988 hasLocation W31970269882 @default.
- W3197026988 hasOpenAccess W3197026988 @default.
- W3197026988 hasPrimaryLocation W31970269881 @default.
- W3197026988 hasRelatedWork W10342353 @default.
- W3197026988 hasRelatedWork W2305952 @default.
- W3197026988 hasRelatedWork W3675094 @default.
- W3197026988 hasRelatedWork W4017687 @default.
- W3197026988 hasRelatedWork W4281120 @default.
- W3197026988 hasRelatedWork W5438179 @default.
- W3197026988 hasRelatedWork W743434 @default.
- W3197026988 hasRelatedWork W9122165 @default.
- W3197026988 hasRelatedWork W9767775 @default.
- W3197026988 hasRelatedWork W1587087 @default.
- W3197026988 isParatext "false" @default.
- W3197026988 isRetracted "false" @default.
- W3197026988 magId "3197026988" @default.
- W3197026988 workType "article" @default.