Matches in SemOpenAlex for { <https://semopenalex.org/work/W3167366812> ?p ?o ?g. }
- W3167366812 abstract "Leveraging large-scale unlabeled web videos such as instructional videos for pre-training followed by task-specific finetuning has become the de facto approach for many video-and-language tasks. However, these instructional videos are very noisy, the accompanying ASR narrations are often incomplete, and can be irrelevant to or temporally misaligned with the visual content, limiting the performance of the models trained on such data. To address these issues, we propose an improved video-and-language pre-training method that first adds automatically-extracted dense region captions from the video frames as auxiliary text input, to provide informative visual cues for learning better video and language associations. Second, to alleviate the temporal misalignment issue, our method incorporates an entropy minimization-based constrained attention loss, to encourage the model to automatically focus on the correct caption from a pool of candidate ASR captions. Our overall approach is named DeCEMBERT (Dense Captions and Entropy Minimization). Comprehensive experiments on three video-and-language tasks (text-to-video retrieval, video captioning, and video question answering) across five datasets demonstrate that our approach outperforms previous state-of-the-art methods. Ablation studies on pre-training and downstream tasks show that adding dense captions and constrained attention loss help improve the model performance. Lastly, we also provide attention visualization to show the effect of applying the proposed constrained attention loss." @default.
- W3167366812 created "2021-06-22" @default.
- W3167366812 creator A5001987532 @default.
- W3167366812 creator A5007285444 @default.
- W3167366812 creator A5022885986 @default.
- W3167366812 date "2021-01-01" @default.
- W3167366812 modified "2023-09-30" @default.
- W3167366812 title "DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization" @default.
- W3167366812 cites W1889081078 @default.
- W3167366812 cites W1933349210 @default.
- W3167366812 cites W1956340063 @default.
- W3167366812 cites W1957706851 @default.
- W3167366812 cites W2101105183 @default.
- W3167366812 cites W2108598243 @default.
- W3167366812 cites W2133459682 @default.
- W3167366812 cites W2154652894 @default.
- W3167366812 cites W2194775991 @default.
- W3167366812 cites W2277195237 @default.
- W3167366812 cites W2398104528 @default.
- W3167366812 cites W2549139847 @default.
- W3167366812 cites W2552002300 @default.
- W3167366812 cites W2745461083 @default.
- W3167366812 cites W2765716052 @default.
- W3167366812 cites W2784025607 @default.
- W3167366812 cites W2885775891 @default.
- W3167366812 cites W2886641317 @default.
- W3167366812 cites W2905145027 @default.
- W3167366812 cites W2944815030 @default.
- W3167366812 cites W2948358897 @default.
- W3167366812 cites W2951390634 @default.
- W3167366812 cites W2952372688 @default.
- W3167366812 cites W2954199749 @default.
- W3167366812 cites W2962934715 @default.
- W3167366812 cites W2962949233 @default.
- W3167366812 cites W2963159690 @default.
- W3167366812 cites W2963310665 @default.
- W3167366812 cites W2963341956 @default.
- W3167366812 cites W2963351113 @default.
- W3167366812 cites W2963403868 @default.
- W3167366812 cites W2963530300 @default.
- W3167366812 cites W2963541336 @default.
- W3167366812 cites W2963748441 @default.
- W3167366812 cites W2963758027 @default.
- W3167366812 cites W2963846996 @default.
- W3167366812 cites W2964121744 @default.
- W3167366812 cites W2964274690 @default.
- W3167366812 cites W2964345792 @default.
- W3167366812 cites W2965373594 @default.
- W3167366812 cites W2967052791 @default.
- W3167366812 cites W2970231061 @default.
- W3167366812 cites W2970597249 @default.
- W3167366812 cites W2970608575 @default.
- W3167366812 cites W2971274815 @default.
- W3167366812 cites W2981851019 @default.
- W3167366812 cites W2984008963 @default.
- W3167366812 cites W2984862483 @default.
- W3167366812 cites W2988753485 @default.
- W3167366812 cites W2996035354 @default.
- W3167366812 cites W2996428491 @default.
- W3167366812 cites W2997591391 @default.
- W3167366812 cites W2998356391 @default.
- W3167366812 cites W3006320872 @default.
- W3167366812 cites W3009192917 @default.
- W3167366812 cites W3034188691 @default.
- W3167366812 cites W3034730770 @default.
- W3167366812 cites W3035167603 @default.
- W3167366812 cites W3035237998 @default.
- W3167366812 cites W3035265375 @default.
- W3167366812 cites W3035365026 @default.
- W3167366812 cites W3035635319 @default.
- W3167366812 cites W3045687178 @default.
- W3167366812 cites W3082274269 @default.
- W3167366812 cites W3104862079 @default.
- W3167366812 cites W3105479157 @default.
- W3167366812 cites W3115868806 @default.
- W3167366812 cites W3126464137 @default.
- W3167366812 cites W3168640669 @default.
- W3167366812 cites W3188447078 @default.
- W3167366812 doi "https://doi.org/10.18653/v1/2021.naacl-main.193" @default.
- W3167366812 hasPublicationYear "2021" @default.
- W3167366812 type Work @default.
- W3167366812 sameAs 3167366812 @default.
- W3167366812 citedByCount "22" @default.
- W3167366812 countsByYear W31673668122021 @default.
- W3167366812 countsByYear W31673668122022 @default.
- W3167366812 countsByYear W31673668122023 @default.
- W3167366812 crossrefType "proceedings-article" @default.
- W3167366812 hasAuthorship W3167366812A5001987532 @default.
- W3167366812 hasAuthorship W3167366812A5007285444 @default.
- W3167366812 hasAuthorship W3167366812A5022885986 @default.
- W3167366812 hasBestOaLocation W31673668121 @default.
- W3167366812 hasConcept C106301342 @default.
- W3167366812 hasConcept C115961682 @default.
- W3167366812 hasConcept C119857082 @default.
- W3167366812 hasConcept C121332964 @default.
- W3167366812 hasConcept C136764020 @default.
- W3167366812 hasConcept C137293760 @default.
- W3167366812 hasConcept C147764199 @default.
- W3167366812 hasConcept C154945302 @default.
- W3167366812 hasConcept C157657479 @default.