Matches in SemOpenAlex for { <https://semopenalex.org/work/W3216280672> ?p ?o ?g. }
- W3216280672 abstract "For video captioning, pre-training and fine-tuning has become a de facto paradigm, where ImageNet Pre-training (INP) is usually used to help encode the video content, and a task-oriented network is fine-tuned from scratch to cope with caption generation. Comparing INP with the recently proposed CLIP (Contrastive Language-Image Pre-training), this paper investigates the potential deficiencies of INP for video captioning and explores the key to generating accurate descriptions. Specifically, our empirical study on INP vs. CLIP shows that INP makes video caption models tricky to capture attributes' semantics and sensitive to irrelevant background information. By contrast, CLIP's significant boost in caption quality highlights the importance of attribute-aware representation learning. We are thus motivated to introduce Dual Attribute Prediction, an auxiliary task requiring a video caption model to learn the correspondence between video content and attributes and the co-occurrence relations between attributes. Extensive experiments on benchmark datasets demonstrate that our approach enables better learning of attribute-aware representations, bringing consistent improvements on models with different architectures and decoding algorithms." @default.
- W3216280672 created "2021-12-06" @default.
- W3216280672 creator A5002795838 @default.
- W3216280672 creator A5043445607 @default.
- W3216280672 date "2021-11-30" @default.
- W3216280672 modified "2023-09-24" @default.
- W3216280672 title "CLIP Meets Video Captioners: Attribute-Aware Representation Learning Promotes Accurate Captioning" @default.
- W3216280672 cites W1889081078 @default.
- W3216280672 cites W1956340063 @default.
- W3216280672 cites W2016589492 @default.
- W3216280672 cites W2101105183 @default.
- W3216280672 cites W2108598243 @default.
- W3216280672 cites W2123301721 @default.
- W3216280672 cites W2139501017 @default.
- W3216280672 cites W2154652894 @default.
- W3216280672 cites W2164290393 @default.
- W3216280672 cites W2166010828 @default.
- W3216280672 cites W2187089797 @default.
- W3216280672 cites W2194775991 @default.
- W3216280672 cites W2425121537 @default.
- W3216280672 cites W2526050071 @default.
- W3216280672 cites W2565656701 @default.
- W3216280672 cites W2593116425 @default.
- W3216280672 cites W2619947201 @default.
- W3216280672 cites W2745461083 @default.
- W3216280672 cites W2752191396 @default.
- W3216280672 cites W2914746235 @default.
- W3216280672 cites W2962858109 @default.
- W3216280672 cites W2962934715 @default.
- W3216280672 cites W2963299217 @default.
- W3216280672 cites W2963351448 @default.
- W3216280672 cites W2963403868 @default.
- W3216280672 cites W2963552819 @default.
- W3216280672 cites W2964065937 @default.
- W3216280672 cites W2964121744 @default.
- W3216280672 cites W2964241990 @default.
- W3216280672 cites W2968880719 @default.
- W3216280672 cites W2970231061 @default.
- W3216280672 cites W2970608575 @default.
- W3216280672 cites W2981851019 @default.
- W3216280672 cites W2991391304 @default.
- W3216280672 cites W2998356391 @default.
- W3216280672 cites W3003478339 @default.
- W3216280672 cites W3035160838 @default.
- W3216280672 cites W3035265375 @default.
- W3216280672 cites W3035365026 @default.
- W3216280672 cites W3035392611 @default.
- W3216280672 cites W3038476992 @default.
- W3216280672 cites W3090449556 @default.
- W3216280672 cites W3091588028 @default.
- W3216280672 cites W3105232955 @default.
- W3216280672 cites W3109931228 @default.
- W3216280672 cites W3119786062 @default.
- W3216280672 cites W3133825286 @default.
- W3216280672 cites W3152798676 @default.
- W3216280672 cites W3159875059 @default.
- W3216280672 cites W3166396011 @default.
- W3216280672 cites W3168640669 @default.
- W3216280672 cites W3170928047 @default.
- W3216280672 cites W3174441232 @default.
- W3216280672 cites W3176689360 @default.
- W3216280672 cites W3181158454 @default.
- W3216280672 cites W3182683290 @default.
- W3216280672 cites W3202195611 @default.
- W3216280672 hasPublicationYear "2021" @default.
- W3216280672 type Work @default.
- W3216280672 sameAs 3216280672 @default.
- W3216280672 citedByCount "0" @default.
- W3216280672 crossrefType "posted-content" @default.
- W3216280672 hasAuthorship W3216280672A5002795838 @default.
- W3216280672 hasAuthorship W3216280672A5043445607 @default.
- W3216280672 hasConcept C104317684 @default.
- W3216280672 hasConcept C111919701 @default.
- W3216280672 hasConcept C115961682 @default.
- W3216280672 hasConcept C13280743 @default.
- W3216280672 hasConcept C154945302 @default.
- W3216280672 hasConcept C157657479 @default.
- W3216280672 hasConcept C162324750 @default.
- W3216280672 hasConcept C17744445 @default.
- W3216280672 hasConcept C184337299 @default.
- W3216280672 hasConcept C185592680 @default.
- W3216280672 hasConcept C185798385 @default.
- W3216280672 hasConcept C187736073 @default.
- W3216280672 hasConcept C199360897 @default.
- W3216280672 hasConcept C199539241 @default.
- W3216280672 hasConcept C204321447 @default.
- W3216280672 hasConcept C205649164 @default.
- W3216280672 hasConcept C26517878 @default.
- W3216280672 hasConcept C2776359362 @default.
- W3216280672 hasConcept C2776502983 @default.
- W3216280672 hasConcept C2780451532 @default.
- W3216280672 hasConcept C2781235140 @default.
- W3216280672 hasConcept C38652104 @default.
- W3216280672 hasConcept C41008148 @default.
- W3216280672 hasConcept C49774154 @default.
- W3216280672 hasConcept C55493867 @default.
- W3216280672 hasConcept C57273362 @default.
- W3216280672 hasConcept C66746571 @default.
- W3216280672 hasConcept C76155785 @default.
- W3216280672 hasConcept C94625758 @default.
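
The rows above all follow the shape "- subject predicate object @default.", so they can be grouped by predicate for easier reading. The following is a minimal sketch in Python, assuming that row shape holds for every line; the names ROW, parse_rows and group_by_predicate are hypothetical helpers introduced here for illustration and are not part of SemOpenAlex.

```python
import re
from collections import defaultdict

# Assumed row shape: "- <subject> <predicate> <object> @default."
# (hypothetical pattern inferred from the listing, not an official format)
ROW = re.compile(r'^- (\S+) (\S+) (.+) @default\.$')


def parse_rows(lines):
    """Return (subject, predicate, object) tuples; lines that do not match are skipped."""
    triples = []
    for line in lines:
        match = ROW.match(line.strip())
        if match:
            subj, pred, obj = match.groups()
            # Drop the surrounding double quotes of literal values, if any.
            triples.append((subj, pred, obj.strip('"')))
    return triples


def group_by_predicate(triples):
    """Map each predicate to the list of its objects, in listing order."""
    grouped = defaultdict(list)
    for _, pred, obj in triples:
        grouped[pred].append(obj)
    return dict(grouped)


# A few rows copied from the listing above.
sample = [
    '- W3216280672 created "2021-12-06" @default.',
    '- W3216280672 cites W1889081078 @default.',
    '- W3216280672 cites W1956340063 @default.',
    '- W3216280672 type Work @default.',
]
grouped = group_by_predicate(parse_rows(sample))
print(grouped["cites"])
```

Grouping this way keeps the long runs of cites and hasConcept rows together under a single key, which may be more convenient than scanning the flat listing.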