Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386065554> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W4386065554 abstract "Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP model. Since training on a similar scale for videos is infeasible, recent approaches focus on the effective transfer of image-based CLIP to the video domain. In this pursuit, new parametric modules are added to learn temporal information and inter-frame relationships which require meticulous design efforts. Furthermore, when the resulting models are learned on videos, they tend to overfit on the given task distribution and lack in generalization aspect. This begs the following question: How to effectively transfer image-level CLIP representations to videos? In this work, we show that a simple Video Fine-tuned CLIP (ViFi-CLIP) baseline is generally sufficient to bridge the domain gap from images to videos. Our qualitative analysis illustrates that the frame-level processing from CLIP image-encoder followed by feature pooling and similarity matching with corresponding text embeddings helps in implicitly modeling the temporal cues within ViFi-CLIP. Such fine-tuning helps the model to focus on scene dynamics, moving objects and inter-object relationships. For low-data regimes where full fine-tuning is not viable, we propose a ‘bridge and prompt’ approach that first uses fine-tuning to bridge the domain gap and then learns prompts on language and vision side to adapt CLIP representations. We extensively evaluate this simple yet strong baseline on zero-shot, base-to-novel generalization, few-shot and fully supervised settings across five video benchmarks. Our code and pre-trained models are available at https://github.com/muzairkhattak/ViFi-CLIP." @default.
- W4386065554 created "2023-08-23" @default.
- W4386065554 creator A5012655817 @default.
- W4386065554 creator A5034370385 @default.
- W4386065554 creator A5064948724 @default.
- W4386065554 creator A5077014645 @default.
- W4386065554 creator A5079112987 @default.
- W4386065554 date "2023-06-01" @default.
- W4386065554 modified "2023-09-26" @default.
- W4386065554 title "Fine-tuned CLIP Models are Efficient Video Learners" @default.
- W4386065554 cites W1572666543 @default.
- W4386065554 cites W2126579184 @default.
- W4386065554 cites W2552383788 @default.
- W4386065554 cites W2625366777 @default.
- W4386065554 cites W2736596806 @default.
- W4386065554 cites W2740825418 @default.
- W4386065554 cites W2962914678 @default.
- W4386065554 cites W2963524571 @default.
- W4386065554 cites W2990503944 @default.
- W4386065554 cites W3035254087 @default.
- W4386065554 cites W3100093508 @default.
- W4386065554 cites W3159619744 @default.
- W4386065554 cites W4214612132 @default.
- W4386065554 cites W4214746887 @default.
- W4386065554 cites W4226058394 @default.
- W4386065554 cites W4312310776 @default.
- W4386065554 cites W4312560592 @default.
- W4386065554 cites W4312658081 @default.
- W4386065554 doi "https://doi.org/10.1109/cvpr52729.2023.00633" @default.
- W4386065554 hasPublicationYear "2023" @default.
- W4386065554 type Work @default.
- W4386065554 citedByCount "0" @default.
- W4386065554 crossrefType "proceedings-article" @default.
- W4386065554 hasAuthorship W4386065554A5012655817 @default.
- W4386065554 hasAuthorship W4386065554A5034370385 @default.
- W4386065554 hasAuthorship W4386065554A5064948724 @default.
- W4386065554 hasAuthorship W4386065554A5077014645 @default.
- W4386065554 hasAuthorship W4386065554A5079112987 @default.
- W4386065554 hasConcept C105795698 @default.
- W4386065554 hasConcept C111919701 @default.
- W4386065554 hasConcept C118505674 @default.
- W4386065554 hasConcept C120665830 @default.
- W4386065554 hasConcept C121332964 @default.
- W4386065554 hasConcept C126042441 @default.
- W4386065554 hasConcept C134306372 @default.
- W4386065554 hasConcept C138885662 @default.
- W4386065554 hasConcept C142575187 @default.
- W4386065554 hasConcept C150899416 @default.
- W4386065554 hasConcept C154945302 @default.
- W4386065554 hasConcept C165064840 @default.
- W4386065554 hasConcept C177148314 @default.
- W4386065554 hasConcept C192209626 @default.
- W4386065554 hasConcept C22019652 @default.
- W4386065554 hasConcept C2776401178 @default.
- W4386065554 hasConcept C31972630 @default.
- W4386065554 hasConcept C33923547 @default.
- W4386065554 hasConcept C41008148 @default.
- W4386065554 hasConcept C41895202 @default.
- W4386065554 hasConcept C50644808 @default.
- W4386065554 hasConcept C76155785 @default.
- W4386065554 hasConceptScore W4386065554C105795698 @default.
- W4386065554 hasConceptScore W4386065554C111919701 @default.
- W4386065554 hasConceptScore W4386065554C118505674 @default.
- W4386065554 hasConceptScore W4386065554C120665830 @default.
- W4386065554 hasConceptScore W4386065554C121332964 @default.
- W4386065554 hasConceptScore W4386065554C126042441 @default.
- W4386065554 hasConceptScore W4386065554C134306372 @default.
- W4386065554 hasConceptScore W4386065554C138885662 @default.
- W4386065554 hasConceptScore W4386065554C142575187 @default.
- W4386065554 hasConceptScore W4386065554C150899416 @default.
- W4386065554 hasConceptScore W4386065554C154945302 @default.
- W4386065554 hasConceptScore W4386065554C165064840 @default.
- W4386065554 hasConceptScore W4386065554C177148314 @default.
- W4386065554 hasConceptScore W4386065554C192209626 @default.
- W4386065554 hasConceptScore W4386065554C22019652 @default.
- W4386065554 hasConceptScore W4386065554C2776401178 @default.
- W4386065554 hasConceptScore W4386065554C31972630 @default.
- W4386065554 hasConceptScore W4386065554C33923547 @default.
- W4386065554 hasConceptScore W4386065554C41008148 @default.
- W4386065554 hasConceptScore W4386065554C41895202 @default.
- W4386065554 hasConceptScore W4386065554C50644808 @default.
- W4386065554 hasConceptScore W4386065554C76155785 @default.
- W4386065554 hasLocation W43860655541 @default.
- W4386065554 hasOpenAccess W4386065554 @default.
- W4386065554 hasPrimaryLocation W43860655541 @default.
- W4386065554 hasRelatedWork W1504288058 @default.
- W4386065554 hasRelatedWork W2048505601 @default.
- W4386065554 hasRelatedWork W2167293474 @default.
- W4386065554 hasRelatedWork W2331674254 @default.
- W4386065554 hasRelatedWork W2391245565 @default.
- W4386065554 hasRelatedWork W2577364290 @default.
- W4386065554 hasRelatedWork W2786306966 @default.
- W4386065554 hasRelatedWork W2990667865 @default.
- W4386065554 hasRelatedWork W3012393889 @default.
- W4386065554 hasRelatedWork W3042897387 @default.
- W4386065554 isParatext "false" @default.
- W4386065554 isRetracted "false" @default.
- W4386065554 workType "article" @default.
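
The abstract recorded above describes a pipeline of frame-level encoding, feature pooling, and similarity matching against text embeddings. The following is a minimal sketch of that idea in NumPy, not the authors' implementation. It assumes the pooling is a plain temporal average and that the similarity is cosine similarity. The function names `l2_normalize` and `video_text_similarity` are hypothetical, and the arrays stand in for the outputs of real image and text encoders.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps avoids division by zero)."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def video_text_similarity(frame_features, text_features):
    """Score one video against a set of text prompts.

    frame_features: (T, D) array, one embedding per frame
                    (stand-in for image-encoder outputs).
    text_features:  (C, D) array, one embedding per class prompt
                    (stand-in for text-encoder outputs).
    Returns a (C,) array of cosine similarities.
    """
    # Assumption: "feature pooling" is taken to be a mean over the T frames.
    video_feature = frame_features.mean(axis=0)          # (D,)
    video_feature = l2_normalize(video_feature)          # unit-length video embedding
    text_features = l2_normalize(text_features, axis=1)  # unit-length prompt embeddings
    return text_features @ video_feature                 # (C,) cosine similarities


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    text = rng.normal(size=(5, 16))                      # 5 hypothetical class prompts
    # 8 frames that are noisy copies of prompt 2's embedding.
    frames = text[2] + 0.1 * rng.normal(size=(8, 16))
    scores = video_text_similarity(frames, text)
    print("predicted class index:", int(np.argmax(scores)))
```

In this sketch the frames are processed independently and the only temporal operation is the pooling step, which mirrors the abstract's point that no additional temporal modules are introduced. A real setup would replace the random arrays with encoder outputs and would typically apply a learned temperature before a softmax over the scores.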