Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313483475> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4313483475 abstract "Most existing text-video retrieval methods focus on cross-modal matching between the visual content of videos and textual query sentences. However, in real-world scenarios, online videos are often accompanied by relevant text information such as titles, tags, and even subtitles, which can be utilized to match textual queries. This insight has motivated us to propose a novel approach to text-video retrieval, where we directly generate associated captions from videos using zero-shot video captioning with knowledge from web-scale pre-trained models (e.g., CLIP and GPT-2). Given the generated captions, a natural question arises: what benefits do they bring to text-video retrieval? To answer this, we introduce Cap4Video, a new framework that leverages captions in three ways: i) Input data: video-caption pairs can augment the training data. ii) Intermediate feature interaction: we perform cross-modal feature interaction between the video and caption to produce enhanced video representations. iii) Output score: the Query-Caption matching branch can complement the original Query-Video matching branch for text-video retrieval. We conduct comprehensive ablation studies to demonstrate the effectiveness of our approach. Without any post-processing, Cap4Video achieves state-of-the-art performance on four standard text-video retrieval benchmarks: MSR-VTT (51.4%), VATEX (66.6%), MSVD (51.8%), and DiDeMo (52.0%). The code is available at https://github.com/whwu95/Cap4Video ." @default.
- W4313483475 created "2023-01-06" @default.
- W4313483475 creator A5043412497 @default.
- W4313483475 creator A5049895708 @default.
- W4313483475 creator A5075880303 @default.
- W4313483475 creator A5077878137 @default.
- W4313483475 creator A5087818121 @default.
- W4313483475 date "2022-12-31" @default.
- W4313483475 modified "2023-10-18" @default.
- W4313483475 title "Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?" @default.
- W4313483475 doi "https://doi.org/10.48550/arxiv.2301.00184" @default.
- W4313483475 hasPublicationYear "2022" @default.
- W4313483475 type Work @default.
- W4313483475 citedByCount "0" @default.
- W4313483475 crossrefType "posted-content" @default.
- W4313483475 hasAuthorship W4313483475A5043412497 @default.
- W4313483475 hasAuthorship W4313483475A5049895708 @default.
- W4313483475 hasAuthorship W4313483475A5075880303 @default.
- W4313483475 hasAuthorship W4313483475A5077878137 @default.
- W4313483475 hasAuthorship W4313483475A5087818121 @default.
- W4313483475 hasBestOaLocation W43134834751 @default.
- W4313483475 hasConcept C105795698 @default.
- W4313483475 hasConcept C115961682 @default.
- W4313483475 hasConcept C120665830 @default.
- W4313483475 hasConcept C121332964 @default.
- W4313483475 hasConcept C138885662 @default.
- W4313483475 hasConcept C154945302 @default.
- W4313483475 hasConcept C157657479 @default.
- W4313483475 hasConcept C165064840 @default.
- W4313483475 hasConcept C177264268 @default.
- W4313483475 hasConcept C192209626 @default.
- W4313483475 hasConcept C199360897 @default.
- W4313483475 hasConcept C23123220 @default.
- W4313483475 hasConcept C2776401178 @default.
- W4313483475 hasConcept C2776760102 @default.
- W4313483475 hasConcept C33923547 @default.
- W4313483475 hasConcept C41008148 @default.
- W4313483475 hasConcept C41895202 @default.
- W4313483475 hasConceptScore W4313483475C105795698 @default.
- W4313483475 hasConceptScore W4313483475C115961682 @default.
- W4313483475 hasConceptScore W4313483475C120665830 @default.
- W4313483475 hasConceptScore W4313483475C121332964 @default.
- W4313483475 hasConceptScore W4313483475C138885662 @default.
- W4313483475 hasConceptScore W4313483475C154945302 @default.
- W4313483475 hasConceptScore W4313483475C157657479 @default.
- W4313483475 hasConceptScore W4313483475C165064840 @default.
- W4313483475 hasConceptScore W4313483475C177264268 @default.
- W4313483475 hasConceptScore W4313483475C192209626 @default.
- W4313483475 hasConceptScore W4313483475C199360897 @default.
- W4313483475 hasConceptScore W4313483475C23123220 @default.
- W4313483475 hasConceptScore W4313483475C2776401178 @default.
- W4313483475 hasConceptScore W4313483475C2776760102 @default.
- W4313483475 hasConceptScore W4313483475C33923547 @default.
- W4313483475 hasConceptScore W4313483475C41008148 @default.
- W4313483475 hasConceptScore W4313483475C41895202 @default.
- W4313483475 hasLocation W43134834751 @default.
- W4313483475 hasOpenAccess W4313483475 @default.
- W4313483475 hasPrimaryLocation W43134834751 @default.
- W4313483475 hasRelatedWork W1488266984 @default.
- W4313483475 hasRelatedWork W2765546831 @default.
- W4313483475 hasRelatedWork W2786306966 @default.
- W4313483475 hasRelatedWork W2795359650 @default.
- W4313483475 hasRelatedWork W2809904748 @default.
- W4313483475 hasRelatedWork W3120997353 @default.
- W4313483475 hasRelatedWork W3126399839 @default.
- W4313483475 hasRelatedWork W4307856881 @default.
- W4313483475 hasRelatedWork W4320016076 @default.
- W4313483475 hasRelatedWork W563404 @default.
- W4313483475 isParatext "false" @default.
- W4313483475 isRetracted "false" @default.
- W4313483475 workType "article" @default.
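
The abstract above names an "output score" use of captions, in which a Query-Caption matching branch complements the Query-Video branch. The following is a minimal sketch of that idea only, not the authors' implementation. It assumes cosine similarity over precomputed embedding vectors and a simple weighted sum. The function names and the fusion weight `alpha` are hypothetical and do not come from the record.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fused_scores(query, videos, captions, alpha=0.5):
    """Combine query-video and query-caption similarity for each candidate.

    alpha is a hypothetical fusion weight: 1.0 uses only the video branch,
    0.0 uses only the caption branch.
    """
    return [
        alpha * cosine(query, v) + (1.0 - alpha) * cosine(query, c)
        for v, c in zip(videos, captions)
    ]


def rank(scores):
    """Return candidate indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    # Toy 2-D embeddings, invented purely for illustration.
    query = [1.0, 0.0]
    videos = [[1.0, 0.0], [0.0, 1.0]]
    captions = [[0.0, 1.0], [1.0, 0.0]]
    print(rank(fused_scores(query, videos, captions, alpha=1.0)))
    print(rank(fused_scores(query, videos, captions, alpha=0.0)))
```

In this toy setup the two branches disagree, so the ranking flips as `alpha` moves from 1.0 to 0.0. A weighted sum is one simple way to let one branch complement the other.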