Matches in SemOpenAlex for { <https://semopenalex.org/work/W4380993957> ?p ?o ?g. }
Showing items 1 to 55 of 55 with 100 items per page.
- W4380993957 abstract "Current Vision and Language Models (VLMs) demonstrate strong performance across various vision-language tasks, yet they struggle with fine-grained understanding. This issue stems from weak image-caption alignment in pretraining datasets and a simplified contrastive objective that fails to distinguish nuanced grounding elements such as relations, actions, and attributes. As a result, the models tend to learn bag-of-words representations. To mitigate these challenges, we introduce an intra-modal contrastive loss and a unique cross-modal rank loss with an adaptive threshold that serves as curriculum learning, utilizing our automatically generated hard negatives to augment the model's capacity. Our strategy, which does not necessitate additional annotations or parameters, can be incorporated into any VLM trained with an image-text contrastive loss. Upon application to CLIP, our method leads to significant improvements on four fine-grained benchmarks, and it also enhances the performance of X-VLM, which is the state-of-the-art model on fine-grained reasoning." @default.
- W4380993957 created "2023-06-17" @default.
- W4380993957 creator A5021019207 @default.
- W4380993957 creator A5023174812 @default.
- W4380993957 creator A5063960231 @default.
- W4380993957 date "2023-06-14" @default.
- W4380993957 modified "2023-09-27" @default.
- W4380993957 title "Contrasting Intra-Modal and Ranking Cross-Modal Hard Negatives to Enhance Visio-Linguistic Fine-grained Understanding" @default.
- W4380993957 doi "https://doi.org/10.48550/arxiv.2306.08832" @default.
- W4380993957 hasPublicationYear "2023" @default.
- W4380993957 type Work @default.
- W4380993957 citedByCount "0" @default.
- W4380993957 crossrefType "posted-content" @default.
- W4380993957 hasAuthorship W4380993957A5021019207 @default.
- W4380993957 hasAuthorship W4380993957A5023174812 @default.
- W4380993957 hasAuthorship W4380993957A5063960231 @default.
- W4380993957 hasBestOaLocation W43809939571 @default.
- W4380993957 hasConcept C114614502 @default.
- W4380993957 hasConcept C115961682 @default.
- W4380993957 hasConcept C154945302 @default.
- W4380993957 hasConcept C164226766 @default.
- W4380993957 hasConcept C185592680 @default.
- W4380993957 hasConcept C188027245 @default.
- W4380993957 hasConcept C189430467 @default.
- W4380993957 hasConcept C204321447 @default.
- W4380993957 hasConcept C33923547 @default.
- W4380993957 hasConcept C41008148 @default.
- W4380993957 hasConcept C71139939 @default.
- W4380993957 hasConceptScore W4380993957C114614502 @default.
- W4380993957 hasConceptScore W4380993957C115961682 @default.
- W4380993957 hasConceptScore W4380993957C154945302 @default.
- W4380993957 hasConceptScore W4380993957C164226766 @default.
- W4380993957 hasConceptScore W4380993957C185592680 @default.
- W4380993957 hasConceptScore W4380993957C188027245 @default.
- W4380993957 hasConceptScore W4380993957C189430467 @default.
- W4380993957 hasConceptScore W4380993957C204321447 @default.
- W4380993957 hasConceptScore W4380993957C33923547 @default.
- W4380993957 hasConceptScore W4380993957C41008148 @default.
- W4380993957 hasConceptScore W4380993957C71139939 @default.
- W4380993957 hasLocation W43809939571 @default.
- W4380993957 hasOpenAccess W4380993957 @default.
- W4380993957 hasPrimaryLocation W43809939571 @default.
- W4380993957 hasRelatedWork W1949085824 @default.
- W4380993957 hasRelatedWork W2138279922 @default.
- W4380993957 hasRelatedWork W2368651715 @default.
- W4380993957 hasRelatedWork W2611614995 @default.
- W4380993957 hasRelatedWork W2789919619 @default.
- W4380993957 hasRelatedWork W2949900540 @default.
- W4380993957 hasRelatedWork W2950017439 @default.
- W4380993957 hasRelatedWork W3107474891 @default.
- W4380993957 hasRelatedWork W3160516639 @default.
- W4380993957 hasRelatedWork W4233300542 @default.
- W4380993957 isParatext "false" @default.
- W4380993957 isRetracted "false" @default.
- W4380993957 workType "article" @default.