Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385570037> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W4385570037 abstract "Vision-language tasks, such as VQA, SNLI-VE, and VCR are challenging because they require the model’s reasoning ability to understand the semantics of the visual world and natural language. Supervised methods working for vision-language tasks have been well-studied. However, solving these tasks in a zero-shot setting is less explored. Since Contrastive Language-Image Pre-training (CLIP) has shown remarkable zero-shot performance on image-text matching, previous works utilized its strong zero-shot ability by converting vision-language tasks into an image-text matching problem, and they mainly consider global-level matching (e.g., the whole image or sentence). However, we find visual and textual fine-grained information, e.g., keywords in the sentence and objects in the image, can be fairly informative for semantics understanding. Inspired by this, we propose a unified framework to take advantage of the fine-grained information for zero-shot vision-language learning, covering multiple tasks such as VQA, SNLI-VE, and VCR. Our experiments show that our framework outperforms former zero-shot methods on VQA and achieves substantial improvement on SNLI-VE and VCR. Furthermore, our ablation studies confirm the effectiveness and generalizability of our proposed method." @default.
- W4385570037 created "2023-08-05" @default.
- W4385570037 creator A5008582408 @default.
- W4385570037 creator A5026256051 @default.
- W4385570037 creator A5073027533 @default.
- W4385570037 creator A5084047711 @default.
- W4385570037 creator A5084753767 @default.
- W4385570037 creator A5087096372 @default.
- W4385570037 date "2023-01-01" @default.
- W4385570037 modified "2023-10-16" @default.
- W4385570037 title "UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding" @default.
- W4385570037 doi "https://doi.org/10.18653/v1/2023.findings-acl.49" @default.
- W4385570037 hasPublicationYear "2023" @default.
- W4385570037 type Work @default.
- W4385570037 citedByCount "0" @default.
- W4385570037 crossrefType "proceedings-article" @default.
- W4385570037 hasAuthorship W4385570037A5008582408 @default.
- W4385570037 hasAuthorship W4385570037A5026256051 @default.
- W4385570037 hasAuthorship W4385570037A5073027533 @default.
- W4385570037 hasAuthorship W4385570037A5084047711 @default.
- W4385570037 hasAuthorship W4385570037A5084753767 @default.
- W4385570037 hasAuthorship W4385570037A5087096372 @default.
- W4385570037 hasBestOaLocation W43855700371 @default.
- W4385570037 hasConcept C105795698 @default.
- W4385570037 hasConcept C115961682 @default.
- W4385570037 hasConcept C138885662 @default.
- W4385570037 hasConcept C154945302 @default.
- W4385570037 hasConcept C165064840 @default.
- W4385570037 hasConcept C178790620 @default.
- W4385570037 hasConcept C184337299 @default.
- W4385570037 hasConcept C185592680 @default.
- W4385570037 hasConcept C195324797 @default.
- W4385570037 hasConcept C199360897 @default.
- W4385570037 hasConcept C204321447 @default.
- W4385570037 hasConcept C27158222 @default.
- W4385570037 hasConcept C2777530160 @default.
- W4385570037 hasConcept C2778344882 @default.
- W4385570037 hasConcept C2780813799 @default.
- W4385570037 hasConcept C31972630 @default.
- W4385570037 hasConcept C33923547 @default.
- W4385570037 hasConcept C41008148 @default.
- W4385570037 hasConcept C41895202 @default.
- W4385570037 hasConcept C44291984 @default.
- W4385570037 hasConceptScore W4385570037C105795698 @default.
- W4385570037 hasConceptScore W4385570037C115961682 @default.
- W4385570037 hasConceptScore W4385570037C138885662 @default.
- W4385570037 hasConceptScore W4385570037C154945302 @default.
- W4385570037 hasConceptScore W4385570037C165064840 @default.
- W4385570037 hasConceptScore W4385570037C178790620 @default.
- W4385570037 hasConceptScore W4385570037C184337299 @default.
- W4385570037 hasConceptScore W4385570037C185592680 @default.
- W4385570037 hasConceptScore W4385570037C195324797 @default.
- W4385570037 hasConceptScore W4385570037C199360897 @default.
- W4385570037 hasConceptScore W4385570037C204321447 @default.
- W4385570037 hasConceptScore W4385570037C27158222 @default.
- W4385570037 hasConceptScore W4385570037C2777530160 @default.
- W4385570037 hasConceptScore W4385570037C2778344882 @default.
- W4385570037 hasConceptScore W4385570037C2780813799 @default.
- W4385570037 hasConceptScore W4385570037C31972630 @default.
- W4385570037 hasConceptScore W4385570037C33923547 @default.
- W4385570037 hasConceptScore W4385570037C41008148 @default.
- W4385570037 hasConceptScore W4385570037C41895202 @default.
- W4385570037 hasConceptScore W4385570037C44291984 @default.
- W4385570037 hasLocation W43855700371 @default.
- W4385570037 hasOpenAccess W4385570037 @default.
- W4385570037 hasPrimaryLocation W43855700371 @default.
- W4385570037 hasRelatedWork W159132833 @default.
- W4385570037 hasRelatedWork W2033261979 @default.
- W4385570037 hasRelatedWork W2081764088 @default.
- W4385570037 hasRelatedWork W20999564 @default.
- W4385570037 hasRelatedWork W2391245565 @default.
- W4385570037 hasRelatedWork W2411652523 @default.
- W4385570037 hasRelatedWork W2579427813 @default.
- W4385570037 hasRelatedWork W2772769880 @default.
- W4385570037 hasRelatedWork W3120196704 @default.
- W4385570037 hasRelatedWork W4297803820 @default.
- W4385570037 isParatext "false" @default.
- W4385570037 isRetracted "false" @default.
- W4385570037 workType "article" @default.