Matches in SemOpenAlex for { <https://semopenalex.org/work/W4310515270> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4310515270 abstract "This paper focuses on analyzing and improving the commonsense ability of recent popular vision-language (VL) models. Despite the great success, we observe that existing VL-models still lack commonsense knowledge/reasoning ability (e.g., Lemons are sour), which is a vital component towards artificial general intelligence. Through our analysis, we find one important reason is that existing large-scale VL datasets do not contain much commonsense knowledge, which motivates us to improve the commonsense of VL-models from the data perspective. Rather than collecting a new VL training dataset, we propose a more scalable strategy, i.e., Data Augmentation with kNowledge graph linearization for CommonsensE capability (DANCE). It can be viewed as one type of data augmentation technique, which can inject commonsense knowledge into existing VL datasets on the fly during training. More specifically, we leverage the commonsense knowledge graph (e.g., ConceptNet) and create variants of text description in VL datasets via bidirectional sub-graph sequentialization. For better commonsense evaluation, we further propose the first retrieval-based commonsense diagnostic benchmark. By conducting extensive experiments on some representative VL-models, we demonstrate that our DANCE technique is able to significantly improve the commonsense ability while maintaining the performance on vanilla retrieval tasks. The code and data are available at https://github.com/pleaseconnectwifi/DANCE" @default.
- W4310515270 created "2022-12-11" @default.
- W4310515270 creator A5011398750 @default.
- W4310515270 creator A5013972536 @default.
- W4310515270 creator A5023158092 @default.
- W4310515270 creator A5029523095 @default.
- W4310515270 creator A5046214153 @default.
- W4310515270 creator A5049664284 @default.
- W4310515270 creator A5070165835 @default.
- W4310515270 date "2022-11-29" @default.
- W4310515270 modified "2023-09-24" @default.
- W4310515270 title "Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles" @default.
- W4310515270 doi "https://doi.org/10.48550/arxiv.2211.16504" @default.
- W4310515270 hasPublicationYear "2022" @default.
- W4310515270 type Work @default.
- W4310515270 citedByCount "1" @default.
- W4310515270 countsByYear W43105152702023 @default.
- W4310515270 crossrefType "posted-content" @default.
- W4310515270 hasAuthorship W4310515270A5011398750 @default.
- W4310515270 hasAuthorship W4310515270A5013972536 @default.
- W4310515270 hasAuthorship W4310515270A5023158092 @default.
- W4310515270 hasAuthorship W4310515270A5029523095 @default.
- W4310515270 hasAuthorship W4310515270A5046214153 @default.
- W4310515270 hasAuthorship W4310515270A5049664284 @default.
- W4310515270 hasAuthorship W4310515270A5070165835 @default.
- W4310515270 hasBestOaLocation W43105152701 @default.
- W4310515270 hasConcept C132525143 @default.
- W4310515270 hasConcept C153083717 @default.
- W4310515270 hasConcept C154945302 @default.
- W4310515270 hasConcept C161301231 @default.
- W4310515270 hasConcept C193221554 @default.
- W4310515270 hasConcept C204321447 @default.
- W4310515270 hasConcept C2987255567 @default.
- W4310515270 hasConcept C30542707 @default.
- W4310515270 hasConcept C41008148 @default.
- W4310515270 hasConcept C48044578 @default.
- W4310515270 hasConcept C77088390 @default.
- W4310515270 hasConcept C80444323 @default.
- W4310515270 hasConceptScore W4310515270C132525143 @default.
- W4310515270 hasConceptScore W4310515270C153083717 @default.
- W4310515270 hasConceptScore W4310515270C154945302 @default.
- W4310515270 hasConceptScore W4310515270C161301231 @default.
- W4310515270 hasConceptScore W4310515270C193221554 @default.
- W4310515270 hasConceptScore W4310515270C204321447 @default.
- W4310515270 hasConceptScore W4310515270C2987255567 @default.
- W4310515270 hasConceptScore W4310515270C30542707 @default.
- W4310515270 hasConceptScore W4310515270C41008148 @default.
- W4310515270 hasConceptScore W4310515270C48044578 @default.
- W4310515270 hasConceptScore W4310515270C77088390 @default.
- W4310515270 hasConceptScore W4310515270C80444323 @default.
- W4310515270 hasLocation W43105152701 @default.
- W4310515270 hasLocation W43105152702 @default.
- W4310515270 hasOpenAccess W4310515270 @default.
- W4310515270 hasPrimaryLocation W43105152701 @default.
- W4310515270 hasRelatedWork W2950339735 @default.
- W4310515270 hasRelatedWork W2971986145 @default.
- W4310515270 hasRelatedWork W3035583586 @default.
- W4310515270 hasRelatedWork W3094328377 @default.
- W4310515270 hasRelatedWork W3160008796 @default.
- W4310515270 hasRelatedWork W3165066581 @default.
- W4310515270 hasRelatedWork W4287759503 @default.
- W4310515270 hasRelatedWork W4287850285 @default.
- W4310515270 hasRelatedWork W4312568808 @default.
- W4310515270 hasRelatedWork W4386075723 @default.
- W4310515270 isParatext "false" @default.
- W4310515270 isRetracted "false" @default.
- W4310515270 workType "article" @default.
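
The abstract listed above describes injecting commonsense knowledge into captions by linearizing knowledge-graph triples in both directions. The following is only a minimal sketch of that general idea, not the authors' DANCE implementation: the triples, the templates, and the function names `linearize` and `augment_caption` are all hypothetical, and the real method (see the repository linked in the abstract) may differ substantially.

```python
import random

# Hypothetical mini knowledge graph as (head, relation, tail) triples,
# loosely in the style of ConceptNet relations.
TRIPLES = [
    ("lemon", "HasProperty", "sour"),
    ("lemon", "IsA", "fruit"),
]

# Hypothetical text templates, one set per reading direction.
FORWARD = {
    "HasProperty": "{head} is {tail}",
    "IsA": "{head} is a kind of {tail}",
}
BACKWARD = {
    "HasProperty": "something {tail}, such as {head}",
    "IsA": "a {tail}, such as {head}",
}


def linearize(triple, forward=True):
    """Turn one triple into text, read head-to-tail or tail-to-head."""
    head, relation, tail = triple
    template = (FORWARD if forward else BACKWARD)[relation]
    return template.format(head=head, tail=tail)


def augment_caption(caption, triples, rng):
    """Append one linearized fact about an entity mentioned in the caption.

    A triple is read forward when its head appears in the caption and
    backward when its tail appears. Captions with no match are unchanged.
    """
    words = set(caption.lower().split())
    candidates = []
    for triple in triples:
        if triple[0] in words:
            candidates.append((triple, True))
        if triple[2] in words:
            candidates.append((triple, False))
    if not candidates:
        return caption
    triple, forward = rng.choice(candidates)
    return f"{caption}. {linearize(triple, forward)}"


if __name__ == "__main__":
    rng = random.Random(0)
    print(augment_caption("a lemon on a table", TRIPLES, rng))
    print(augment_caption("a dog on grass", TRIPLES, rng))
```

Because the augmentation is a pure function of the caption and a random generator, it could in principle be applied on the fly inside a data-loading loop, which is the property the abstract emphasizes. Matching here is by whitespace-separated words only, so punctuation attached to a word would prevent a match.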