Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386075561> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W4386075561 abstract "Open-vocabulary semantic segmentation aims to segment an image into semantic regions according to text descriptions, which may not have been seen during training. Recent two-stage methods first generate class-agnostic mask proposals and then leverage pre-trained vision-language models, e.g., CLIP, to classify masked regions. We identify the performance bottleneck of this paradigm to be the pre-trained CLIP model, since it does not perform well on masked images. To address this, we propose to finetune CLIP on a collection of masked image regions and their corresponding text descriptions. We collect training data by mining an existing image-caption dataset (e.g., COCO Captions), using CLIP to match masked image regions to nouns in the image captions. Compared with the more precise and manually annotated segmentation labels with fixed classes (e.g., COCO-Stuff), we find our noisy but diverse dataset can better retain CLIP's generalization ability. Along with finetuning the entire model, we utilize the “blank” areas in masked images using a method we dub mask prompt tuning. Experiments demonstrate mask prompt tuning brings significant improvement without modifying any weights of CLIP, and it can further improve a fully finetuned model. In particular, when trained on COCO and evaluated on ADE20K-150, our best model achieves 29.6% mIoU, which is +8.5% higher than the previous state-of-the-art. For the first time, open-vocabulary generalist models match the performance of supervised specialist models in 2017 without dataset specific adaptations." @default.
- W4386075561 created "2023-08-23" @default.
- W4386075561 creator A5015536490 @default.
- W4386075561 creator A5027023070 @default.
- W4386075561 creator A5037049534 @default.
- W4386075561 creator A5057613852 @default.
- W4386075561 creator A5065985595 @default.
- W4386075561 creator A5073512879 @default.
- W4386075561 creator A5074743331 @default.
- W4386075561 creator A5080137972 @default.
- W4386075561 creator A5081844561 @default.
- W4386075561 date "2023-06-01" @default.
- W4386075561 modified "2023-09-30" @default.
- W4386075561 title "Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP" @default.
- W4386075561 cites W1903029394 @default.
- W4386075561 cites W2031489346 @default.
- W4386075561 cites W2125215748 @default.
- W4386075561 cites W2156406284 @default.
- W4386075561 cites W2412782625 @default.
- W4386075561 cites W2507296351 @default.
- W4386075561 cites W2560023338 @default.
- W4386075561 cites W2561196672 @default.
- W4386075561 cites W2924485953 @default.
- W4386075561 cites W3138516171 @default.
- W4386075561 cites W3198377975 @default.
- W4386075561 cites W4214926101 @default.
- W4386075561 cites W4312563428 @default.
- W4386075561 cites W4312980231 @default.
- W4386075561 doi "https://doi.org/10.1109/cvpr52729.2023.00682" @default.
- W4386075561 hasPublicationYear "2023" @default.
- W4386075561 type Work @default.
- W4386075561 citedByCount "3" @default.
- W4386075561 countsByYear W43860755612023 @default.
- W4386075561 crossrefType "proceedings-article" @default.
- W4386075561 hasAuthorship W4386075561A5015536490 @default.
- W4386075561 hasAuthorship W4386075561A5027023070 @default.
- W4386075561 hasAuthorship W4386075561A5037049534 @default.
- W4386075561 hasAuthorship W4386075561A5057613852 @default.
- W4386075561 hasAuthorship W4386075561A5065985595 @default.
- W4386075561 hasAuthorship W4386075561A5073512879 @default.
- W4386075561 hasAuthorship W4386075561A5074743331 @default.
- W4386075561 hasAuthorship W4386075561A5080137972 @default.
- W4386075561 hasAuthorship W4386075561A5081844561 @default.
- W4386075561 hasConcept C115961682 @default.
- W4386075561 hasConcept C119857082 @default.
- W4386075561 hasConcept C134306372 @default.
- W4386075561 hasConcept C137293760 @default.
- W4386075561 hasConcept C138885662 @default.
- W4386075561 hasConcept C149635348 @default.
- W4386075561 hasConcept C153083717 @default.
- W4386075561 hasConcept C153180895 @default.
- W4386075561 hasConcept C154945302 @default.
- W4386075561 hasConcept C177148314 @default.
- W4386075561 hasConcept C204321447 @default.
- W4386075561 hasConcept C2777601683 @default.
- W4386075561 hasConcept C2780513914 @default.
- W4386075561 hasConcept C33923547 @default.
- W4386075561 hasConcept C41008148 @default.
- W4386075561 hasConcept C41895202 @default.
- W4386075561 hasConcept C89600930 @default.
- W4386075561 hasConceptScore W4386075561C115961682 @default.
- W4386075561 hasConceptScore W4386075561C119857082 @default.
- W4386075561 hasConceptScore W4386075561C134306372 @default.
- W4386075561 hasConceptScore W4386075561C137293760 @default.
- W4386075561 hasConceptScore W4386075561C138885662 @default.
- W4386075561 hasConceptScore W4386075561C149635348 @default.
- W4386075561 hasConceptScore W4386075561C153083717 @default.
- W4386075561 hasConceptScore W4386075561C153180895 @default.
- W4386075561 hasConceptScore W4386075561C154945302 @default.
- W4386075561 hasConceptScore W4386075561C177148314 @default.
- W4386075561 hasConceptScore W4386075561C204321447 @default.
- W4386075561 hasConceptScore W4386075561C2777601683 @default.
- W4386075561 hasConceptScore W4386075561C2780513914 @default.
- W4386075561 hasConceptScore W4386075561C33923547 @default.
- W4386075561 hasConceptScore W4386075561C41008148 @default.
- W4386075561 hasConceptScore W4386075561C41895202 @default.
- W4386075561 hasConceptScore W4386075561C89600930 @default.
- W4386075561 hasLocation W43860755611 @default.
- W4386075561 hasOpenAccess W4386075561 @default.
- W4386075561 hasPrimaryLocation W43860755611 @default.
- W4386075561 hasRelatedWork W2354251581 @default.
- W4386075561 hasRelatedWork W2357461155 @default.
- W4386075561 hasRelatedWork W2359001871 @default.
- W4386075561 hasRelatedWork W2384129116 @default.
- W4386075561 hasRelatedWork W2603501458 @default.
- W4386075561 hasRelatedWork W2961085424 @default.
- W4386075561 hasRelatedWork W3145924829 @default.
- W4386075561 hasRelatedWork W4300659434 @default.
- W4386075561 hasRelatedWork W4385474580 @default.
- W4386075561 hasRelatedWork W1986205833 @default.
- W4386075561 isParatext "false" @default.
- W4386075561 isRetracted "false" @default.
- W4386075561 workType "article" @default.