Matches in SemOpenAlex for { <https://semopenalex.org/work/W4309398427> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W4309398427 abstract "Masked image modeling (MIM) learns visual representation by masking and reconstructing image patches. Applying the reconstruction supervision on the CLIP representation has been proven effective for MIM. However, it is still under-explored how CLIP supervision in MIM influences performance. To investigate strategies for refining the CLIP-targeted MIM, we study two critical elements in MIM, i.e., the supervision position and the mask ratio, and reveal two interesting perspectives, relying on our developed simple pipeline, context autodecoder with CLIP target (CAE v2). Firstly, we observe that the supervision on visible patches achieves remarkable performance, even better than that on masked patches, where the latter is the standard format in the existing MIM methods. Secondly, the optimal mask ratio positively correlates to the model size. That is to say, the smaller the model, the lower the mask ratio needs to be. Driven by these two discoveries, our simple and concise approach CAE v2 achieves superior performance on a series of downstream tasks. For example, a vanilla ViT-Large model achieves 81.7% and 86.7% top-1 accuracy on linear probing and fine-tuning on ImageNet-1K, and 55.9% mIoU on semantic segmentation on ADE20K with the pre-training for 300 epochs. We hope our findings can be helpful guidelines for the pre-training in the MIM area, especially for the small-scale models." @default.
- W4309398427 created "2022-11-26" @default.
- W4309398427 creator A5002323224 @default.
- W4309398427 creator A5008391873 @default.
- W4309398427 creator A5015595596 @default.
- W4309398427 creator A5032764121 @default.
- W4309398427 creator A5045958583 @default.
- W4309398427 creator A5046173951 @default.
- W4309398427 creator A5049344673 @default.
- W4309398427 creator A5050031109 @default.
- W4309398427 creator A5068877759 @default.
- W4309398427 creator A5072506429 @default.
- W4309398427 creator A5073212256 @default.
- W4309398427 creator A5075880303 @default.
- W4309398427 creator A5079072497 @default.
- W4309398427 date "2022-11-17" @default.
- W4309398427 modified "2023-09-27" @default.
- W4309398427 title "CAE v2: Context Autoencoder with CLIP Target" @default.
- W4309398427 doi "https://doi.org/10.48550/arxiv.2211.09799" @default.
- W4309398427 hasPublicationYear "2022" @default.
- W4309398427 type Work @default.
- W4309398427 citedByCount "0" @default.
- W4309398427 crossrefType "posted-content" @default.
- W4309398427 hasAuthorship W4309398427A5002323224 @default.
- W4309398427 hasAuthorship W4309398427A5008391873 @default.
- W4309398427 hasAuthorship W4309398427A5015595596 @default.
- W4309398427 hasAuthorship W4309398427A5032764121 @default.
- W4309398427 hasAuthorship W4309398427A5045958583 @default.
- W4309398427 hasAuthorship W4309398427A5046173951 @default.
- W4309398427 hasAuthorship W4309398427A5049344673 @default.
- W4309398427 hasAuthorship W4309398427A5050031109 @default.
- W4309398427 hasAuthorship W4309398427A5068877759 @default.
- W4309398427 hasAuthorship W4309398427A5072506429 @default.
- W4309398427 hasAuthorship W4309398427A5073212256 @default.
- W4309398427 hasAuthorship W4309398427A5075880303 @default.
- W4309398427 hasAuthorship W4309398427A5079072497 @default.
- W4309398427 hasBestOaLocation W43093984271 @default.
- W4309398427 hasConcept C101738243 @default.
- W4309398427 hasConcept C108583219 @default.
- W4309398427 hasConcept C111472728 @default.
- W4309398427 hasConcept C138885662 @default.
- W4309398427 hasConcept C142362112 @default.
- W4309398427 hasConcept C151730666 @default.
- W4309398427 hasConcept C153180895 @default.
- W4309398427 hasConcept C153349607 @default.
- W4309398427 hasConcept C154945302 @default.
- W4309398427 hasConcept C17744445 @default.
- W4309398427 hasConcept C199360897 @default.
- W4309398427 hasConcept C199539241 @default.
- W4309398427 hasConcept C2776359362 @default.
- W4309398427 hasConcept C2777402240 @default.
- W4309398427 hasConcept C2779343474 @default.
- W4309398427 hasConcept C2780586882 @default.
- W4309398427 hasConcept C31972630 @default.
- W4309398427 hasConcept C41008148 @default.
- W4309398427 hasConcept C43521106 @default.
- W4309398427 hasConcept C86803240 @default.
- W4309398427 hasConcept C89600930 @default.
- W4309398427 hasConcept C94625758 @default.
- W4309398427 hasConceptScore W4309398427C101738243 @default.
- W4309398427 hasConceptScore W4309398427C108583219 @default.
- W4309398427 hasConceptScore W4309398427C111472728 @default.
- W4309398427 hasConceptScore W4309398427C138885662 @default.
- W4309398427 hasConceptScore W4309398427C142362112 @default.
- W4309398427 hasConceptScore W4309398427C151730666 @default.
- W4309398427 hasConceptScore W4309398427C153180895 @default.
- W4309398427 hasConceptScore W4309398427C153349607 @default.
- W4309398427 hasConceptScore W4309398427C154945302 @default.
- W4309398427 hasConceptScore W4309398427C17744445 @default.
- W4309398427 hasConceptScore W4309398427C199360897 @default.
- W4309398427 hasConceptScore W4309398427C199539241 @default.
- W4309398427 hasConceptScore W4309398427C2776359362 @default.
- W4309398427 hasConceptScore W4309398427C2777402240 @default.
- W4309398427 hasConceptScore W4309398427C2779343474 @default.
- W4309398427 hasConceptScore W4309398427C2780586882 @default.
- W4309398427 hasConceptScore W4309398427C31972630 @default.
- W4309398427 hasConceptScore W4309398427C41008148 @default.
- W4309398427 hasConceptScore W4309398427C43521106 @default.
- W4309398427 hasConceptScore W4309398427C86803240 @default.
- W4309398427 hasConceptScore W4309398427C89600930 @default.
- W4309398427 hasConceptScore W4309398427C94625758 @default.
- W4309398427 hasLocation W43093984271 @default.
- W4309398427 hasLocation W43093984272 @default.
- W4309398427 hasOpenAccess W4309398427 @default.
- W4309398427 hasPrimaryLocation W43093984271 @default.
- W4309398427 hasRelatedWork W1669643531 @default.
- W4309398427 hasRelatedWork W2005437358 @default.
- W4309398427 hasRelatedWork W2008656436 @default.
- W4309398427 hasRelatedWork W2023558673 @default.
- W4309398427 hasRelatedWork W2134924024 @default.
- W4309398427 hasRelatedWork W2292254049 @default.
- W4309398427 hasRelatedWork W2517104666 @default.
- W4309398427 hasRelatedWork W2897995864 @default.
- W4309398427 hasRelatedWork W2998168123 @default.
- W4309398427 hasRelatedWork W4287995534 @default.
- W4309398427 isParatext "false" @default.
- W4309398427 isRetracted "false" @default.
- W4309398427 workType "article" @default.