Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313172969> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
- W4313172969 abstract "Language modality within the vision language pre-training framework is innately discretized, endowing each word in the language vocabulary a semantic meaning. In contrast, visual modality is inherently continuous and high-dimensional, which potentially prohibits the alignment as well as fusion between vision and language modalities. We therefore propose to discretize the visual representation by joint learning a codebook that imbues each visual token a semantic. We then utilize these discretized visual semantics as self-supervised ground-truths for building our Masked Image Modeling objective, a counterpart of Masked Language Modeling which proves successful for language models. To optimize the codebook, we extend the formulation of VQ-VAE which gives a theoretic guarantee. Experiments validate the effectiveness of our approach across common vision-language benchmarks." @default.
- W4313172969 created "2023-01-06" @default.
- W4313172969 creator A5024679224 @default.
- W4313172969 creator A5025487788 @default.
- W4313172969 creator A5051293560 @default.
- W4313172969 creator A5065995768 @default.
- W4313172969 creator A5076075666 @default.
- W4313172969 date "2022-08-21" @default.
- W4313172969 modified "2023-09-27" @default.
- W4313172969 title "Augmenting Vision Language Pretraining by Learning Codebook with Visual Semantics" @default.
- W4313172969 cites W1773149199 @default.
- W4313172969 cites W2277195237 @default.
- W4313172969 cites W2560730294 @default.
- W4313172969 cites W2886641317 @default.
- W4313172969 cites W2962858109 @default.
- W4313172969 cites W2970231061 @default.
- W4313172969 cites W2981851019 @default.
- W4313172969 cites W2998356391 @default.
- W4313172969 cites W3034727271 @default.
- W4313172969 cites W3173220247 @default.
- W4313172969 cites W3179883066 @default.
- W4313172969 cites W3184784418 @default.
- W4313172969 cites W4312460555 @default.
- W4313172969 cites W4312877428 @default.
- W4313172969 doi "https://doi.org/10.1109/icpr56361.2022.9956616" @default.
- W4313172969 hasPublicationYear "2022" @default.
- W4313172969 type Work @default.
- W4313172969 citedByCount "0" @default.
- W4313172969 crossrefType "proceedings-article" @default.
- W4313172969 hasAuthorship W4313172969A5024679224 @default.
- W4313172969 hasAuthorship W4313172969A5025487788 @default.
- W4313172969 hasAuthorship W4313172969A5051293560 @default.
- W4313172969 hasAuthorship W4313172969A5065995768 @default.
- W4313172969 hasAuthorship W4313172969A5076075666 @default.
- W4313172969 hasBestOaLocation W43131729692 @default.
- W4313172969 hasConcept C127759330 @default.
- W4313172969 hasConcept C138885662 @default.
- W4313172969 hasConcept C154945302 @default.
- W4313172969 hasConcept C184337299 @default.
- W4313172969 hasConcept C199360897 @default.
- W4313172969 hasConcept C204321447 @default.
- W4313172969 hasConcept C2777601683 @default.
- W4313172969 hasConcept C41008148 @default.
- W4313172969 hasConcept C41895202 @default.
- W4313172969 hasConceptScore W4313172969C127759330 @default.
- W4313172969 hasConceptScore W4313172969C138885662 @default.
- W4313172969 hasConceptScore W4313172969C154945302 @default.
- W4313172969 hasConceptScore W4313172969C184337299 @default.
- W4313172969 hasConceptScore W4313172969C199360897 @default.
- W4313172969 hasConceptScore W4313172969C204321447 @default.
- W4313172969 hasConceptScore W4313172969C2777601683 @default.
- W4313172969 hasConceptScore W4313172969C41008148 @default.
- W4313172969 hasConceptScore W4313172969C41895202 @default.
- W4313172969 hasLocation W43131729691 @default.
- W4313172969 hasLocation W43131729692 @default.
- W4313172969 hasOpenAccess W4313172969 @default.
- W4313172969 hasPrimaryLocation W43131729691 @default.
- W4313172969 hasRelatedWork W1541271503 @default.
- W4313172969 hasRelatedWork W1789705271 @default.
- W4313172969 hasRelatedWork W2103733568 @default.
- W4313172969 hasRelatedWork W2351631223 @default.
- W4313172969 hasRelatedWork W2385578066 @default.
- W4313172969 hasRelatedWork W2611614995 @default.
- W4313172969 hasRelatedWork W2935589 @default.
- W4313172969 hasRelatedWork W3033110060 @default.
- W4313172969 hasRelatedWork W3107474891 @default.
- W4313172969 hasRelatedWork W4236003019 @default.
- W4313172969 isParatext "false" @default.
- W4313172969 isRetracted "false" @default.
- W4313172969 workType "article" @default.