Matches in SemOpenAlex for { <https://semopenalex.org/work/W3155420132> ?p ?o ?g. }
- W3155420132 abstract "Recently, the vision transformer (ViT) architecture, whose backbone consists purely of self-attention mechanisms, has achieved very promising performance in visual classification. However, the high performance of the original ViT depends heavily on pretraining with ultra-large-scale datasets, and it significantly underperforms on ImageNet-1K when trained from scratch. This paper works toward addressing this problem by carefully considering the role of visual tokens. First, for the classification head, the existing ViT exploits only the class token while entirely neglecting the rich semantic information inherent in high-level visual tokens. Therefore, we propose a new classification paradigm in which second-order, cross-covariance pooling of visual tokens is combined with the class token for final classification. In addition, a fast singular value power normalization is proposed to improve the second-order pooling. Second, the original ViT employs a naive embedding of fixed-size image patches, lacking the ability to model translation equivariance and locality. To alleviate this problem, we develop a lightweight, hierarchical module based on off-the-shelf convolutions for visual token embedding. The proposed architecture, which we call So-ViT, is thoroughly evaluated on ImageNet-1K. The results show that our models, when trained from scratch, outperform the competing ViT variants while being on par with or better than state-of-the-art CNN models. Code is available at this https URL" @default.
- W3155420132 created "2021-04-26" @default.
- W3155420132 creator A5029031269 @default.
- W3155420132 creator A5040217958 @default.
- W3155420132 creator A5052272812 @default.
- W3155420132 creator A5070918242 @default.
- W3155420132 creator A5089387872 @default.
- W3155420132 date "2021-04-22" @default.
- W3155420132 modified "2023-10-03" @default.
- W3155420132 title "So-ViT: Mind Visual Tokens for Vision Transformer." @default.
- W3155420132 cites W1584270973 @default.
- W3155420132 cites W1663973292 @default.
- W3155420132 cites W1967722715 @default.
- W3155420132 cites W1978601068 @default.
- W3155420132 cites W2183341477 @default.
- W3155420132 cites W2194775991 @default.
- W3155420132 cites W2237132896 @default.
- W3155420132 cites W2502312327 @default.
- W3155420132 cites W2936614765 @default.
- W3155420132 cites W2950094539 @default.
- W3155420132 cites W2963091558 @default.
- W3155420132 cites W2963446712 @default.
- W3155420132 cites W2970971581 @default.
- W3155420132 hasPublicationYear "2021" @default.
- W3155420132 type Work @default.
- W3155420132 sameAs 3155420132 @default.
- W3155420132 citedByCount "4" @default.
- W3155420132 countsByYear W31554201322021 @default.
- W3155420132 crossrefType "posted-content" @default.
- W3155420132 hasAuthorship W3155420132A5029031269 @default.
- W3155420132 hasAuthorship W3155420132A5040217958 @default.
- W3155420132 hasAuthorship W3155420132A5052272812 @default.
- W3155420132 hasAuthorship W3155420132A5070918242 @default.
- W3155420132 hasAuthorship W3155420132A5089387872 @default.
- W3155420132 hasConcept C119857082 @default.
- W3155420132 hasConcept C121332964 @default.
- W3155420132 hasConcept C123657996 @default.
- W3155420132 hasConcept C136886441 @default.
- W3155420132 hasConcept C138885662 @default.
- W3155420132 hasConcept C142362112 @default.
- W3155420132 hasConcept C144024400 @default.
- W3155420132 hasConcept C153180895 @default.
- W3155420132 hasConcept C153349607 @default.
- W3155420132 hasConcept C154945302 @default.
- W3155420132 hasConcept C165696696 @default.
- W3155420132 hasConcept C165801399 @default.
- W3155420132 hasConcept C19165224 @default.
- W3155420132 hasConcept C2779808786 @default.
- W3155420132 hasConcept C38652104 @default.
- W3155420132 hasConcept C41008148 @default.
- W3155420132 hasConcept C41608201 @default.
- W3155420132 hasConcept C41895202 @default.
- W3155420132 hasConcept C48145219 @default.
- W3155420132 hasConcept C62520636 @default.
- W3155420132 hasConcept C66322947 @default.
- W3155420132 hasConcept C70437156 @default.
- W3155420132 hasConcept C97541855 @default.
- W3155420132 hasConceptScore W3155420132C119857082 @default.
- W3155420132 hasConceptScore W3155420132C121332964 @default.
- W3155420132 hasConceptScore W3155420132C123657996 @default.
- W3155420132 hasConceptScore W3155420132C136886441 @default.
- W3155420132 hasConceptScore W3155420132C138885662 @default.
- W3155420132 hasConceptScore W3155420132C142362112 @default.
- W3155420132 hasConceptScore W3155420132C144024400 @default.
- W3155420132 hasConceptScore W3155420132C153180895 @default.
- W3155420132 hasConceptScore W3155420132C153349607 @default.
- W3155420132 hasConceptScore W3155420132C154945302 @default.
- W3155420132 hasConceptScore W3155420132C165696696 @default.
- W3155420132 hasConceptScore W3155420132C165801399 @default.
- W3155420132 hasConceptScore W3155420132C19165224 @default.
- W3155420132 hasConceptScore W3155420132C2779808786 @default.
- W3155420132 hasConceptScore W3155420132C38652104 @default.
- W3155420132 hasConceptScore W3155420132C41008148 @default.
- W3155420132 hasConceptScore W3155420132C41608201 @default.
- W3155420132 hasConceptScore W3155420132C41895202 @default.
- W3155420132 hasConceptScore W3155420132C48145219 @default.
- W3155420132 hasConceptScore W3155420132C62520636 @default.
- W3155420132 hasConceptScore W3155420132C66322947 @default.
- W3155420132 hasConceptScore W3155420132C70437156 @default.
- W3155420132 hasConceptScore W3155420132C97541855 @default.
- W3155420132 hasLocation W31554201321 @default.
- W3155420132 hasOpenAccess W3155420132 @default.
- W3155420132 hasPrimaryLocation W31554201321 @default.
- W3155420132 hasRelatedWork W2234895705 @default.
- W3155420132 hasRelatedWork W2781459369 @default.
- W3155420132 hasRelatedWork W2918598763 @default.
- W3155420132 hasRelatedWork W2952074843 @default.
- W3155420132 hasRelatedWork W2969730556 @default.
- W3155420132 hasRelatedWork W3029678209 @default.
- W3155420132 hasRelatedWork W3091985378 @default.
- W3155420132 hasRelatedWork W3129121463 @default.
- W3155420132 hasRelatedWork W3166658420 @default.
- W3155420132 hasRelatedWork W3173053527 @default.
- W3155420132 hasRelatedWork W3173631098 @default.
- W3155420132 hasRelatedWork W3178389175 @default.
- W3155420132 hasRelatedWork W3197083248 @default.
- W3155420132 hasRelatedWork W3202033052 @default.
- W3155420132 hasRelatedWork W3202053489 @default.
- W3155420132 hasRelatedWork W3203055845 @default.
- W3155420132 hasRelatedWork W3204826552 @default.
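
The abstract above describes second-order, cross-covariance pooling of visual tokens followed by a singular value power normalization. The following is a minimal NumPy sketch of that idea, not the authors' implementation. It makes several assumptions that are not in the record: the two projection matrices `w_left` and `w_right`, the exponent `alpha=0.5`, and the use of an exact SVD in place of the fast normalization the abstract mentions. All function and parameter names are hypothetical.

```python
import numpy as np


def cross_covariance_pooling(tokens, w_left, w_right, alpha=0.5):
    """Pool visual tokens into a power-normalized cross-covariance matrix.

    tokens:  (n, d) array, n visual tokens with d features each.
    w_left:  (d, p) projection for the first token view (hypothetical).
    w_right: (d, q) projection for the second token view (hypothetical).
    alpha:   exponent applied to the singular values (assumed value).
    """
    # Two different linear views of the same tokens.
    u = tokens @ w_left
    v = tokens @ w_right

    # Center each view over the token axis.
    u = u - u.mean(axis=0, keepdims=True)
    v = v - v.mean(axis=0, keepdims=True)

    # Cross-covariance between the two views, shape (p, q).
    cov = u.T @ v / tokens.shape[0]

    # Exact SVD stands in for the "fast" normalization from the abstract.
    left, sing, right_t = np.linalg.svd(cov, full_matrices=False)

    # Raise each singular value to the power alpha and rebuild the matrix.
    return (left * sing**alpha) @ right_t


# Example with random data: 196 tokens, 64 features, 8-dim projections.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((196, 64))
w_left = rng.standard_normal((64, 8))
w_right = rng.standard_normal((64, 8))
pooled = cross_covariance_pooling(tokens, w_left, w_right)
print(pooled.shape)
```

With `alpha=1` the function returns the plain cross-covariance matrix. Smaller exponents compress the gap between large and small singular values. How the pooled matrix is combined with the class token for the final prediction is not specified in the abstract, so that step is left out here.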