Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386076218> ?p ?o ?g. }
- W4386076218 abstract "Humans possess a versatile mechanism for extracting structured representations of our visual world. When looking at an image, we can decompose the scene into entities and their parts as well as obtain the dependencies between them. To mimic such capability, we propose Visual Dependency Transformers (DependencyViT, https://github.com/dingmyu/DependencyViT) that can induce visual dependencies without any labels. We achieve that with a novel neural operator called reversed attention that can naturally capture long-range visual dependencies between image patches. Specifically, we formulate it as a dependency graph where a child token in reversed attention is trained to attend to its parent tokens and send information following a normalized probability distribution rather than gathering information in conventional self-attention. With such a design, hierarchies naturally emerge from reversed attention layers, and a dependency tree is progressively induced from leaf nodes to the root node unsupervisedly. DependencyViT offers several appealing benefits. (i) Entities and their parts in an image are represented by different subtrees, enabling part partitioning from dependencies; (ii) Dynamic visual pooling is made possible. The leaf nodes which rarely send messages can be pruned without hindering the model performance, based on which we propose the lightweight DependencyViT-Lite to reduce the computational and memory footprints; (iii) DependencyViT works well on both self- and weakly-supervised pretraining paradigms on ImageNet, and demonstrates its effectiveness on 8 datasets and 5 tasks, such as unsupervised part and saliency segmentation, recognition, and detection." @default.
- W4386076218 created "2023-08-23" @default.
- W4386076218 creator A5017218416 @default.
- W4386076218 creator A5017763539 @default.
- W4386076218 creator A5040877128 @default.
- W4386076218 creator A5042636572 @default.
- W4386076218 creator A5047843932 @default.
- W4386076218 creator A5073742611 @default.
- W4386076218 creator A5075158591 @default.
- W4386076218 creator A5081201135 @default.
- W4386076218 date "2023-06-01" @default.
- W4386076218 modified "2023-10-03" @default.
- W4386076218 title "Visual Dependency Transformers: Dependency Tree Emerges from Reversed Attention" @default.
- W4386076218 cites W1920234547 @default.
- W4386076218 cites W2002781701 @default.
- W4386076218 cites W2007339694 @default.
- W4386076218 cites W2039313011 @default.
- W4386076218 cites W2047670868 @default.
- W4386076218 cites W2056860348 @default.
- W4386076218 cites W2092919213 @default.
- W4386076218 cites W2104408738 @default.
- W4386076218 cites W2108598243 @default.
- W4386076218 cites W2121947440 @default.
- W4386076218 cites W2150593711 @default.
- W4386076218 cites W2168356304 @default.
- W4386076218 cites W2194775991 @default.
- W4386076218 cites W2222512263 @default.
- W4386076218 cites W2254462240 @default.
- W4386076218 cites W2294182682 @default.
- W4386076218 cites W2737258237 @default.
- W4386076218 cites W2740667773 @default.
- W4386076218 cites W2895526696 @default.
- W4386076218 cites W2962784289 @default.
- W4386076218 cites W2963143796 @default.
- W4386076218 cites W2963150697 @default.
- W4386076218 cites W2963318290 @default.
- W4386076218 cites W2997891818 @default.
- W4386076218 cites W3009224666 @default.
- W4386076218 cites W3034230284 @default.
- W4386076218 cites W3035638749 @default.
- W4386076218 cites W3113037084 @default.
- W4386076218 cites W3131500599 @default.
- W4386076218 cites W3138516171 @default.
- W4386076218 cites W3139445856 @default.
- W4386076218 cites W3145450063 @default.
- W4386076218 cites W3151130473 @default.
- W4386076218 cites W3151348293 @default.
- W4386076218 cites W3159481202 @default.
- W4386076218 cites W3164267792 @default.
- W4386076218 cites W3166171970 @default.
- W4386076218 cites W3170898657 @default.
- W4386076218 cites W3172509117 @default.
- W4386076218 cites W3174480456 @default.
- W4386076218 cites W3179869055 @default.
- W4386076218 cites W3203701986 @default.
- W4386076218 cites W4205969993 @default.
- W4386076218 cites W4214493665 @default.
- W4386076218 cites W4214588794 @default.
- W4386076218 cites W4214636423 @default.
- W4386076218 cites W4214669216 @default.
- W4386076218 cites W4221161778 @default.
- W4386076218 cites W4221167012 @default.
- W4386076218 cites W4312689172 @default.
- W4386076218 doi "https://doi.org/10.1109/cvpr52729.2023.01396" @default.
- W4386076218 hasPublicationYear "2023" @default.
- W4386076218 type Work @default.
- W4386076218 citedByCount "0" @default.
- W4386076218 crossrefType "proceedings-article" @default.
- W4386076218 hasAuthorship W4386076218A5017218416 @default.
- W4386076218 hasAuthorship W4386076218A5017763539 @default.
- W4386076218 hasAuthorship W4386076218A5040877128 @default.
- W4386076218 hasAuthorship W4386076218A5042636572 @default.
- W4386076218 hasAuthorship W4386076218A5047843932 @default.
- W4386076218 hasAuthorship W4386076218A5073742611 @default.
- W4386076218 hasAuthorship W4386076218A5075158591 @default.
- W4386076218 hasAuthorship W4386076218A5081201135 @default.
- W4386076218 hasConcept C113174947 @default.
- W4386076218 hasConcept C119857082 @default.
- W4386076218 hasConcept C132525143 @default.
- W4386076218 hasConcept C134306372 @default.
- W4386076218 hasConcept C153180895 @default.
- W4386076218 hasConcept C154945302 @default.
- W4386076218 hasConcept C16311509 @default.
- W4386076218 hasConcept C19768560 @default.
- W4386076218 hasConcept C33923547 @default.
- W4386076218 hasConcept C36464697 @default.
- W4386076218 hasConcept C38652104 @default.
- W4386076218 hasConcept C41008148 @default.
- W4386076218 hasConcept C48145219 @default.
- W4386076218 hasConcept C70437156 @default.
- W4386076218 hasConcept C80444323 @default.
- W4386076218 hasConceptScore W4386076218C113174947 @default.
- W4386076218 hasConceptScore W4386076218C119857082 @default.
- W4386076218 hasConceptScore W4386076218C132525143 @default.
- W4386076218 hasConceptScore W4386076218C134306372 @default.
- W4386076218 hasConceptScore W4386076218C153180895 @default.
- W4386076218 hasConceptScore W4386076218C154945302 @default.
- W4386076218 hasConceptScore W4386076218C16311509 @default.
- W4386076218 hasConceptScore W4386076218C19768560 @default.
- W4386076218 hasConceptScore W4386076218C33923547 @default.