Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385346076> ?p ?o ?g. }
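The header above is a SPARQL-style graph pattern: every quad whose subject is the work IRI, with predicate `?p`, object `?o`, and named graph `?g` left free. As a minimal sketch, the query behind such a listing could be built like this in Python (the exact endpoint and result format are assumptions — SemOpenAlex exposes a public SPARQL endpoint, but this snippet only constructs the query string):

```python
# Sketch: reconstruct the SPARQL query implied by the pattern
# { <work-IRI> ?p ?o ?g . } shown in the listing header.
# The endpoint URL below is an assumption, not taken from the listing.

SEMOPENALEX_ENDPOINT = "https://semopenalex.org/sparql"  # assumed public endpoint
WORK_IRI = "https://semopenalex.org/work/W4385346076"

def build_query(work_iri: str) -> str:
    """Return a SPARQL query selecting every (predicate, object, graph)
    quad attached to the given work IRI."""
    return (
        "SELECT ?p ?o ?g WHERE {\n"
        f"  GRAPH ?g {{ <{work_iri}> ?p ?o . }}\n"
        "}"
    )

if __name__ == "__main__":
    # To actually run it, POST the query to SEMOPENALEX_ENDPOINT with
    # Accept: application/sparql-results+json (e.g. via urllib or requests).
    print(build_query(WORK_IRI))
```

Each row of the listing then corresponds to one `?p ?o` binding (e.g. `cites W1861492603`), with `@default` as the graph label.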
- W4385346076 endingPage "752" @default.
- W4385346076 startingPage "733" @default.
- W4385346076 abstract "Abstract While originally designed for natural language processing tasks, the self-attention mechanism has recently taken various computer vision areas by storm. However, the 2D nature of images brings three challenges for applying self-attention in computer vision: (1) treating images as 1D sequences neglects their 2D structures; (2) the quadratic complexity is too expensive for high-resolution images; (3) it only captures spatial adaptability but ignores channel adaptability. In this paper, we propose a novel linear attention named large kernel attention (LKA) to enable self-adaptive and long-range correlations in self-attention while avoiding its shortcomings. Furthermore, we present a neural network based on LKA, namely Visual Attention Network (VAN). While extremely simple, VAN achieves results comparable to similarly sized convolutional neural networks (CNNs) and vision transformers (ViTs) in various tasks, including image classification, object detection, semantic segmentation, panoptic segmentation, pose estimation, etc. For example, VAN-B6 achieves 87.8% accuracy on the ImageNet benchmark and sets new state-of-the-art performance (58.2 PQ) for panoptic segmentation. Besides, VAN-B2 surpasses Swin-T by 4 mIoU (50.1 vs. 46.1) for semantic segmentation on the ADE20K benchmark and by 2.6 AP (48.8 vs. 46.2) for object detection on the COCO dataset. It provides a novel method and a simple yet strong baseline for the community. The code is available at https://github.com/Visual-Attention-Network." @default.
- W4385346076 created "2023-07-29" @default.
- W4385346076 creator A5013917396 @default.
- W4385346076 creator A5018791285 @default.
- W4385346076 creator A5037131575 @default.
- W4385346076 creator A5037233582 @default.
- W4385346076 creator A5065936389 @default.
- W4385346076 date "2023-07-28" @default.
- W4385346076 modified "2023-10-11" @default.
- W4385346076 title "Visual attention network" @default.
- W4385346076 cites W1861492603 @default.
- W4385346076 cites W2030031014 @default.
- W4385346076 cites W2039313011 @default.
- W4385346076 cites W2040870580 @default.
- W4385346076 cites W2086161653 @default.
- W4385346076 cites W2086791339 @default.
- W4385346076 cites W2089597841 @default.
- W4385346076 cites W2097117768 @default.
- W4385346076 cites W2108598243 @default.
- W4385346076 cites W2112796928 @default.
- W4385346076 cites W2147800946 @default.
- W4385346076 cites W2149095485 @default.
- W4385346076 cites W2160903697 @default.
- W4385346076 cites W2194775991 @default.
- W4385346076 cites W2295107390 @default.
- W4385346076 cites W2507296351 @default.
- W4385346076 cites W2549139847 @default.
- W4385346076 cites W2550553598 @default.
- W4385346076 cites W2601564443 @default.
- W4385346076 cites W2618530766 @default.
- W4385346076 cites W2740667773 @default.
- W4385346076 cites W2752782242 @default.
- W4385346076 cites W2766736793 @default.
- W4385346076 cites W2798270772 @default.
- W4385346076 cites W2884585870 @default.
- W4385346076 cites W2884822772 @default.
- W4385346076 cites W2910628332 @default.
- W4385346076 cites W2928165649 @default.
- W4385346076 cites W2955058313 @default.
- W4385346076 cites W2962858109 @default.
- W4385346076 cites W2963091558 @default.
- W4385346076 cites W2963125010 @default.
- W4385346076 cites W2963150697 @default.
- W4385346076 cites W2963163009 @default.
- W4385346076 cites W2963351448 @default.
- W4385346076 cites W2963402313 @default.
- W4385346076 cites W2963446712 @default.
- W4385346076 cites W2963495494 @default.
- W4385346076 cites W2964080601 @default.
- W4385346076 cites W2981413347 @default.
- W4385346076 cites W2981689412 @default.
- W4385346076 cites W2989676862 @default.
- W4385346076 cites W2992308087 @default.
- W4385346076 cites W2998508940 @default.
- W4385346076 cites W3014641072 @default.
- W4385346076 cites W3034552520 @default.
- W4385346076 cites W3035396860 @default.
- W4385346076 cites W3096609285 @default.
- W4385346076 cites W3100341797 @default.
- W4385346076 cites W3109301572 @default.
- W4385346076 cites W3113755791 @default.
- W4385346076 cites W3121523901 @default.
- W4385346076 cites W3131500599 @default.
- W4385346076 cites W3138516171 @default.
- W4385346076 cites W3153465022 @default.
- W4385346076 cites W3163465952 @default.
- W4385346076 cites W3172509117 @default.
- W4385346076 cites W3172752666 @default.
- W4385346076 cites W3175515048 @default.
- W4385346076 cites W3186477774 @default.
- W4385346076 cites W3201844719 @default.
- W4385346076 cites W3209859545 @default.
- W4385346076 cites W3212386989 @default.
- W4385346076 cites W3212972574 @default.
- W4385346076 cites W4214493665 @default.
- W4385346076 cites W4214634256 @default.
- W4385346076 cites W4214666412 @default.
- W4385346076 cites W4214709605 @default.
- W4385346076 cites W4225829036 @default.
- W4385346076 cites W4226334005 @default.
- W4385346076 cites W4293680532 @default.
- W4385346076 cites W4302275239 @default.
- W4385346076 cites W4312443924 @default.
- W4385346076 cites W4312815172 @default.
- W4385346076 cites W4312820606 @default.
- W4385346076 cites W4312950730 @default.
- W4385346076 cites W4312977443 @default.
- W4385346076 cites W4313156423 @default.
- W4385346076 doi "https://doi.org/10.1007/s41095-023-0364-2" @default.
- W4385346076 hasPublicationYear "2023" @default.
- W4385346076 type Work @default.
- W4385346076 citedByCount "4" @default.
- W4385346076 countsByYear W43853460762023 @default.
- W4385346076 crossrefType "journal-article" @default.
- W4385346076 hasAuthorship W4385346076A5013917396 @default.
- W4385346076 hasAuthorship W4385346076A5018791285 @default.
- W4385346076 hasAuthorship W4385346076A5037131575 @default.
- W4385346076 hasAuthorship W4385346076A5037233582 @default.