Matches in SemOpenAlex for { <https://semopenalex.org/work/W3112857531> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W3112857531 endingPage "213446" @default.
- W3112857531 startingPage "213437" @default.
- W3112857531 abstract "Image Captioning is the task of providing a natural language description for an image. It has caught significant amounts of attention from both computer vision and natural language processing communities. Most image captioning models adopt deep encoder-decoder architectures to achieve state-of-the-art performances. However, it is difficult to model knowledge on relationships between input image region pairs in the encoder. Furthermore, the word in the decoder hardly knows the correlation to specific image regions. In this article, a novel deep encoder-decoder model is proposed for image captioning which is developed on sparse Transformer framework. The encoder adopts a multi-level representation of image features based on self-attention to exploit low-level and high-level features, naturally the correlations between image region pairs are adequately modeled as self-attention operation can be seen as a way of encoding pairwise relationships. The decoder improves the concentration of multi-head self-attention on the global context by explicitly selecting the most relevant segments at each row of the attention matrix. It can help the model focus on the more contributing image regions and generate more accurate words in the context. Experiments demonstrate that our model outperforms previous methods and achieves higher performance on MSCOCO and Flickr30k datasets. Our code is available at https://github.com/2014gaokao/ImageCaptioning." @default.
- W3112857531 created "2020-12-21" @default.
- W3112857531 creator A5042646508 @default.
- W3112857531 creator A5058737960 @default.
- W3112857531 creator A5062618254 @default.
- W3112857531 creator A5072602296 @default.
- W3112857531 creator A5088202431 @default.
- W3112857531 date "2020-01-01" @default.
- W3112857531 modified "2023-09-25" @default.
- W3112857531 title "A Sparse Transformer-Based Approach for Image Captioning" @default.
- W3112857531 cites W1773149199 @default.
- W3112857531 cites W1895577753 @default.
- W3112857531 cites W1905882502 @default.
- W3112857531 cites W1947481528 @default.
- W3112857531 cites W1956340063 @default.
- W3112857531 cites W2064675550 @default.
- W3112857531 cites W2108598243 @default.
- W3112857531 cites W2194775991 @default.
- W3112857531 cites W2277195237 @default.
- W3112857531 cites W2550553598 @default.
- W3112857531 cites W2575842049 @default.
- W3112857531 cites W2745461083 @default.
- W3112857531 cites W2795151422 @default.
- W3112857531 cites W2808206191 @default.
- W3112857531 cites W2885013662 @default.
- W3112857531 cites W2887585070 @default.
- W3112857531 cites W2890531016 @default.
- W3112857531 cites W2896348597 @default.
- W3112857531 cites W2901988662 @default.
- W3112857531 cites W2963084599 @default.
- W3112857531 cites W2963101956 @default.
- W3112857531 cites W2965359408 @default.
- W3112857531 cites W2965697393 @default.
- W3112857531 cites W2972897806 @default.
- W3112857531 cites W2979747405 @default.
- W3112857531 cites W2982553922 @default.
- W3112857531 cites W2984138079 @default.
- W3112857531 cites W2986670728 @default.
- W3112857531 cites W2990818246 @default.
- W3112857531 cites W3034655362 @default.
- W3112857531 cites W3034984754 @default.
- W3112857531 doi "https://doi.org/10.1109/access.2020.3024639" @default.
- W3112857531 hasPublicationYear "2020" @default.
- W3112857531 type Work @default.
- W3112857531 sameAs 3112857531 @default.
- W3112857531 citedByCount "1" @default.
- W3112857531 countsByYear W31128575312023 @default.
- W3112857531 crossrefType "journal-article" @default.
- W3112857531 hasAuthorship W3112857531A5042646508 @default.
- W3112857531 hasAuthorship W3112857531A5058737960 @default.
- W3112857531 hasAuthorship W3112857531A5062618254 @default.
- W3112857531 hasAuthorship W3112857531A5072602296 @default.
- W3112857531 hasAuthorship W3112857531A5088202431 @default.
- W3112857531 hasBestOaLocation W31128575311 @default.
- W3112857531 hasConcept C115961682 @default.
- W3112857531 hasConcept C119599485 @default.
- W3112857531 hasConcept C127413603 @default.
- W3112857531 hasConcept C154945302 @default.
- W3112857531 hasConcept C157657479 @default.
- W3112857531 hasConcept C165801399 @default.
- W3112857531 hasConcept C31972630 @default.
- W3112857531 hasConcept C41008148 @default.
- W3112857531 hasConcept C66322947 @default.
- W3112857531 hasConceptScore W3112857531C115961682 @default.
- W3112857531 hasConceptScore W3112857531C119599485 @default.
- W3112857531 hasConceptScore W3112857531C127413603 @default.
- W3112857531 hasConceptScore W3112857531C154945302 @default.
- W3112857531 hasConceptScore W3112857531C157657479 @default.
- W3112857531 hasConceptScore W3112857531C165801399 @default.
- W3112857531 hasConceptScore W3112857531C31972630 @default.
- W3112857531 hasConceptScore W3112857531C41008148 @default.
- W3112857531 hasConceptScore W3112857531C66322947 @default.
- W3112857531 hasFunder F4320321001 @default.
- W3112857531 hasFunder F4320321885 @default.
- W3112857531 hasLocation W31128575311 @default.
- W3112857531 hasLocation W31128575312 @default.
- W3112857531 hasOpenAccess W3112857531 @default.
- W3112857531 hasPrimaryLocation W31128575311 @default.
- W3112857531 hasRelatedWork W2130228941 @default.
- W3112857531 hasRelatedWork W2161229648 @default.
- W3112857531 hasRelatedWork W2547835662 @default.
- W3112857531 hasRelatedWork W2905654560 @default.
- W3112857531 hasRelatedWork W2923366293 @default.
- W3112857531 hasRelatedWork W2993674027 @default.
- W3112857531 hasRelatedWork W3008515501 @default.
- W3112857531 hasRelatedWork W3183824823 @default.
- W3112857531 hasRelatedWork W4307856881 @default.
- W3112857531 hasRelatedWork W4320016117 @default.
- W3112857531 hasVolume "8" @default.
- W3112857531 isParatext "false" @default.
- W3112857531 isRetracted "false" @default.
- W3112857531 magId "3112857531" @default.
- W3112857531 workType "article" @default.
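
The abstract recorded above describes a decoder that explicitly selects the most relevant segments in each row of the attention matrix. The sketch below illustrates that general idea in plain Python, assuming row-wise top-k selection followed by a softmax. The function name `topk_sparse_attention`, the parameter `k`, and the masking rule are illustrative assumptions and are not taken from the authors' code at the repository linked in the abstract.

```python
import math


def topk_sparse_attention(scores, values, k):
    """Row-wise top-k sparse attention over pre-computed scores (illustrative).

    scores: list of rows, one per query, each a list of raw attention logits.
    values: list of value vectors, one per key (all the same length).
    k:      hypothetical knob, the number of keys each query may attend to.

    Returns (outputs, weights): the attended vectors and the sparse weights.
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    dim = len(values[0])
    outputs, all_weights = [], []
    for row in scores:
        # The k-th largest logit in this row becomes the cut-off.
        # Ties at the cut-off are all kept, so a row may keep more than k keys.
        cutoff = sorted(row, reverse=True)[min(k, len(row)) - 1]

        # Logits below the cut-off are masked to -inf, so softmax gives them 0.
        masked = [s if s >= cutoff else float("-inf") for s in row]

        # Numerically stable softmax over the surviving logits.
        peak = max(masked)
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights = [e / total for e in exps]

        # Weighted sum of the value vectors, one output vector per query.
        attended = [
            sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)
        ]
        outputs.append(attended)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    scores = [[2.0, 1.0, 0.0], [0.0, 0.5, 5.0]]
    values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    out, w = topk_sparse_attention(scores, values, k=2)
    print(w)
    print(out)
```

With a small `k`, each query row spreads its weight over only a few keys and every other key receives exactly zero weight, which is the sparsity the abstract refers to. How the published model chooses its selection size and where in the decoder the selection is applied is not stated in this record.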