Matches in SemOpenAlex for { <https://semopenalex.org/work/W3035323998> ?p ?o ?g. }
- W3035323998 abstract "Visual attention not only improves the performance of image captioners, but also serves as a visual interpretation to qualitatively measure the caption rationality and model transparency. Specifically, we expect that a captioner can fix its attentive gaze on the correct objects while generating the corresponding words. This ability is also known as grounded image captioning. However, the grounding accuracy of existing captioners is far from satisfactory. To improve the grounding accuracy while retaining the captioning quality, it is expensive to collect the word-region alignment as strong supervision. To this end, we propose a Part-of-Speech (POS) enhanced image-text matching model (SCAN[24]): POS-SCAN, as the effective knowledge distillation for more grounded image captioning. The benefits are two-fold: 1) given a sentence and an image, POS-SCAN can ground the objects more accurately than SCAN; 2) POS-SCAN serves as a word-region alignment regularization for the captioner's visual attention module. By showing benchmark experimental results, we demonstrate that conventional image captioners equipped with POS-SCAN can significantly improve the grounding accuracy without strong supervision. Last but not least, we explore the indispensable Self-Critical Sequence Training (SCST[46]) in the context of grounded image captioning and show that the image-text matching score can serve as a reward for more grounded captioning." @default.
- W3035323998 created "2020-06-19" @default.
- W3035323998 creator A5003983265 @default.
- W3035323998 creator A5025436570 @default.
- W3035323998 creator A5036726873 @default.
- W3035323998 creator A5040235604 @default.
- W3035323998 creator A5042324027 @default.
- W3035323998 date "2020-06-01" @default.
- W3035323998 modified "2023-10-14" @default.
- W3035323998 title "More Grounded Image Captioning by Distilling Image-Text Matching Model" @default.
- W3035323998 cites W1773149199 @default.
- W3035323998 cites W1895577753 @default.
- W3035323998 cites W1905882502 @default.
- W3035323998 cites W1956340063 @default.
- W3035323998 cites W1969616664 @default.
- W3035323998 cites W2064675550 @default.
- W3035323998 cites W2101105183 @default.
- W3035323998 cites W2131774270 @default.
- W3035323998 cites W2133459682 @default.
- W3035323998 cites W2277195237 @default.
- W3035323998 cites W2302086703 @default.
- W3035323998 cites W2550553598 @default.
- W3035323998 cites W2558535589 @default.
- W3035323998 cites W2575842049 @default.
- W3035323998 cites W2740118378 @default.
- W3035323998 cites W2745461083 @default.
- W3035323998 cites W2778940641 @default.
- W3035323998 cites W2779827764 @default.
- W3035323998 cites W2795151422 @default.
- W3035323998 cites W2886300652 @default.
- W3035323998 cites W2887585070 @default.
- W3035323998 cites W2938603906 @default.
- W3035323998 cites W2962735233 @default.
- W3035323998 cites W2963084599 @default.
- W3035323998 cites W2963101956 @default.
- W3035323998 cites W2963109634 @default.
- W3035323998 cites W2963389687 @default.
- W3035323998 cites W2963445828 @default.
- W3035323998 cites W2963448089 @default.
- W3035323998 cites W2963914122 @default.
- W3035323998 cites W2964345792 @default.
- W3035323998 cites W2968101724 @default.
- W3035323998 cites W2981448908 @default.
- W3035323998 cites W2984121207 @default.
- W3035323998 cites W2986670728 @default.
- W3035323998 cites W2987809065 @default.
- W3035323998 cites W2989176720 @default.
- W3035323998 cites W2990069284 @default.
- W3035323998 cites W3035017890 @default.
- W3035323998 cites W3102424508 @default.
- W3035323998 cites W3103651098 @default.
- W3035323998 cites W639708223 @default.
- W3035323998 cites W753847829 @default.
- W3035323998 doi "https://doi.org/10.1109/cvpr42600.2020.00483" @default.
- W3035323998 hasPublicationYear "2020" @default.
- W3035323998 type Work @default.
- W3035323998 sameAs 3035323998 @default.
- W3035323998 citedByCount "69" @default.
- W3035323998 countsByYear W30353239982020 @default.
- W3035323998 countsByYear W30353239982021 @default.
- W3035323998 countsByYear W30353239982022 @default.
- W3035323998 countsByYear W30353239982023 @default.
- W3035323998 crossrefType "proceedings-article" @default.
- W3035323998 hasAuthorship W3035323998A5003983265 @default.
- W3035323998 hasAuthorship W3035323998A5025436570 @default.
- W3035323998 hasAuthorship W3035323998A5036726873 @default.
- W3035323998 hasAuthorship W3035323998A5040235604 @default.
- W3035323998 hasAuthorship W3035323998A5042324027 @default.
- W3035323998 hasBestOaLocation W30353239982 @default.
- W3035323998 hasConcept C105795698 @default.
- W3035323998 hasConcept C115961682 @default.
- W3035323998 hasConcept C151730666 @default.
- W3035323998 hasConcept C153180895 @default.
- W3035323998 hasConcept C154945302 @default.
- W3035323998 hasConcept C157657479 @default.
- W3035323998 hasConcept C165064840 @default.
- W3035323998 hasConcept C204321447 @default.
- W3035323998 hasConcept C2524010 @default.
- W3035323998 hasConcept C2777530160 @default.
- W3035323998 hasConcept C2779343474 @default.
- W3035323998 hasConcept C28490314 @default.
- W3035323998 hasConcept C31972630 @default.
- W3035323998 hasConcept C33923547 @default.
- W3035323998 hasConcept C41008148 @default.
- W3035323998 hasConcept C86803240 @default.
- W3035323998 hasConcept C90805587 @default.
- W3035323998 hasConceptScore W3035323998C105795698 @default.
- W3035323998 hasConceptScore W3035323998C115961682 @default.
- W3035323998 hasConceptScore W3035323998C151730666 @default.
- W3035323998 hasConceptScore W3035323998C153180895 @default.
- W3035323998 hasConceptScore W3035323998C154945302 @default.
- W3035323998 hasConceptScore W3035323998C157657479 @default.
- W3035323998 hasConceptScore W3035323998C165064840 @default.
- W3035323998 hasConceptScore W3035323998C204321447 @default.
- W3035323998 hasConceptScore W3035323998C2524010 @default.
- W3035323998 hasConceptScore W3035323998C2777530160 @default.
- W3035323998 hasConceptScore W3035323998C2779343474 @default.
- W3035323998 hasConceptScore W3035323998C28490314 @default.
- W3035323998 hasConceptScore W3035323998C31972630 @default.
- W3035323998 hasConceptScore W3035323998C33923547 @default.