Matches in SemOpenAlex for { <https://semopenalex.org/work/W4300485503> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4300485503 abstract "The conventional encoder-decoder framework for image captioning generally adopts a single-pass decoding process, which predicts the target descriptive sentence word by word in temporal order. Despite the great success of this framework, it still suffers from two serious disadvantages. Firstly, it is unable to correct the mistakes in the predicted words, which may mislead the subsequent prediction and result in error accumulation problem. Secondly, such a framework can only leverage the already generated words but not the possible future words, and thus lacks the ability of global planning on linguistic information. To overcome these limitations, we explore a universal two-pass decoding framework, where a single-pass decoding based model serving as the Drafting Model first generates a draft caption according to an input image, and a Deliberation Model then performs the polishing process to refine the draft caption to a better image description. Furthermore, inspired from the complementarity between different modalities, we propose a novel Cross Modification Attention (CMA) module to enhance the semantic expression of the image features and filter out error information from the draft captions. We integrate CMA with the decoder of our Deliberation Model and name it as Cross Modification Attention based Deliberation Model (CMA-DM). We train our proposed framework by jointly optimizing all trainable components from scratch with a trade-off coefficient. Experiments on MS COCO dataset demonstrate that our approach obtains significant improvements over single-pass decoding baselines and achieves competitive performances compared with other state-of-the-art two-pass decoding based methods." @default.
- W4300485503 created "2022-10-03" @default.
- W4300485503 creator A5016179282 @default.
- W4300485503 creator A5039766174 @default.
- W4300485503 creator A5060640583 @default.
- W4300485503 creator A5068951778 @default.
- W4300485503 creator A5070812231 @default.
- W4300485503 date "2021-09-17" @default.
- W4300485503 modified "2023-09-23" @default.
- W4300485503 title "Cross Modification Attention Based Deliberation Model for Image Captioning" @default.
- W4300485503 doi "https://doi.org/10.48550/arxiv.2109.08411" @default.
- W4300485503 hasPublicationYear "2021" @default.
- W4300485503 type Work @default.
- W4300485503 citedByCount "0" @default.
- W4300485503 crossrefType "posted-content" @default.
- W4300485503 hasAuthorship W4300485503A5016179282 @default.
- W4300485503 hasAuthorship W4300485503A5039766174 @default.
- W4300485503 hasAuthorship W4300485503A5060640583 @default.
- W4300485503 hasAuthorship W4300485503A5068951778 @default.
- W4300485503 hasAuthorship W4300485503A5070812231 @default.
- W4300485503 hasBestOaLocation W43004855031 @default.
- W4300485503 hasConcept C106131492 @default.
- W4300485503 hasConcept C111919701 @default.
- W4300485503 hasConcept C11413529 @default.
- W4300485503 hasConcept C115961682 @default.
- W4300485503 hasConcept C118505674 @default.
- W4300485503 hasConcept C153083717 @default.
- W4300485503 hasConcept C154945302 @default.
- W4300485503 hasConcept C157657479 @default.
- W4300485503 hasConcept C17744445 @default.
- W4300485503 hasConcept C199539241 @default.
- W4300485503 hasConcept C204321447 @default.
- W4300485503 hasConcept C2776946740 @default.
- W4300485503 hasConcept C2777530160 @default.
- W4300485503 hasConcept C28490314 @default.
- W4300485503 hasConcept C31972630 @default.
- W4300485503 hasConcept C41008148 @default.
- W4300485503 hasConcept C57273362 @default.
- W4300485503 hasConcept C94625758 @default.
- W4300485503 hasConcept C98045186 @default.
- W4300485503 hasConceptScore W4300485503C106131492 @default.
- W4300485503 hasConceptScore W4300485503C111919701 @default.
- W4300485503 hasConceptScore W4300485503C11413529 @default.
- W4300485503 hasConceptScore W4300485503C115961682 @default.
- W4300485503 hasConceptScore W4300485503C118505674 @default.
- W4300485503 hasConceptScore W4300485503C153083717 @default.
- W4300485503 hasConceptScore W4300485503C154945302 @default.
- W4300485503 hasConceptScore W4300485503C157657479 @default.
- W4300485503 hasConceptScore W4300485503C17744445 @default.
- W4300485503 hasConceptScore W4300485503C199539241 @default.
- W4300485503 hasConceptScore W4300485503C204321447 @default.
- W4300485503 hasConceptScore W4300485503C2776946740 @default.
- W4300485503 hasConceptScore W4300485503C2777530160 @default.
- W4300485503 hasConceptScore W4300485503C28490314 @default.
- W4300485503 hasConceptScore W4300485503C31972630 @default.
- W4300485503 hasConceptScore W4300485503C41008148 @default.
- W4300485503 hasConceptScore W4300485503C57273362 @default.
- W4300485503 hasConceptScore W4300485503C94625758 @default.
- W4300485503 hasConceptScore W4300485503C98045186 @default.
- W4300485503 hasLocation W43004855031 @default.
- W4300485503 hasOpenAccess W4300485503 @default.
- W4300485503 hasPrimaryLocation W43004855031 @default.
- W4300485503 hasRelatedWork W1978971213 @default.
- W4300485503 hasRelatedWork W2547835662 @default.
- W4300485503 hasRelatedWork W2788710361 @default.
- W4300485503 hasRelatedWork W2795359650 @default.
- W4300485503 hasRelatedWork W2903179935 @default.
- W4300485503 hasRelatedWork W2923366293 @default.
- W4300485503 hasRelatedWork W2963421891 @default.
- W4300485503 hasRelatedWork W3008515501 @default.
- W4300485503 hasRelatedWork W4281560470 @default.
- W4300485503 hasRelatedWork W4366341475 @default.
- W4300485503 isParatext "false" @default.
- W4300485503 isRetracted "false" @default.
- W4300485503 workType "article" @default.
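
The listing above can be regrouped by predicate for easier inspection. The following is a minimal sketch, assuming each line has the shape `- SUBJECT PREDICATE OBJECT @GRAPH.` as it appears in this listing; the helper name `group_by_predicate` and the regular expression are illustrative assumptions, not part of SemOpenAlex or any of its tooling.

```python
import re
from collections import defaultdict

# Assumed line shape (taken from this listing, not from an official format):
#   - SUBJECT PREDICATE OBJECT @GRAPH.
LINE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+?)\.$')

def group_by_predicate(lines):
    """Hypothetical helper: map each predicate to the list of its objects."""
    grouped = defaultdict(list)
    for line in lines:
        m = LINE.match(line.strip())
        if m is None:
            continue  # skip headers or anything that is not a listing row
        _subject, predicate, obj, _graph = m.groups()
        # Literal values are quoted in the listing; drop the surrounding quotes.
        grouped[predicate].append(obj.strip('"'))
    return dict(grouped)

sample = [
    '- W4300485503 creator A5016179282 @default.',
    '- W4300485503 creator A5039766174 @default.',
    '- W4300485503 hasPublicationYear "2021" @default.',
    '- W4300485503 type Work @default.',
]
result = group_by_predicate(sample)
print(result["creator"])
```

Grouping this way makes multi-valued predicates such as `creator`, `hasConcept`, and `hasRelatedWork` readable at a glance, while single-valued ones such as `title` or `doi` come out as one-element lists.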