Matches in SemOpenAlex for { <https://semopenalex.org/work/W4286488082> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W4286488082 endingPage "18" @default.
- W4286488082 startingPage "1" @default.
- W4286488082 abstract "Video captioning, which bridges vision and language, is a fundamental yet challenging task in computer vision. To generate accurate and comprehensive sentences, both visual and semantic information is quite important. However, most existing methods simply concatenate different types of features and ignore the interactions between them. In addition, there is a large semantic gap between visual feature space and semantic embedding space, making the task very challenging. To address these issues, we propose a framework named semantic embedding guided attention with Explicit visual Feature Fusion for vidEo CapTioning, EFFECT for short, in which we design an explicit visual-feature fusion (EVF) scheme to capture the pairwise interactions between multiple visual modalities and fuse multimodal visual features of videos in an explicit way. Furthermore, we propose a novel attention mechanism called semantic embedding guided attention (SEGA), which cooperates with the temporal attention to generate a joint attention map. Specifically, in SEGA, the semantic word embedding information is leveraged to guide the model to pay more attention to the most correlated visual features at each decoding stage. In this way, the semantic gap between visual and semantic space is alleviated to some extent. To evaluate the proposed model, we conduct extensive experiments on two widely used datasets, i.e., MSVD and MSR-VTT. The experimental results demonstrate that our approach achieves state-of-the-art results in terms of four evaluation metrics." @default.
- W4286488082 created "2022-07-22" @default.
- W4286488082 creator A5004858240 @default.
- W4286488082 creator A5049500421 @default.
- W4286488082 creator A5052168662 @default.
- W4286488082 creator A5066725930 @default.
- W4286488082 creator A5086235570 @default.
- W4286488082 date "2023-02-06" @default.
- W4286488082 modified "2023-09-28" @default.
- W4286488082 title "Semantic Embedding Guided Attention with Explicit Visual Feature Fusion for Video Captioning" @default.
- W4286488082 cites W2117539524 @default.
- W4286488082 cites W2133459682 @default.
- W4286488082 cites W2607326921 @default.
- W4286488082 cites W2739107216 @default.
- W4286488082 cites W2808203533 @default.
- W4286488082 cites W2945223572 @default.
- W4286488082 cites W2996817764 @default.
- W4286488082 cites W3162529291 @default.
- W4286488082 doi "https://doi.org/10.1145/3550276" @default.
- W4286488082 hasPublicationYear "2023" @default.
- W4286488082 type Work @default.
- W4286488082 citedByCount "1" @default.
- W4286488082 countsByYear W42864880822023 @default.
- W4286488082 crossrefType "journal-article" @default.
- W4286488082 hasAuthorship W4286488082A5004858240 @default.
- W4286488082 hasAuthorship W4286488082A5049500421 @default.
- W4286488082 hasAuthorship W4286488082A5052168662 @default.
- W4286488082 hasAuthorship W4286488082A5066725930 @default.
- W4286488082 hasAuthorship W4286488082A5086235570 @default.
- W4286488082 hasConcept C115961682 @default.
- W4286488082 hasConcept C138885662 @default.
- W4286488082 hasConcept C154945302 @default.
- W4286488082 hasConcept C157657479 @default.
- W4286488082 hasConcept C162324750 @default.
- W4286488082 hasConcept C1667742 @default.
- W4286488082 hasConcept C184898388 @default.
- W4286488082 hasConcept C187736073 @default.
- W4286488082 hasConcept C204321447 @default.
- W4286488082 hasConcept C2776401178 @default.
- W4286488082 hasConcept C2780451532 @default.
- W4286488082 hasConcept C36464697 @default.
- W4286488082 hasConcept C41008148 @default.
- W4286488082 hasConcept C41608201 @default.
- W4286488082 hasConcept C41895202 @default.
- W4286488082 hasConcept C86034646 @default.
- W4286488082 hasConceptScore W4286488082C115961682 @default.
- W4286488082 hasConceptScore W4286488082C138885662 @default.
- W4286488082 hasConceptScore W4286488082C154945302 @default.
- W4286488082 hasConceptScore W4286488082C157657479 @default.
- W4286488082 hasConceptScore W4286488082C162324750 @default.
- W4286488082 hasConceptScore W4286488082C1667742 @default.
- W4286488082 hasConceptScore W4286488082C184898388 @default.
- W4286488082 hasConceptScore W4286488082C187736073 @default.
- W4286488082 hasConceptScore W4286488082C204321447 @default.
- W4286488082 hasConceptScore W4286488082C2776401178 @default.
- W4286488082 hasConceptScore W4286488082C2780451532 @default.
- W4286488082 hasConceptScore W4286488082C36464697 @default.
- W4286488082 hasConceptScore W4286488082C41008148 @default.
- W4286488082 hasConceptScore W4286488082C41608201 @default.
- W4286488082 hasConceptScore W4286488082C41895202 @default.
- W4286488082 hasConceptScore W4286488082C86034646 @default.
- W4286488082 hasFunder F4320321001 @default.
- W4286488082 hasFunder F4320324174 @default.
- W4286488082 hasIssue "2" @default.
- W4286488082 hasLocation W42864880821 @default.
- W4286488082 hasOpenAccess W4286488082 @default.
- W4286488082 hasPrimaryLocation W42864880821 @default.
- W4286488082 hasRelatedWork W2081647779 @default.
- W4286488082 hasRelatedWork W2176040302 @default.
- W4286488082 hasRelatedWork W2735824434 @default.
- W4286488082 hasRelatedWork W2753526458 @default.
- W4286488082 hasRelatedWork W2963026686 @default.
- W4286488082 hasRelatedWork W2963898017 @default.
- W4286488082 hasRelatedWork W3107474891 @default.
- W4286488082 hasRelatedWork W3185852197 @default.
- W4286488082 hasRelatedWork W4200486724 @default.
- W4286488082 hasRelatedWork W4285224442 @default.
- W4286488082 hasVolume "19" @default.
- W4286488082 isParatext "false" @default.
- W4286488082 isRetracted "false" @default.
- W4286488082 workType "article" @default.