Matches in SemOpenAlex for { <https://semopenalex.org/work/W3103702088> ?p ?o ?g. }
- W3103702088 abstract "Neuro-symbolic representations have proved effective in learning structure information in vision and language. In this paper, we propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning. Our approach uses a dictionary learning-based method of learning relations between videos and their paired text descriptions. We refer to these relations as relative roles and leverage them to make each token role-aware using attention. This results in a more structured and interpretable architecture that incorporates modality-specific inductive biases for the captioning task. Intuitively, the model is able to learn spatial, temporal, and cross-modal relations in a given pair of video and text. The disentanglement achieved by our proposal gives the model more capacity to capture multi-modal structures, which result in higher-quality captions for videos. Our experiments on two established video captioning datasets verify the effectiveness of the proposed approach based on automatic metrics. We further conduct a human evaluation to measure the grounding and relevance of the generated captions and observe consistent improvement for the proposed model. The code and trained models can be found at this https URL" @default.
- W3103702088 created "2020-11-23" @default.
- W3103702088 creator A5001484153 @default.
- W3103702088 creator A5008582408 @default.
- W3103702088 creator A5027052332 @default.
- W3103702088 creator A5030468199 @default.
- W3103702088 creator A5030624782 @default.
- W3103702088 creator A5033846851 @default.
- W3103702088 creator A5036987560 @default.
- W3103702088 creator A5047233371 @default.
- W3103702088 creator A5088217655 @default.
- W3103702088 date "2020-11-18" @default.
- W3103702088 modified "2023-09-23" @default.
- W3103702088 title "Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language." @default.
- W3103702088 cites W1514535095 @default.
- W3103702088 cites W1522301498 @default.
- W3103702088 cites W1522734439 @default.
- W3103702088 cites W1573040851 @default.
- W3103702088 cites W1586939924 @default.
- W3103702088 cites W1601567445 @default.
- W3103702088 cites W1895577753 @default.
- W3103702088 cites W1927052826 @default.
- W3103702088 cites W1931639407 @default.
- W3103702088 cites W1956340063 @default.
- W3103702088 cites W1996430422 @default.
- W3103702088 cites W2013494846 @default.
- W3103702088 cites W2064675550 @default.
- W3103702088 cites W2101105183 @default.
- W3103702088 cites W2108325777 @default.
- W3103702088 cites W2110933980 @default.
- W3103702088 cites W2121879602 @default.
- W3103702088 cites W2123301721 @default.
- W3103702088 cites W2139501017 @default.
- W3103702088 cites W2142900973 @default.
- W3103702088 cites W2167293745 @default.
- W3103702088 cites W2242818861 @default.
- W3103702088 cites W2277195237 @default.
- W3103702088 cites W2506483933 @default.
- W3103702088 cites W2547875792 @default.
- W3103702088 cites W2619947201 @default.
- W3103702088 cites W2745461083 @default.
- W3103702088 cites W2766375149 @default.
- W3103702088 cites W2784025607 @default.
- W3103702088 cites W2789541106 @default.
- W3103702088 cites W2793170671 @default.
- W3103702088 cites W2899420756 @default.
- W3103702088 cites W2905145027 @default.
- W3103702088 cites W2951098185 @default.
- W3103702088 cites W2951390634 @default.
- W3103702088 cites W2962907269 @default.
- W3103702088 cites W2962990649 @default.
- W3103702088 cites W2963013168 @default.
- W3103702088 cites W2963101956 @default.
- W3103702088 cites W2963177403 @default.
- W3103702088 cites W2963329541 @default.
- W3103702088 cites W2963341956 @default.
- W3103702088 cites W2963351113 @default.
- W3103702088 cites W2963403868 @default.
- W3103702088 cites W2963576560 @default.
- W3103702088 cites W2963799213 @default.
- W3103702088 cites W2963916161 @default.
- W3103702088 cites W2963971014 @default.
- W3103702088 cites W2964241990 @default.
- W3103702088 cites W2964308564 @default.
- W3103702088 cites W2970231061 @default.
- W3103702088 cites W2973978812 @default.
- W3103702088 cites W2975357369 @default.
- W3103702088 cites W2980282514 @default.
- W3103702088 cites W2981037730 @default.
- W3103702088 cites W2981851019 @default.
- W3103702088 cites W2982111970 @default.
- W3103702088 cites W2988753485 @default.
- W3103702088 cites W2990503944 @default.
- W3103702088 cites W2996383576 @default.
- W3103702088 cites W2997591391 @default.
- W3103702088 cites W3006320872 @default.
- W3103702088 cites W3009192917 @default.
- W3103702088 cites W3020712669 @default.
- W3103702088 cites W3025136821 @default.
- W3103702088 cites W3030163527 @default.
- W3103702088 cites W3035265375 @default.
- W3103702088 cites W3035576622 @default.
- W3103702088 cites W3043447363 @default.
- W3103702088 cites W3082274269 @default.
- W3103702088 cites W3090449556 @default.
- W3103702088 cites W3091588028 @default.
- W3103702088 cites W3102685975 @default.
- W3103702088 cites W3104915307 @default.
- W3103702088 cites W2914587137 @default.
- W3103702088 hasPublicationYear "2020" @default.
- W3103702088 type Work @default.
- W3103702088 sameAs 3103702088 @default.
- W3103702088 citedByCount "0" @default.
- W3103702088 crossrefType "posted-content" @default.
- W3103702088 hasAuthorship W3103702088A5001484153 @default.
- W3103702088 hasAuthorship W3103702088A5008582408 @default.
- W3103702088 hasAuthorship W3103702088A5027052332 @default.
- W3103702088 hasAuthorship W3103702088A5030468199 @default.
- W3103702088 hasAuthorship W3103702088A5030624782 @default.
- W3103702088 hasAuthorship W3103702088A5033846851 @default.