Matches in SemOpenAlex for { <https://semopenalex.org/work/W4304098327> ?p ?o ?g. }
Showing items 1 to 86 of 86, with 100 items per page.
- W4304098327 abstract "Understanding human behaviors and intents from videos is a challenging task. Video flows usually involve time-series data from different modalities, such as natural language, facial gestures, and acoustic information. Due to the variable receiving frequency for sequences from each modality, the collected multimodal streams are usually unaligned. For multimodal fusion of asynchronous sequences, the existing methods focus on projecting multiple modalities into a common latent space and learning the hybrid representations, which neglects the diversity of each modality and the commonality across different modalities. Motivated by this observation, we propose a Multimodal Fusion approach for learning modality-Specific and modality-Agnostic representations (MFSA) to refine multimodal representations and leverage the complementarity across different modalities. Specifically, a predictive self-attention module is used to capture reliable contextual dependencies and enhance the unique features over the modality-specific spaces. Meanwhile, we propose a hierarchical cross-modal attention module to explore the correlations between cross-modal elements over the modality-agnostic space. In this case, a double-discriminator strategy is presented to ensure the production of distinct representations in an adversarial manner. Eventually, the modality-specific and -agnostic multimodal representations are used together for downstream tasks. Comprehensive experiments on three multimodal datasets clearly demonstrate the superiority of our approach." @default.
- W4304098327 created "2022-10-10" @default.
- W4304098327 creator A5060264631 @default.
- W4304098327 creator A5064280715 @default.
- W4304098327 creator A5072909549 @default.
- W4304098327 creator A5074948475 @default.
- W4304098327 date "2022-10-10" @default.
- W4304098327 modified "2023-09-25" @default.
- W4304098327 title "Learning Modality-Specific and -Agnostic Representations for Asynchronous Multimodal Language Sequences" @default.
- W4304098327 cites W2105020570 @default.
- W4304098327 cites W2127141656 @default.
- W4304098327 cites W2250539671 @default.
- W4304098327 cites W2470957930 @default.
- W4304098327 cites W2556418146 @default.
- W4304098327 cites W2753064086 @default.
- W4304098327 cites W2777446440 @default.
- W4304098327 cites W2883409523 @default.
- W4304098327 cites W2883430806 @default.
- W4304098327 cites W2962931510 @default.
- W4304098327 cites W2964216663 @default.
- W4304098327 cites W3001529617 @default.
- W4304098327 cites W3012948425 @default.
- W4304098327 cites W3169801598 @default.
- W4304098327 cites W3173549566 @default.
- W4304098327 cites W3205519684 @default.
- W4304098327 cites W3207409354 @default.
- W4304098327 cites W3209114384 @default.
- W4304098327 cites W4221002311 @default.
- W4304098327 cites W4221163935 @default.
- W4304098327 cites W4221166007 @default.
- W4304098327 cites W4225274086 @default.
- W4304098327 cites W4283797804 @default.
- W4304098327 doi "https://doi.org/10.1145/3503161.3547755" @default.
- W4304098327 hasPublicationYear "2022" @default.
- W4304098327 type Work @default.
- W4304098327 citedByCount "7" @default.
- W4304098327 countsByYear W43040983272023 @default.
- W4304098327 crossrefType "proceedings-article" @default.
- W4304098327 hasAuthorship W4304098327A5060264631 @default.
- W4304098327 hasAuthorship W4304098327A5064280715 @default.
- W4304098327 hasAuthorship W4304098327A5072909549 @default.
- W4304098327 hasAuthorship W4304098327A5074948475 @default.
- W4304098327 hasConcept C119857082 @default.
- W4304098327 hasConcept C144024400 @default.
- W4304098327 hasConcept C153083717 @default.
- W4304098327 hasConcept C154945302 @default.
- W4304098327 hasConcept C202269582 @default.
- W4304098327 hasConcept C204321447 @default.
- W4304098327 hasConcept C207347870 @default.
- W4304098327 hasConcept C2779903281 @default.
- W4304098327 hasConcept C2780226545 @default.
- W4304098327 hasConcept C2780660688 @default.
- W4304098327 hasConcept C36289849 @default.
- W4304098327 hasConcept C41008148 @default.
- W4304098327 hasConcept C54355233 @default.
- W4304098327 hasConcept C86803240 @default.
- W4304098327 hasConceptScore W4304098327C119857082 @default.
- W4304098327 hasConceptScore W4304098327C144024400 @default.
- W4304098327 hasConceptScore W4304098327C153083717 @default.
- W4304098327 hasConceptScore W4304098327C154945302 @default.
- W4304098327 hasConceptScore W4304098327C202269582 @default.
- W4304098327 hasConceptScore W4304098327C204321447 @default.
- W4304098327 hasConceptScore W4304098327C207347870 @default.
- W4304098327 hasConceptScore W4304098327C2779903281 @default.
- W4304098327 hasConceptScore W4304098327C2780226545 @default.
- W4304098327 hasConceptScore W4304098327C2780660688 @default.
- W4304098327 hasConceptScore W4304098327C36289849 @default.
- W4304098327 hasConceptScore W4304098327C41008148 @default.
- W4304098327 hasConceptScore W4304098327C54355233 @default.
- W4304098327 hasConceptScore W4304098327C86803240 @default.
- W4304098327 hasLocation W43040983271 @default.
- W4304098327 hasOpenAccess W4304098327 @default.
- W4304098327 hasPrimaryLocation W43040983271 @default.
- W4304098327 hasRelatedWork W1883289659 @default.
- W4304098327 hasRelatedWork W2055532432 @default.
- W4304098327 hasRelatedWork W2608527058 @default.
- W4304098327 hasRelatedWork W2760367275 @default.
- W4304098327 hasRelatedWork W2904518532 @default.
- W4304098327 hasRelatedWork W2962931510 @default.
- W4304098327 hasRelatedWork W3200817606 @default.
- W4304098327 hasRelatedWork W4380551887 @default.
- W4304098327 hasRelatedWork W4383993410 @default.
- W4304098327 hasRelatedWork W201765088 @default.
- W4304098327 isParatext "false" @default.
- W4304098327 isRetracted "false" @default.
- W4304098327 workType "article" @default.
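The triple pattern at the top of this listing can be reproduced programmatically. The sketch below builds the equivalent SPARQL query for this work and, optionally, sends it to the SemOpenAlex endpoint. The endpoint URL (`https://semopenalex.org/sparql`) and the `format=json` parameter are assumptions about the public service, not confirmed by this listing.

```python
# Minimal sketch: rebuild the { <work> ?p ?o } pattern from this listing
# as a SPARQL query. Endpoint URL and JSON format parameter are assumed.
import json
import urllib.parse
import urllib.request

WORK_IRI = "https://semopenalex.org/work/W4304098327"
ENDPOINT = "https://semopenalex.org/sparql"  # assumed public endpoint


def build_query(work_iri: str) -> str:
    """Return a SPARQL query listing every predicate/object pair for the work."""
    return f"SELECT ?p ?o WHERE {{ <{work_iri}> ?p ?o . }}"


def fetch_triples(work_iri: str, endpoint: str = ENDPOINT) -> list:
    """Send the query and return the result bindings (requires network access)."""
    url = endpoint + "?" + urllib.parse.urlencode(
        {"query": build_query(work_iri), "format": "json"}
    )
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["results"]["bindings"]


if __name__ == "__main__":
    # Print the query only; fetching the 86 triples above needs network access.
    print(build_query(WORK_IRI))
```

Running `fetch_triples(WORK_IRI)` against a live endpoint should return the same 86 predicate/object pairs shown above (abstract, creators, citations, concepts, and so on).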