Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385965619> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W4385965619 abstract "The integration of different modalities, such as audio and visual information, plays a crucial role in human perception of the surrounding environment. Recent research has made significant progress in designing fusion modules for audio-visual speech separation. However, they predominantly focus on multi-modal fusion architectures situated either at the top or bottom positions, rather than comprehensively considering multi-modal fusion at various hierarchical positions within the network. In this paper, we propose a novel model called self- and cross-attention network (SCANet), which leverages the attention mechanism for efficient audio-visual feature fusion. SCANet consists of two types of attention blocks: self-attention (SA) and cross-attention (CA) blocks, where the CA blocks are distributed at the top (TCA), middle (MCA) and bottom (BCA) of SCANet. These blocks maintain the ability to learn modality-specific features and enable the extraction of different semantics from audio-visual features. Comprehensive experiments on three standard audio-visual separation benchmarks (LRS2, LRS3, and VoxCeleb2) demonstrate the effectiveness of SCANet, outperforming existing state-of-the-art (SOTA) methods while maintaining comparable inference time." @default.
- W4385965619 created "2023-08-18" @default.
- W4385965619 creator A5004579631 @default.
- W4385965619 creator A5006425914 @default.
- W4385965619 creator A5051363890 @default.
- W4385965619 date "2023-08-16" @default.
- W4385965619 modified "2023-09-27" @default.
- W4385965619 title "SCANet: A Self- and Cross-Attention Network for Audio-Visual Speech Separation" @default.
- W4385965619 doi "https://doi.org/10.48550/arxiv.2308.08143" @default.
- W4385965619 hasPublicationYear "2023" @default.
- W4385965619 type Work @default.
- W4385965619 citedByCount "0" @default.
- W4385965619 crossrefType "posted-content" @default.
- W4385965619 hasAuthorship W4385965619A5004579631 @default.
- W4385965619 hasAuthorship W4385965619A5006425914 @default.
- W4385965619 hasAuthorship W4385965619A5051363890 @default.
- W4385965619 hasBestOaLocation W43859656191 @default.
- W4385965619 hasConcept C120665830 @default.
- W4385965619 hasConcept C121332964 @default.
- W4385965619 hasConcept C138885662 @default.
- W4385965619 hasConcept C144024400 @default.
- W4385965619 hasConcept C154945302 @default.
- W4385965619 hasConcept C169760540 @default.
- W4385965619 hasConcept C184337299 @default.
- W4385965619 hasConcept C185592680 @default.
- W4385965619 hasConcept C188027245 @default.
- W4385965619 hasConcept C192209626 @default.
- W4385965619 hasConcept C199360897 @default.
- W4385965619 hasConcept C26760741 @default.
- W4385965619 hasConcept C2776214188 @default.
- W4385965619 hasConcept C2776401178 @default.
- W4385965619 hasConcept C2779903281 @default.
- W4385965619 hasConcept C2780226545 @default.
- W4385965619 hasConcept C28490314 @default.
- W4385965619 hasConcept C3017588708 @default.
- W4385965619 hasConcept C36289849 @default.
- W4385965619 hasConcept C36464697 @default.
- W4385965619 hasConcept C41008148 @default.
- W4385965619 hasConcept C41895202 @default.
- W4385965619 hasConcept C49774154 @default.
- W4385965619 hasConcept C71139939 @default.
- W4385965619 hasConcept C86803240 @default.
- W4385965619 hasConceptScore W4385965619C120665830 @default.
- W4385965619 hasConceptScore W4385965619C121332964 @default.
- W4385965619 hasConceptScore W4385965619C138885662 @default.
- W4385965619 hasConceptScore W4385965619C144024400 @default.
- W4385965619 hasConceptScore W4385965619C154945302 @default.
- W4385965619 hasConceptScore W4385965619C169760540 @default.
- W4385965619 hasConceptScore W4385965619C184337299 @default.
- W4385965619 hasConceptScore W4385965619C185592680 @default.
- W4385965619 hasConceptScore W4385965619C188027245 @default.
- W4385965619 hasConceptScore W4385965619C192209626 @default.
- W4385965619 hasConceptScore W4385965619C199360897 @default.
- W4385965619 hasConceptScore W4385965619C26760741 @default.
- W4385965619 hasConceptScore W4385965619C2776214188 @default.
- W4385965619 hasConceptScore W4385965619C2776401178 @default.
- W4385965619 hasConceptScore W4385965619C2779903281 @default.
- W4385965619 hasConceptScore W4385965619C2780226545 @default.
- W4385965619 hasConceptScore W4385965619C28490314 @default.
- W4385965619 hasConceptScore W4385965619C3017588708 @default.
- W4385965619 hasConceptScore W4385965619C36289849 @default.
- W4385965619 hasConceptScore W4385965619C36464697 @default.
- W4385965619 hasConceptScore W4385965619C41008148 @default.
- W4385965619 hasConceptScore W4385965619C41895202 @default.
- W4385965619 hasConceptScore W4385965619C49774154 @default.
- W4385965619 hasConceptScore W4385965619C71139939 @default.
- W4385965619 hasConceptScore W4385965619C86803240 @default.
- W4385965619 hasLocation W43859656191 @default.
- W4385965619 hasOpenAccess W4385965619 @default.
- W4385965619 hasPrimaryLocation W43859656191 @default.
- W4385965619 hasRelatedWork W2474574787 @default.
- W4385965619 hasRelatedWork W2949074159 @default.
- W4385965619 hasRelatedWork W2952745240 @default.
- W4385965619 hasRelatedWork W3142456083 @default.
- W4385965619 hasRelatedWork W3198037411 @default.
- W4385965619 hasRelatedWork W4298715519 @default.
- W4385965619 hasRelatedWork W4301143707 @default.
- W4385965619 hasRelatedWork W4306353150 @default.
- W4385965619 hasRelatedWork W4386554981 @default.
- W4385965619 hasRelatedWork W4386721968 @default.
- W4385965619 isParatext "false" @default.
- W4385965619 isRetracted "false" @default.
- W4385965619 workType "article" @default.