Matches in SemOpenAlex for { <https://semopenalex.org/work/W4312367758> ?p ?o ?g. }
- W4312367758 endingPage "562" @default.
- W4312367758 startingPage "550" @default.
- W4312367758 abstract "Audio-visual signals can be used jointly for robotic perception as they complement each other. Such multi-modal sensory fusion has a clear advantage, especially under noisy acoustic conditions. Speaker localization, as an essential robotic function, was traditionally solved as a signal processing problem that now increasingly finds deep learning solutions. The question is how to fuse audio-visual signals in an effective way. Speaker tracking is not only more desirable, but also potentially more accurate than speaker localization because it explores the speaker's temporal motion dynamics for smoothed trajectory estimation. However, due to the lack of large annotated datasets, speaker tracking is not as well studied as speaker localization. In this paper, we study robotic speaker Direction of Arrival (DoA) estimation with a focus on audio-visual fusion and tracking methodology. We propose a Cross-Modal Attentive Fusion (CMAF) mechanism, which uses self-attention to learn intra-modal temporal dependencies and cross-attention for inter-modal alignment. We also collect a realistic dataset on a robotic platform to support the study. The experimental results demonstrate that our proposed network outperforms the state-of-the-art audio-visual localization and tracking methods under noisy conditions, with an improved accuracy of 5.82% and 3.62% at SNR = −20 dB, respectively." @default.
- W4312367758 created "2023-01-04" @default.
- W4312367758 creator A5006476935 @default.
- W4312367758 creator A5026852887 @default.
- W4312367758 creator A5032690182 @default.
- W4312367758 creator A5056495776 @default.
- W4312367758 creator A5077375723 @default.
- W4312367758 date "2023-01-01" @default.
- W4312367758 modified "2023-10-18" @default.
- W4312367758 title "Audio-Visual Cross-Attention Network for Robotic Speaker Tracking" @default.
- W4312367758 cites W1518556865 @default.
- W4312367758 cites W1555217905 @default.
- W4312367758 cites W1603075283 @default.
- W4312367758 cites W1790748249 @default.
- W4312367758 cites W1908118577 @default.
- W4312367758 cites W1994630425 @default.
- W4312367758 cites W2046317813 @default.
- W4312367758 cites W2088725063 @default.
- W4312367758 cites W2129866629 @default.
- W4312367758 cites W2167206042 @default.
- W4312367758 cites W2194775991 @default.
- W4312367758 cites W2231821870 @default.
- W4312367758 cites W2517955251 @default.
- W4312367758 cites W2543696449 @default.
- W4312367758 cites W2586642235 @default.
- W4312367758 cites W2618530766 @default.
- W4312367758 cites W2734774145 @default.
- W4312367758 cites W2772736377 @default.
- W4312367758 cites W2807015669 @default.
- W4312367758 cites W2885219692 @default.
- W4312367758 cites W2918984654 @default.
- W4312367758 cites W2937324313 @default.
- W4312367758 cites W2937742986 @default.
- W4312367758 cites W2940285530 @default.
- W4312367758 cites W2942551338 @default.
- W4312367758 cites W2962708126 @default.
- W4312367758 cites W2962960500 @default.
- W4312367758 cites W2963218389 @default.
- W4312367758 cites W2964342924 @default.
- W4312367758 cites W2969987364 @default.
- W4312367758 cites W2970201661 @default.
- W4312367758 cites W2981393651 @default.
- W4312367758 cites W2981905048 @default.
- W4312367758 cites W2989954484 @default.
- W4312367758 cites W2990113535 @default.
- W4312367758 cites W3011514609 @default.
- W4312367758 cites W3016158375 @default.
- W4312367758 cites W3019002735 @default.
- W4312367758 cites W3089887959 @default.
- W4312367758 cites W3093835562 @default.
- W4312367758 cites W3102937397 @default.
- W4312367758 cites W3116298410 @default.
- W4312367758 cites W3119066640 @default.
- W4312367758 cites W3119918773 @default.
- W4312367758 cites W3132182240 @default.
- W4312367758 cites W3162475350 @default.
- W4312367758 cites W3163287738 @default.
- W4312367758 cites W3188274837 @default.
- W4312367758 cites W3200324659 @default.
- W4312367758 cites W3205596865 @default.
- W4312367758 cites W4223646224 @default.
- W4312367758 cites W4251733995 @default.
- W4312367758 cites W4285819380 @default.
- W4312367758 doi "https://doi.org/10.1109/taslp.2022.3226330" @default.
- W4312367758 hasPublicationYear "2023" @default.
- W4312367758 type Work @default.
- W4312367758 citedByCount "1" @default.
- W4312367758 countsByYear W43123677582023 @default.
- W4312367758 crossrefType "journal-article" @default.
- W4312367758 hasAuthorship W4312367758A5006476935 @default.
- W4312367758 hasAuthorship W4312367758A5026852887 @default.
- W4312367758 hasAuthorship W4312367758A5032690182 @default.
- W4312367758 hasAuthorship W4312367758A5056495776 @default.
- W4312367758 hasAuthorship W4312367758A5077375723 @default.
- W4312367758 hasBestOaLocation W43123677581 @default.
- W4312367758 hasConcept C114793014 @default.
- W4312367758 hasConcept C119599485 @default.
- W4312367758 hasConcept C120665830 @default.
- W4312367758 hasConcept C121332964 @default.
- W4312367758 hasConcept C127313418 @default.
- W4312367758 hasConcept C127413603 @default.
- W4312367758 hasConcept C1276947 @default.
- W4312367758 hasConcept C13662910 @default.
- W4312367758 hasConcept C13895895 @default.
- W4312367758 hasConcept C141353440 @default.
- W4312367758 hasConcept C154945302 @default.
- W4312367758 hasConcept C15744967 @default.
- W4312367758 hasConcept C185592680 @default.
- W4312367758 hasConcept C188027245 @default.
- W4312367758 hasConcept C192209626 @default.
- W4312367758 hasConcept C19417346 @default.
- W4312367758 hasConcept C203718221 @default.
- W4312367758 hasConcept C2775936607 @default.
- W4312367758 hasConcept C28490314 @default.
- W4312367758 hasConcept C3017588708 @default.
- W4312367758 hasConcept C31972630 @default.
- W4312367758 hasConcept C41008148 @default.
- W4312367758 hasConcept C49774154 @default.