Matches in SemOpenAlex for { <https://semopenalex.org/work/W3186351311> ?p ?o ?g. }
- W3186351311 endingPage "2629" @default.
- W3186351311 startingPage "2617" @default.
- W3186351311 abstract "Multimodal emotion recognition is a challenging task in emotion computing, as it is difficult to extract discriminative features that capture the subtle differences among human emotions, which involve abstract concepts and multiple expressions. Moreover, how to fully utilize both audio and visual information is still an open problem. In this paper, we propose a novel multimodal fusion attention network for audio-visual emotion recognition based on adaptive and multi-level factorized bilinear pooling (FBP). First, for the audio stream, a fully convolutional network (FCN) equipped with a 1-D attention mechanism and local response normalization is designed for speech emotion recognition. Next, a global FBP (G-FBP) approach is presented to perform audio-visual information fusion by integrating a self-attention-based video stream with the proposed audio stream. To improve G-FBP, an adaptive strategy (AG-FBP) is devised to dynamically calculate the fusion weights of the two modalities based on the emotion-related representation vectors from the attention mechanisms of the respective modalities. Finally, to fully utilize the local emotion information, adaptive and multi-level FBP (AM-FBP) is introduced by combining both global-trunk and intra-trunk data in one recording on top of AG-FBP. Tested on the IEMOCAP corpus for speech emotion recognition with only the audio stream, the new FCN method outperforms the state-of-the-art results with an accuracy of 71.40%. Moreover, validated on the AFEW database of the EmotiW2019 sub-challenge and the IEMOCAP corpus for audio-visual emotion recognition, the proposed AM-FBP approach achieves the best accuracies of 63.09% and 75.49%, respectively, on the test set." @default.
- W3186351311 created "2021-08-02" @default.
- W3186351311 creator A5002188808 @default.
- W3186351311 creator A5019935534 @default.
- W3186351311 creator A5066595711 @default.
- W3186351311 creator A5066868860 @default.
- W3186351311 creator A5072315367 @default.
- W3186351311 creator A5083942931 @default.
- W3186351311 date "2021-01-01" @default.
- W3186351311 modified "2023-10-16" @default.
- W3186351311 title "Information Fusion in Attention Networks Using Adaptive and Multi-Level Factorized Bilinear Pooling for Audio-Visual Emotion Recognition" @default.
- W3186351311 cites W1522734439 @default.
- W3186351311 cites W1586176709 @default.
- W3186351311 cites W1981070999 @default.
- W3186351311 cites W1981918162 @default.
- W3186351311 cites W2005445667 @default.
- W3186351311 cites W2026243162 @default.
- W3186351311 cites W2058127449 @default.
- W3186351311 cites W2071249869 @default.
- W3186351311 cites W2071954570 @default.
- W3186351311 cites W2074657431 @default.
- W3186351311 cites W2080289724 @default.
- W3186351311 cites W2104657103 @default.
- W3186351311 cites W2106390385 @default.
- W3186351311 cites W2112555538 @default.
- W3186351311 cites W2123260696 @default.
- W3186351311 cites W2126552487 @default.
- W3186351311 cites W2136155248 @default.
- W3186351311 cites W2146334809 @default.
- W3186351311 cites W2163026698 @default.
- W3186351311 cites W2168692779 @default.
- W3186351311 cites W2214134199 @default.
- W3186351311 cites W2300860047 @default.
- W3186351311 cites W2546875627 @default.
- W3186351311 cites W2551829638 @default.
- W3186351311 cites W2604290265 @default.
- W3186351311 cites W2625297138 @default.
- W3186351311 cites W2648194195 @default.
- W3186351311 cites W2703895418 @default.
- W3186351311 cites W2747664154 @default.
- W3186351311 cites W2767618761 @default.
- W3186351311 cites W2806432949 @default.
- W3186351311 cites W2884739346 @default.
- W3186351311 cites W2885005742 @default.
- W3186351311 cites W2888683367 @default.
- W3186351311 cites W2889065492 @default.
- W3186351311 cites W2889374687 @default.
- W3186351311 cites W2889462515 @default.
- W3186351311 cites W2889717020 @default.
- W3186351311 cites W2890915920 @default.
- W3186351311 cites W2891182955 @default.
- W3186351311 cites W2894458059 @default.
- W3186351311 cites W2894805842 @default.
- W3186351311 cites W2894871570 @default.
- W3186351311 cites W2895006884 @default.
- W3186351311 cites W2937624600 @default.
- W3186351311 cites W2937977583 @default.
- W3186351311 cites W2956824775 @default.
- W3186351311 cites W2963066927 @default.
- W3186351311 cites W2963150162 @default.
- W3186351311 cites W2970710980 @default.
- W3186351311 cites W2970868904 @default.
- W3186351311 cites W2972463723 @default.
- W3186351311 cites W2972495317 @default.
- W3186351311 cites W2972640480 @default.
- W3186351311 cites W2972688505 @default.
- W3186351311 cites W2973037561 @default.
- W3186351311 cites W2977809668 @default.
- W3186351311 cites W2980393359 @default.
- W3186351311 cites W2980730551 @default.
- W3186351311 cites W2981072818 @default.
- W3186351311 cites W2981101532 @default.
- W3186351311 cites W2981171802 @default.
- W3186351311 cites W2982395984 @default.
- W3186351311 cites W3097488938 @default.
- W3186351311 cites W3097969370 @default.
- W3186351311 cites W3104696513 @default.
- W3186351311 cites W3156237571 @default.
- W3186351311 doi "https://doi.org/10.1109/taslp.2021.3096037" @default.
- W3186351311 hasPublicationYear "2021" @default.
- W3186351311 type Work @default.
- W3186351311 sameAs 3186351311 @default.
- W3186351311 citedByCount "15" @default.
- W3186351311 countsByYear W31863513112021 @default.
- W3186351311 countsByYear W31863513112022 @default.
- W3186351311 countsByYear W31863513112023 @default.
- W3186351311 crossrefType "journal-article" @default.
- W3186351311 hasAuthorship W3186351311A5002188808 @default.
- W3186351311 hasAuthorship W3186351311A5019935534 @default.
- W3186351311 hasAuthorship W3186351311A5066595711 @default.
- W3186351311 hasAuthorship W3186351311A5066868860 @default.
- W3186351311 hasAuthorship W3186351311A5072315367 @default.
- W3186351311 hasAuthorship W3186351311A5083942931 @default.
- W3186351311 hasBestOaLocation W31863513112 @default.
- W3186351311 hasConcept C153180895 @default.
- W3186351311 hasConcept C154945302 @default.
- W3186351311 hasConcept C205203396 @default.
- W3186351311 hasConcept C28490314 @default.