Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313887688> ?p ?o ?g. }
- W4313887688 endingPage "788" @default.
- W4313887688 startingPage "775" @default.
- W4313887688 abstract "Paralinguistic speech processing is important in addressing many issues, such as sentiment and neurocognitive disorder analyses. Recently, Transformer has achieved remarkable success in the natural language processing field and has demonstrated its adaptation to speech. However, previous works on Transformer in the speech field have not incorporated the properties of speech, leaving the full potential of Transformer unexplored. In this paper, we consider the characteristics of speech and propose a general structure-based framework, called SpeechFormer++, for paralinguistic speech processing. More concretely, following the component relationship in the speech signal, we design a unit encoder to model the intra- and inter-unit information (i.e., frames, phones, and words) efficiently. According to the hierarchical relationship, we utilize merging blocks to generate features at different granularities, which is consistent with the structural pattern in the speech signal. Moreover, a word encoder is introduced to integrate word-grained features into each unit encoder, which effectively balances fine-grained and coarse-grained information. SpeechFormer++ is evaluated on the speech emotion recognition (IEMOCAP & MELD), depression classification (DAIC-WOZ) and Alzheimer's disease detection (Pitt) tasks. The results show that SpeechFormer++ outperforms the standard Transformer while greatly reducing the computational cost. Furthermore, it delivers superior results compared to the state-of-the-art approaches." @default.
- W4313887688 created "2023-01-10" @default.
- W4313887688 creator A5007354180 @default.
- W4313887688 creator A5036301580 @default.
- W4313887688 creator A5040709029 @default.
- W4313887688 creator A5041597624 @default.
- W4313887688 creator A5047568626 @default.
- W4313887688 date "2023-01-01" @default.
- W4313887688 modified "2023-10-16" @default.
- W4313887688 title "SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing" @default.
- W4313887688 cites W1935012542 @default.
- W4313887688 cites W1964757081 @default.
- W4313887688 cites W1972420736 @default.
- W4313887688 cites W2029996593 @default.
- W4313887688 cites W2074037951 @default.
- W4313887688 cites W2076423492 @default.
- W4313887688 cites W2099767163 @default.
- W4313887688 cites W2114330138 @default.
- W4313887688 cites W2134929491 @default.
- W4313887688 cites W2146334809 @default.
- W4313887688 cites W2157331557 @default.
- W4313887688 cites W2408520939 @default.
- W4313887688 cites W2529925562 @default.
- W4313887688 cites W2751214333 @default.
- W4313887688 cites W2786625961 @default.
- W4313887688 cites W2896480997 @default.
- W4313887688 cites W2905016804 @default.
- W4313887688 cites W2963686995 @default.
- W4313887688 cites W2964266095 @default.
- W4313887688 cites W2964295436 @default.
- W4313887688 cites W2965453734 @default.
- W4313887688 cites W2973049979 @default.
- W4313887688 cites W2976556660 @default.
- W4313887688 cites W2981857663 @default.
- W4313887688 cites W2990825125 @default.
- W4313887688 cites W3014475539 @default.
- W4313887688 cites W3015240477 @default.
- W4313887688 cites W3015554124 @default.
- W4313887688 cites W3016138882 @default.
- W4313887688 cites W3035448883 @default.
- W4313887688 cites W3042631625 @default.
- W4313887688 cites W3095334805 @default.
- W4313887688 cites W3096039514 @default.
- W4313887688 cites W3097777922 @default.
- W4313887688 cites W3101080567 @default.
- W4313887688 cites W3120680448 @default.
- W4313887688 cites W3124411359 @default.
- W4313887688 cites W3138516171 @default.
- W4313887688 cites W3154831185 @default.
- W4313887688 cites W3161659450 @default.
- W4313887688 cites W3162475537 @default.
- W4313887688 cites W3162993161 @default.
- W4313887688 cites W3174794493 @default.
- W4313887688 cites W3182607842 @default.
- W4313887688 cites W3196495667 @default.
- W4313887688 cites W3196735225 @default.
- W4313887688 cites W3197977579 @default.
- W4313887688 cites W3198220993 @default.
- W4313887688 cites W3201447839 @default.
- W4313887688 cites W3208158979 @default.
- W4313887688 cites W3209059054 @default.
- W4313887688 cites W4221154966 @default.
- W4313887688 cites W4221162872 @default.
- W4313887688 cites W4221162874 @default.
- W4313887688 cites W4224917001 @default.
- W4313887688 cites W4224918181 @default.
- W4313887688 cites W4224932149 @default.
- W4313887688 cites W4225748935 @default.
- W4313887688 cites W4226442948 @default.
- W4313887688 cites W4285106979 @default.
- W4313887688 cites W4285194130 @default.
- W4313887688 cites W4296070431 @default.
- W4313887688 cites W4297841379 @default.
- W4313887688 cites W4312530435 @default.
- W4313887688 doi "https://doi.org/10.1109/taslp.2023.3235194" @default.
- W4313887688 hasPublicationYear "2023" @default.
- W4313887688 type Work @default.
- W4313887688 citedByCount "3" @default.
- W4313887688 countsByYear W43138876882023 @default.
- W4313887688 crossrefType "journal-article" @default.
- W4313887688 hasAuthorship W4313887688A5007354180 @default.
- W4313887688 hasAuthorship W4313887688A5036301580 @default.
- W4313887688 hasAuthorship W4313887688A5040709029 @default.
- W4313887688 hasAuthorship W4313887688A5041597624 @default.
- W4313887688 hasAuthorship W4313887688A5047568626 @default.
- W4313887688 hasBestOaLocation W43138876882 @default.
- W4313887688 hasConcept C111919701 @default.
- W4313887688 hasConcept C118505674 @default.
- W4313887688 hasConcept C121332964 @default.
- W4313887688 hasConcept C133378560 @default.
- W4313887688 hasConcept C13895895 @default.
- W4313887688 hasConcept C144024400 @default.
- W4313887688 hasConcept C154945302 @default.
- W4313887688 hasConcept C165801399 @default.
- W4313887688 hasConcept C204201278 @default.
- W4313887688 hasConcept C204321447 @default.
- W4313887688 hasConcept C28490314 @default.
- W4313887688 hasConcept C41008148 @default.