Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226421465> ?p ?o ?g. }
Showing items 1 to 99 of 99 with 100 items per page.
- W4226421465 endingPage "1460" @default.
- W4226421465 startingPage "1448" @default.
- W4226421465 abstract "The cross-speaker emotion transfer task in text-to-speech (TTS) synthesis particularly aims to synthesize speech for a target speaker with the emotion transferred from reference speech recorded by another (source) speaker. During the emotion transfer process, the identity information of the source speaker could also affect the synthesized results, resulting in the issue of speaker leakage, i.e., synthetic speech may have the voice identity of the source speaker rather than the target speaker. This paper proposes a new method with the aim to synthesize controllable emotional expressive speech and meanwhile maintain the target speaker’s identity in the cross-speaker emotion TTS task. The proposed method is a Tacotron2-based framework with emotion embedding as the conditioning variable to provide emotion information. Two emotion disentangling modules are contained in our method to 1) get speaker-irrelevant and emotion-discriminative embedding, and 2) explicitly constrain the emotion and speaker identity of synthetic speech to be that as expected. Moreover, we present an intuitive method to control the emotion strength in the synthetic speech for the target speaker. Specifically, the learned emotion embedding is adjusted with a flexible scalar value, which allows controlling the emotion strength conveyed by the embedding. Extensive experiments have been conducted on a Mandarin disjoint corpus, and the results demonstrate that the proposed method is able to synthesize reasonable emotional speech for the target speaker. Compared to the state-of-the-art reference embedding learned methods, our method gets the best performance on the cross-speaker emotion transfer task, indicating that our method achieves the new state-of-the-art performance on learning the speaker-irrelevant emotion embedding. Furthermore, the strength ranking test and pitch trajectories plots demonstrate that the proposed method can effectively control the emotion strength, leading to prosody-diverse synthetic speech." @default.
- W4226421465 created "2022-05-05" @default.
- W4226421465 creator A5021472684 @default.
- W4226421465 creator A5033976028 @default.
- W4226421465 creator A5049213273 @default.
- W4226421465 creator A5050219087 @default.
- W4226421465 creator A5055592507 @default.
- W4226421465 date "2022-01-01" @default.
- W4226421465 modified "2023-10-13" @default.
- W4226421465 title "Cross-Speaker Emotion Disentangling and Transfer for End-to-End Speech Synthesis" @default.
- W4226421465 cites W2043003570 @default.
- W4226421465 cites W2336823283 @default.
- W4226421465 cites W2398561585 @default.
- W4226421465 cites W2531207078 @default.
- W4226421465 cites W2785364623 @default.
- W4226421465 cites W2802968248 @default.
- W4226421465 cites W2889092828 @default.
- W4226421465 cites W2890287821 @default.
- W4226421465 cites W2904459034 @default.
- W4226421465 cites W2962936105 @default.
- W4226421465 cites W2963609956 @default.
- W4226421465 cites W2963920537 @default.
- W4226421465 cites W2964243274 @default.
- W4226421465 cites W2966387353 @default.
- W4226421465 cites W2968201928 @default.
- W4226421465 cites W3007580377 @default.
- W4226421465 cites W3008691130 @default.
- W4226421465 cites W3010916717 @default.
- W4226421465 cites W3015645837 @default.
- W4226421465 cites W3015841875 @default.
- W4226421465 cites W3015922793 @default.
- W4226421465 cites W3022876224 @default.
- W4226421465 cites W3023996010 @default.
- W4226421465 cites W3024869864 @default.
- W4226421465 cites W3094785744 @default.
- W4226421465 cites W3096457008 @default.
- W4226421465 cites W3135418837 @default.
- W4226421465 cites W3135644023 @default.
- W4226421465 cites W3139170550 @default.
- W4226421465 cites W3162794600 @default.
- W4226421465 cites W3171667842 @default.
- W4226421465 cites W3197704090 @default.
- W4226421465 cites W3197943112 @default.
- W4226421465 cites W3198791321 @default.
- W4226421465 doi "https://doi.org/10.1109/taslp.2022.3164181" @default.
- W4226421465 hasPublicationYear "2022" @default.
- W4226421465 type Work @default.
- W4226421465 citedByCount "5" @default.
- W4226421465 countsByYear W42264214652022 @default.
- W4226421465 countsByYear W42264214652023 @default.
- W4226421465 crossrefType "journal-article" @default.
- W4226421465 hasAuthorship W4226421465A5021472684 @default.
- W4226421465 hasAuthorship W4226421465A5033976028 @default.
- W4226421465 hasAuthorship W4226421465A5049213273 @default.
- W4226421465 hasAuthorship W4226421465A5050219087 @default.
- W4226421465 hasAuthorship W4226421465A5055592507 @default.
- W4226421465 hasBestOaLocation W42264214652 @default.
- W4226421465 hasConcept C121332964 @default.
- W4226421465 hasConcept C133892786 @default.
- W4226421465 hasConcept C149838564 @default.
- W4226421465 hasConcept C14999030 @default.
- W4226421465 hasConcept C154945302 @default.
- W4226421465 hasConcept C24890656 @default.
- W4226421465 hasConcept C2778355321 @default.
- W4226421465 hasConcept C28490314 @default.
- W4226421465 hasConcept C41008148 @default.
- W4226421465 hasConcept C41608201 @default.
- W4226421465 hasConcept C97931131 @default.
- W4226421465 hasConceptScore W4226421465C121332964 @default.
- W4226421465 hasConceptScore W4226421465C133892786 @default.
- W4226421465 hasConceptScore W4226421465C149838564 @default.
- W4226421465 hasConceptScore W4226421465C14999030 @default.
- W4226421465 hasConceptScore W4226421465C154945302 @default.
- W4226421465 hasConceptScore W4226421465C24890656 @default.
- W4226421465 hasConceptScore W4226421465C2778355321 @default.
- W4226421465 hasConceptScore W4226421465C28490314 @default.
- W4226421465 hasConceptScore W4226421465C41008148 @default.
- W4226421465 hasConceptScore W4226421465C41608201 @default.
- W4226421465 hasConceptScore W4226421465C97931131 @default.
- W4226421465 hasLocation W42264214651 @default.
- W4226421465 hasLocation W42264214652 @default.
- W4226421465 hasOpenAccess W4226421465 @default.
- W4226421465 hasPrimaryLocation W42264214651 @default.
- W4226421465 hasRelatedWork W1493012537 @default.
- W4226421465 hasRelatedWork W1521049138 @default.
- W4226421465 hasRelatedWork W2103897043 @default.
- W4226421465 hasRelatedWork W2125642021 @default.
- W4226421465 hasRelatedWork W2162158162 @default.
- W4226421465 hasRelatedWork W2206035908 @default.
- W4226421465 hasRelatedWork W3148366653 @default.
- W4226421465 hasRelatedWork W4247736853 @default.
- W4226421465 hasRelatedWork W4384929466 @default.
- W4226421465 hasRelatedWork W2175373321 @default.
- W4226421465 hasVolume "30" @default.
- W4226421465 isParatext "false" @default.
- W4226421465 isRetracted "false" @default.
- W4226421465 workType "article" @default.