Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385896119> ?p ?o ?g. }
Showing items 1 to 65 of 65 with 100 items per page.
- W4385896119 abstract "The diffusion model is capable of generating high-quality data through a probabilistic approach. However, it suffers from the drawback of slow generation speed due to the requirement of a large number of time steps. To address this limitation, recent models such as denoising diffusion implicit models (DDIM) focus on generating samples without directly modeling the probability distribution, while models like denoising diffusion generative adversarial networks (GAN) combine diffusion processes with GANs. In the field of speech synthesis, a recent diffusion speech synthesis model called DiffGAN-TTS, utilizing the structure of GANs, has been introduced and demonstrates superior performance in both speech quality and generation speed. In this paper, to further enhance the performance of DiffGAN-TTS, we propose a speech synthesis model with two discriminators: a diffusion discriminator for learning the distribution of the reverse process and a spectrogram discriminator for learning the distribution of the generated data. Objective metrics such as structural similarity index measure (SSIM), mel-cepstral distortion (MCD), F0 root mean squared error (F0 RMSE), short-time objective intelligibility (STOI), perceptual evaluation of speech quality (PESQ), as well as subjective metrics like mean opinion score (MOS), are used to evaluate the performance of the proposed model. The evaluation results show that the proposed model outperforms recent state-of-the-art models such as FastSpeech2 and DiffGAN-TTS in various metrics. Our implementation and audio samples are located on GitHub." @default.
- W4385896119 created "2023-08-18" @default.
- W4385896119 creator A5061227788 @default.
- W4385896119 creator A5079603835 @default.
- W4385896119 date "2023-08-03" @default.
- W4385896119 modified "2023-09-28" @default.
- W4385896119 title "Adversarial Training of Denoising Diffusion Model Using Dual Discriminators for High-Fidelity Multi-Speaker TTS" @default.
- W4385896119 doi "https://doi.org/10.48550/arxiv.2308.01573" @default.
- W4385896119 hasPublicationYear "2023" @default.
- W4385896119 type Work @default.
- W4385896119 citedByCount "0" @default.
- W4385896119 crossrefType "posted-content" @default.
- W4385896119 hasAuthorship W4385896119A5061227788 @default.
- W4385896119 hasAuthorship W4385896119A5079603835 @default.
- W4385896119 hasBestOaLocation W43858961191 @default.
- W4385896119 hasConcept C103734657 @default.
- W4385896119 hasConcept C105795698 @default.
- W4385896119 hasConcept C139945424 @default.
- W4385896119 hasConcept C153180895 @default.
- W4385896119 hasConcept C154945302 @default.
- W4385896119 hasConcept C162324750 @default.
- W4385896119 hasConcept C163294075 @default.
- W4385896119 hasConcept C176217482 @default.
- W4385896119 hasConcept C21547014 @default.
- W4385896119 hasConcept C2776182073 @default.
- W4385896119 hasConcept C2779803651 @default.
- W4385896119 hasConcept C28490314 @default.
- W4385896119 hasConcept C33923547 @default.
- W4385896119 hasConcept C41008148 @default.
- W4385896119 hasConcept C62897895 @default.
- W4385896119 hasConcept C76155785 @default.
- W4385896119 hasConcept C94915269 @default.
- W4385896119 hasConceptScore W4385896119C103734657 @default.
- W4385896119 hasConceptScore W4385896119C105795698 @default.
- W4385896119 hasConceptScore W4385896119C139945424 @default.
- W4385896119 hasConceptScore W4385896119C153180895 @default.
- W4385896119 hasConceptScore W4385896119C154945302 @default.
- W4385896119 hasConceptScore W4385896119C162324750 @default.
- W4385896119 hasConceptScore W4385896119C163294075 @default.
- W4385896119 hasConceptScore W4385896119C176217482 @default.
- W4385896119 hasConceptScore W4385896119C21547014 @default.
- W4385896119 hasConceptScore W4385896119C2776182073 @default.
- W4385896119 hasConceptScore W4385896119C2779803651 @default.
- W4385896119 hasConceptScore W4385896119C28490314 @default.
- W4385896119 hasConceptScore W4385896119C33923547 @default.
- W4385896119 hasConceptScore W4385896119C41008148 @default.
- W4385896119 hasConceptScore W4385896119C62897895 @default.
- W4385896119 hasConceptScore W4385896119C76155785 @default.
- W4385896119 hasConceptScore W4385896119C94915269 @default.
- W4385896119 hasLocation W43858961191 @default.
- W4385896119 hasOpenAccess W4385896119 @default.
- W4385896119 hasPrimaryLocation W43858961191 @default.
- W4385896119 hasRelatedWork W1632545988 @default.
- W4385896119 hasRelatedWork W1893647155 @default.
- W4385896119 hasRelatedWork W2094180435 @default.
- W4385896119 hasRelatedWork W2151415191 @default.
- W4385896119 hasRelatedWork W2189586111 @default.
- W4385896119 hasRelatedWork W2922332774 @default.
- W4385896119 hasRelatedWork W2949082025 @default.
- W4385896119 hasRelatedWork W2987307811 @default.
- W4385896119 hasRelatedWork W3128419421 @default.
- W4385896119 hasRelatedWork W2092619848 @default.
- W4385896119 isParatext "false" @default.
- W4385896119 isRetracted "false" @default.
- W4385896119 workType "article" @default.