Matches in SemOpenAlex for { <https://semopenalex.org/work/W3207340675> ?p ?o ?g. }
Showing items 1 to 90 of 90 with 100 items per page.
- W3207340675 abstract "In this paper, we propose VISinger, a complete end-to-end high-quality singing voice synthesis (SVS) system that directly generates singing audio from lyrics and musical score. Our approach is inspired by VITS [1], an end-to-end speech generation model which adopts VAE-based posterior encoder augmented with normalizing flow based prior encoder and adversarial decoder. VISinger follows the main architecture of VITS, but makes substantial improvements to the prior encoder according to the characteristics of singing. First, instead of using phoneme-level mean and variance of acoustic features, we introduce a length regulator and a frame prior network to get the frame-level mean and variance on acoustic features, modeling the rich acoustic variation in singing. Second, we further introduce an F0 predictor to guide the frame prior network, leading to stabler singing performance. Finally, to improve the singing rhythm, we modify the duration predictor to specifically predict the phoneme to note duration ratio, helped with singing note normalization. Experiments on a professional Mandarin singing corpus show that VISinger significantly outperforms FastSpeech+Neural-Vocoder two-stage approach and the oracle VITS; ablation study demonstrates the effectiveness of different contributions." @default.
- W3207340675 created "2021-10-25" @default.
- W3207340675 creator A5036369578 @default.
- W3207340675 creator A5050166453 @default.
- W3207340675 creator A5060467975 @default.
- W3207340675 creator A5064364353 @default.
- W3207340675 creator A5064426059 @default.
- W3207340675 creator A5066245750 @default.
- W3207340675 date "2022-05-23" @default.
- W3207340675 modified "2023-09-27" @default.
- W3207340675 title "VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis" @default.
- W3207340675 cites W2471520273 @default.
- W3207340675 cites W2889244839 @default.
- W3207340675 cites W2921576841 @default.
- W3207340675 cites W2940405045 @default.
- W3207340675 cites W3015499232 @default.
- W3207340675 cites W3081279708 @default.
- W3207340675 cites W3097152652 @default.
- W3207340675 cites W3133525064 @default.
- W3207340675 cites W3196760057 @default.
- W3207340675 cites W3198869563 @default.
- W3207340675 doi "https://doi.org/10.1109/icassp43922.2022.9747664" @default.
- W3207340675 hasPublicationYear "2022" @default.
- W3207340675 type Work @default.
- W3207340675 sameAs 3207340675 @default.
- W3207340675 citedByCount "10" @default.
- W3207340675 countsByYear W32073406752022 @default.
- W3207340675 countsByYear W32073406752023 @default.
- W3207340675 crossrefType "proceedings-article" @default.
- W3207340675 hasAuthorship W3207340675A5036369578 @default.
- W3207340675 hasAuthorship W3207340675A5050166453 @default.
- W3207340675 hasAuthorship W3207340675A5060467975 @default.
- W3207340675 hasAuthorship W3207340675A5064364353 @default.
- W3207340675 hasAuthorship W3207340675A5064426059 @default.
- W3207340675 hasAuthorship W3207340675A5066245750 @default.
- W3207340675 hasBestOaLocation W32073406752 @default.
- W3207340675 hasConcept C115903868 @default.
- W3207340675 hasConcept C121332964 @default.
- W3207340675 hasConcept C126042441 @default.
- W3207340675 hasConcept C136886441 @default.
- W3207340675 hasConcept C144024400 @default.
- W3207340675 hasConcept C147168706 @default.
- W3207340675 hasConcept C154945302 @default.
- W3207340675 hasConcept C19165224 @default.
- W3207340675 hasConcept C24890656 @default.
- W3207340675 hasConcept C2779803651 @default.
- W3207340675 hasConcept C28490314 @default.
- W3207340675 hasConcept C41008148 @default.
- W3207340675 hasConcept C44819458 @default.
- W3207340675 hasConcept C50644808 @default.
- W3207340675 hasConcept C55166926 @default.
- W3207340675 hasConcept C74296488 @default.
- W3207340675 hasConcept C76155785 @default.
- W3207340675 hasConcept C94915269 @default.
- W3207340675 hasConceptScore W3207340675C115903868 @default.
- W3207340675 hasConceptScore W3207340675C121332964 @default.
- W3207340675 hasConceptScore W3207340675C126042441 @default.
- W3207340675 hasConceptScore W3207340675C136886441 @default.
- W3207340675 hasConceptScore W3207340675C144024400 @default.
- W3207340675 hasConceptScore W3207340675C147168706 @default.
- W3207340675 hasConceptScore W3207340675C154945302 @default.
- W3207340675 hasConceptScore W3207340675C19165224 @default.
- W3207340675 hasConceptScore W3207340675C24890656 @default.
- W3207340675 hasConceptScore W3207340675C2779803651 @default.
- W3207340675 hasConceptScore W3207340675C28490314 @default.
- W3207340675 hasConceptScore W3207340675C41008148 @default.
- W3207340675 hasConceptScore W3207340675C44819458 @default.
- W3207340675 hasConceptScore W3207340675C50644808 @default.
- W3207340675 hasConceptScore W3207340675C55166926 @default.
- W3207340675 hasConceptScore W3207340675C74296488 @default.
- W3207340675 hasConceptScore W3207340675C76155785 @default.
- W3207340675 hasConceptScore W3207340675C94915269 @default.
- W3207340675 hasLocation W32073406751 @default.
- W3207340675 hasLocation W32073406752 @default.
- W3207340675 hasOpenAccess W3207340675 @default.
- W3207340675 hasPrimaryLocation W32073406751 @default.
- W3207340675 hasRelatedWork W1586532344 @default.
- W3207340675 hasRelatedWork W2407648438 @default.
- W3207340675 hasRelatedWork W2608712415 @default.
- W3207340675 hasRelatedWork W2782005958 @default.
- W3207340675 hasRelatedWork W2921857201 @default.
- W3207340675 hasRelatedWork W2951961943 @default.
- W3207340675 hasRelatedWork W3012498027 @default.
- W3207340675 hasRelatedWork W3207340675 @default.
- W3207340675 hasRelatedWork W4298017177 @default.
- W3207340675 hasRelatedWork W4312096714 @default.
- W3207340675 isParatext "false" @default.
- W3207340675 isRetracted "false" @default.
- W3207340675 magId "3207340675" @default.
- W3207340675 workType "article" @default.
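
Each row in this listing has the shape `- subject predicate object @graph.`, so the rows can be grouped by predicate with a few lines of Python. The sketch below is a minimal illustration, assuming that row shape holds for every line. The `ROW` pattern and the `parse_rows` helper are hypothetical names chosen for this example and are not part of SemOpenAlex.

```python
import re
from collections import defaultdict

# Assumed row shape: "- SUBJECT PREDICATE OBJECT @GRAPH."
# (hypothetical pattern inferred from the listing, not an official grammar)
ROW = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_rows(lines):
    """Group object values by predicate for every line that matches ROW."""
    by_predicate = defaultdict(list)
    for line in lines:
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip header and pagination lines
        _subject, predicate, obj, _graph = match.groups()
        # Quoted literals lose their surrounding quotes; bare tokens are kept as-is.
        by_predicate[predicate].append(obj.strip('"'))
    return dict(by_predicate)


sample = [
    '- W3207340675 created "2021-10-25" @default.',
    '- W3207340675 creator A5036369578 @default.',
    '- W3207340675 creator A5050166453 @default.',
    '- W3207340675 citedByCount "10" @default.',
]
rows = parse_rows(sample)
print(rows["creator"])  # ['A5036369578', 'A5050166453']
```

Repeated predicates such as `creator`, `cites`, and `hasConcept` end up as lists in listing order, while single-valued ones such as `doi` become one-element lists.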