Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387158232> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W4387158232 abstract "Voice conversion is becoming increasingly popular, and a growing number of application scenarios require models with streaming inference capabilities. The recently proposed DualVC attempts to achieve this objective through streaming model architecture design and intra-model knowledge distillation along with hybrid predictive coding to compensate for the lack of future information. However, DualVC encounters several problems that limit its performance. First, the autoregressive decoder has error accumulation in its nature and limits the inference speed as well. Second, the causal convolution enables streaming capability but cannot sufficiently use future information within chunks. Third, the model is unable to effectively address the noise in the unvoiced segments, lowering the sound quality. In this paper, we propose DualVC 2 to address these issues. Specifically, the model backbone is migrated to a Conformer-based architecture, empowering parallel inference. Causal convolution is replaced by non-causal convolution with dynamic chunk mask to make better use of within-chunk future information. Also, quiet attention is introduced to enhance the model's noise robustness. Experiments show that DualVC 2 outperforms DualVC and other baseline systems in both subjective and objective metrics, with only 186.4 ms latency. Our audio samples are made publicly available." @default.
- W4387158232 created "2023-09-30" @default.
- W4387158232 creator A5006901857 @default.
- W4387158232 creator A5015560758 @default.
- W4387158232 creator A5033976028 @default.
- W4387158232 creator A5036369578 @default.
- W4387158232 creator A5050166453 @default.
- W4387158232 creator A5058646669 @default.
- W4387158232 creator A5081164682 @default.
- W4387158232 date "2023-09-27" @default.
- W4387158232 modified "2023-10-14" @default.
- W4387158232 title "DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion" @default.
- W4387158232 doi "https://doi.org/10.48550/arxiv.2309.15496" @default.
- W4387158232 hasPublicationYear "2023" @default.
- W4387158232 type Work @default.
- W4387158232 citedByCount "0" @default.
- W4387158232 crossrefType "posted-content" @default.
- W4387158232 hasAuthorship W4387158232A5006901857 @default.
- W4387158232 hasAuthorship W4387158232A5015560758 @default.
- W4387158232 hasAuthorship W4387158232A5033976028 @default.
- W4387158232 hasAuthorship W4387158232A5036369578 @default.
- W4387158232 hasAuthorship W4387158232A5050166453 @default.
- W4387158232 hasAuthorship W4387158232A5058646669 @default.
- W4387158232 hasAuthorship W4387158232A5081164682 @default.
- W4387158232 hasBestOaLocation W43871582321 @default.
- W4387158232 hasConcept C104317684 @default.
- W4387158232 hasConcept C113775141 @default.
- W4387158232 hasConcept C115961682 @default.
- W4387158232 hasConcept C149782125 @default.
- W4387158232 hasConcept C154945302 @default.
- W4387158232 hasConcept C159877910 @default.
- W4387158232 hasConcept C162324750 @default.
- W4387158232 hasConcept C175291020 @default.
- W4387158232 hasConcept C185592680 @default.
- W4387158232 hasConcept C199360897 @default.
- W4387158232 hasConcept C2776214188 @default.
- W4387158232 hasConcept C28490314 @default.
- W4387158232 hasConcept C41008148 @default.
- W4387158232 hasConcept C45347329 @default.
- W4387158232 hasConcept C50644808 @default.
- W4387158232 hasConcept C55493867 @default.
- W4387158232 hasConcept C63479239 @default.
- W4387158232 hasConcept C76155785 @default.
- W4387158232 hasConcept C82876162 @default.
- W4387158232 hasConcept C99498987 @default.
- W4387158232 hasConceptScore W4387158232C104317684 @default.
- W4387158232 hasConceptScore W4387158232C113775141 @default.
- W4387158232 hasConceptScore W4387158232C115961682 @default.
- W4387158232 hasConceptScore W4387158232C149782125 @default.
- W4387158232 hasConceptScore W4387158232C154945302 @default.
- W4387158232 hasConceptScore W4387158232C159877910 @default.
- W4387158232 hasConceptScore W4387158232C162324750 @default.
- W4387158232 hasConceptScore W4387158232C175291020 @default.
- W4387158232 hasConceptScore W4387158232C185592680 @default.
- W4387158232 hasConceptScore W4387158232C199360897 @default.
- W4387158232 hasConceptScore W4387158232C2776214188 @default.
- W4387158232 hasConceptScore W4387158232C28490314 @default.
- W4387158232 hasConceptScore W4387158232C41008148 @default.
- W4387158232 hasConceptScore W4387158232C45347329 @default.
- W4387158232 hasConceptScore W4387158232C50644808 @default.
- W4387158232 hasConceptScore W4387158232C55493867 @default.
- W4387158232 hasConceptScore W4387158232C63479239 @default.
- W4387158232 hasConceptScore W4387158232C76155785 @default.
- W4387158232 hasConceptScore W4387158232C82876162 @default.
- W4387158232 hasConceptScore W4387158232C99498987 @default.
- W4387158232 hasLocation W43871582321 @default.
- W4387158232 hasOpenAccess W4387158232 @default.
- W4387158232 hasPrimaryLocation W43871582321 @default.
- W4387158232 hasRelatedWork W2981519481 @default.
- W4387158232 hasRelatedWork W3091913713 @default.
- W4387158232 hasRelatedWork W3151612890 @default.
- W4387158232 hasRelatedWork W3165698711 @default.
- W4387158232 hasRelatedWork W3195610113 @default.
- W4387158232 hasRelatedWork W4206841102 @default.
- W4387158232 hasRelatedWork W4285327616 @default.
- W4387158232 hasRelatedWork W4294982680 @default.
- W4387158232 hasRelatedWork W4318348488 @default.
- W4387158232 hasRelatedWork W4319994481 @default.
- W4387158232 isParatext "false" @default.
- W4387158232 isRetracted "false" @default.
- W4387158232 workType "article" @default.
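
The rows above all follow one pattern: a leading dash, the subject, a predicate, an object (quoted when it is a literal, bare when it is an identifier), and a graph name after `@`. As a minimal sketch, assuming that layout holds for every row, the listing can be parsed into tuples and grouped by predicate. The names `ROW`, `parse_row`, and `group_by_predicate` are hypothetical helpers for illustration and are not part of SemOpenAlex.

```python
import re
from collections import defaultdict

# Assumed row layout: "- <subject> <predicate> <object> @<graph>."
ROW = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.?$', re.DOTALL)


def parse_row(line):
    """Split one listing row into (subject, predicate, object, graph).

    Returns None when the line does not match the assumed layout.
    """
    match = ROW.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Literal values are wrapped in double quotes; identifiers are bare.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, preserving row order."""
    grouped = defaultdict(list)
    for line in lines:
        row = parse_row(line)
        if row is not None:
            _, predicate, obj, _ = row
            grouped[predicate].append(obj)
    return dict(grouped)


# A few rows copied from the listing above.
sample = [
    '- W4387158232 created "2023-09-30" @default.',
    '- W4387158232 creator A5006901857 @default.',
    '- W4387158232 creator A5015560758 @default.',
    '- W4387158232 citedByCount "0" @default.',
]
records = group_by_predicate(sample)
```

Grouping by predicate keeps repeated properties such as `creator` or `hasConcept` together as lists, while single-valued properties such as `created` become one-element lists.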