Matches in SemOpenAlex for { <https://semopenalex.org/work/W4306169301> ?p ?o ?g. }
Showing items 1 to 85 of 85 with 100 items per page.
- W4306169301 abstract "Self-supervised learning (SSL) is a long-standing goal for speech processing, since it utilizes large-scale unlabeled data and avoids extensive human labeling. Recent years have witnessed great success in applying self-supervised learning to speech recognition, while only limited exploration has been attempted in applying SSL to modeling speaker characteristics. In this paper, we aim to improve the existing SSL framework for speaker representation learning. Two methods are introduced for enhancing unsupervised speaker information extraction. First, we apply multi-task learning to the current SSL framework, where we integrate an utterance-wise contrastive loss with the SSL objective function. Second, for better speaker discrimination, we propose an utterance mixing strategy for data augmentation, where additional overlapped utterances are created in an unsupervised manner and incorporated during training. We integrate the proposed methods into the HuBERT framework. Experimental results on the SUPERB benchmark show that the proposed system achieves state-of-the-art performance in universal representation learning, especially for speaker-identification-oriented tasks. An ablation study is performed, verifying the efficacy of each proposed method. Finally, we scale up the training dataset to 94 thousand hours of public audio data and achieve further performance improvement in all SUPERB tasks." @default.
- W4306169301 created "2022-10-14" @default.
- W4306169301 creator A5002894983 @default.
- W4306169301 creator A5003866126 @default.
- W4306169301 creator A5014662947 @default.
- W4306169301 creator A5015824704 @default.
- W4306169301 creator A5017825677 @default.
- W4306169301 creator A5029670581 @default.
- W4306169301 creator A5034770439 @default.
- W4306169301 creator A5067921099 @default.
- W4306169301 creator A5072540013 @default.
- W4306169301 creator A5079533447 @default.
- W4306169301 creator A5084524299 @default.
- W4306169301 date "2021-10-12" @default.
- W4306169301 modified "2023-10-14" @default.
- W4306169301 title "UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training" @default.
- W4306169301 doi "https://doi.org/10.48550/arxiv.2110.05752" @default.
- W4306169301 hasPublicationYear "2021" @default.
- W4306169301 type Work @default.
- W4306169301 citedByCount "0" @default.
- W4306169301 crossrefType "posted-content" @default.
- W4306169301 hasAuthorship W4306169301A5002894983 @default.
- W4306169301 hasAuthorship W4306169301A5003866126 @default.
- W4306169301 hasAuthorship W4306169301A5014662947 @default.
- W4306169301 hasAuthorship W4306169301A5015824704 @default.
- W4306169301 hasAuthorship W4306169301A5017825677 @default.
- W4306169301 hasAuthorship W4306169301A5029670581 @default.
- W4306169301 hasAuthorship W4306169301A5034770439 @default.
- W4306169301 hasAuthorship W4306169301A5067921099 @default.
- W4306169301 hasAuthorship W4306169301A5072540013 @default.
- W4306169301 hasAuthorship W4306169301A5079533447 @default.
- W4306169301 hasAuthorship W4306169301A5084524299 @default.
- W4306169301 hasBestOaLocation W43061693011 @default.
- W4306169301 hasConcept C119857082 @default.
- W4306169301 hasConcept C13280743 @default.
- W4306169301 hasConcept C133892786 @default.
- W4306169301 hasConcept C154945302 @default.
- W4306169301 hasConcept C162324750 @default.
- W4306169301 hasConcept C17744445 @default.
- W4306169301 hasConcept C185798385 @default.
- W4306169301 hasConcept C187736073 @default.
- W4306169301 hasConcept C199539241 @default.
- W4306169301 hasConcept C204321447 @default.
- W4306169301 hasConcept C205649164 @default.
- W4306169301 hasConcept C2775852435 @default.
- W4306169301 hasConcept C2776359362 @default.
- W4306169301 hasConcept C2780451532 @default.
- W4306169301 hasConcept C28490314 @default.
- W4306169301 hasConcept C41008148 @default.
- W4306169301 hasConcept C59404180 @default.
- W4306169301 hasConcept C94625758 @default.
- W4306169301 hasConceptScore W4306169301C119857082 @default.
- W4306169301 hasConceptScore W4306169301C13280743 @default.
- W4306169301 hasConceptScore W4306169301C133892786 @default.
- W4306169301 hasConceptScore W4306169301C154945302 @default.
- W4306169301 hasConceptScore W4306169301C162324750 @default.
- W4306169301 hasConceptScore W4306169301C17744445 @default.
- W4306169301 hasConceptScore W4306169301C185798385 @default.
- W4306169301 hasConceptScore W4306169301C187736073 @default.
- W4306169301 hasConceptScore W4306169301C199539241 @default.
- W4306169301 hasConceptScore W4306169301C204321447 @default.
- W4306169301 hasConceptScore W4306169301C205649164 @default.
- W4306169301 hasConceptScore W4306169301C2775852435 @default.
- W4306169301 hasConceptScore W4306169301C2776359362 @default.
- W4306169301 hasConceptScore W4306169301C2780451532 @default.
- W4306169301 hasConceptScore W4306169301C28490314 @default.
- W4306169301 hasConceptScore W4306169301C41008148 @default.
- W4306169301 hasConceptScore W4306169301C59404180 @default.
- W4306169301 hasConceptScore W4306169301C94625758 @default.
- W4306169301 hasLocation W43061693011 @default.
- W4306169301 hasOpenAccess W4306169301 @default.
- W4306169301 hasPrimaryLocation W43061693011 @default.
- W4306169301 hasRelatedWork W1946464671 @default.
- W4306169301 hasRelatedWork W2081647779 @default.
- W4306169301 hasRelatedWork W2136038945 @default.
- W4306169301 hasRelatedWork W2788991015 @default.
- W4306169301 hasRelatedWork W2804625088 @default.
- W4306169301 hasRelatedWork W2950523597 @default.
- W4306169301 hasRelatedWork W2988839259 @default.
- W4306169301 hasRelatedWork W2997340161 @default.
- W4306169301 hasRelatedWork W3185852197 @default.
- W4306169301 hasRelatedWork W4368276095 @default.
- W4306169301 isParatext "false" @default.
- W4306169301 isRetracted "false" @default.
- W4306169301 workType "article" @default.