Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226033575> ?p ?o ?g. }
- W4226033575 abstract "Motivated by the success of masked language modeling (MLM) in pre-training natural language processing models, we propose w2v-BERT that explores MLM for self-supervised speech representation learning. w2v-BERT is a framework that combines contrastive learning and MLM, where the former trains the model to discretize input continuous speech signals into a finite set of discriminative speech tokens, and the latter trains the model to learn contextualized speech representations via solving a masked prediction task consuming the discretized tokens. In contrast to existing MLM-based speech pre-training frameworks such as HuBERT, which relies on an iterative re-clustering and re-training process, or vq-wav2vec, which concatenates two separately trained modules, w2v-BERT can be optimized in an end-to-end fashion by solving the two self-supervised tasks (the contrastive task and MLM) simultaneously. Our experiments show that w2v-BERT achieves competitive results compared to current state-of-the-art pre-trained models on the LibriSpeech benchmarks when using the Libri-Light 60k corpus as the unsupervised data. In particular, when compared to published models such as conformer-based wav2vec 2.0 and HuBERT, our model shows 5% to 10% relative WER reduction on the test-clean and test-other subsets. When applied to Google's Voice Search traffic dataset, w2v-BERT outperforms our internal conformer-based wav2vec 2.0 by more than 30% relative." @default.
- W4226033575 created "2022-05-05" @default.
- W4226033575 creator A5002729731 @default.
- W4226033575 creator A5008107321 @default.
- W4226033575 creator A5010253402 @default.
- W4226033575 creator A5027763497 @default.
- W4226033575 creator A5048771433 @default.
- W4226033575 creator A5071773009 @default.
- W4226033575 creator A5074220692 @default.
- W4226033575 date "2021-12-13" @default.
- W4226033575 modified "2023-10-06" @default.
- W4226033575 title "w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training" @default.
- W4226033575 cites W1494198834 @default.
- W4226033575 cites W165878654 @default.
- W4226033575 cites W2064675550 @default.
- W4226033575 cites W2088622183 @default.
- W4226033575 cites W2101210369 @default.
- W4226033575 cites W2111316763 @default.
- W4226033575 cites W2121879602 @default.
- W4226033575 cites W2936774411 @default.
- W4226033575 cites W2940322076 @default.
- W4226033575 cites W2972943112 @default.
- W4226033575 cites W2973049979 @default.
- W4226033575 cites W2982223350 @default.
- W4226033575 cites W3003875258 @default.
- W4226033575 cites W3015265920 @default.
- W4226033575 cites W3015522062 @default.
- W4226033575 cites W3015995734 @default.
- W4226033575 cites W3016011332 @default.
- W4226033575 cites W3026041220 @default.
- W4226033575 cites W3035202887 @default.
- W4226033575 cites W3041561163 @default.
- W4226033575 cites W3097777922 @default.
- W4226033575 cites W3160525311 @default.
- W4226033575 cites W3209059054 @default.
- W4226033575 doi "https://doi.org/10.1109/asru51503.2021.9688253" @default.
- W4226033575 hasPublicationYear "2021" @default.
- W4226033575 type Work @default.
- W4226033575 citedByCount "52" @default.
- W4226033575 countsByYear W42260335752022 @default.
- W4226033575 countsByYear W42260335752023 @default.
- W4226033575 crossrefType "proceedings-article" @default.
- W4226033575 hasAuthorship W4226033575A5002729731 @default.
- W4226033575 hasAuthorship W4226033575A5008107321 @default.
- W4226033575 hasAuthorship W4226033575A5010253402 @default.
- W4226033575 hasAuthorship W4226033575A5027763497 @default.
- W4226033575 hasAuthorship W4226033575A5048771433 @default.
- W4226033575 hasAuthorship W4226033575A5071773009 @default.
- W4226033575 hasAuthorship W4226033575A5074220692 @default.
- W4226033575 hasBestOaLocation W42260335752 @default.
- W4226033575 hasConcept C134306372 @default.
- W4226033575 hasConcept C137293760 @default.
- W4226033575 hasConcept C154945302 @default.
- W4226033575 hasConcept C162324750 @default.
- W4226033575 hasConcept C169903167 @default.
- W4226033575 hasConcept C177264268 @default.
- W4226033575 hasConcept C17744445 @default.
- W4226033575 hasConcept C187736073 @default.
- W4226033575 hasConcept C199360897 @default.
- W4226033575 hasConcept C199539241 @default.
- W4226033575 hasConcept C204321447 @default.
- W4226033575 hasConcept C2776359362 @default.
- W4226033575 hasConcept C2780451532 @default.
- W4226033575 hasConcept C28490314 @default.
- W4226033575 hasConcept C33923547 @default.
- W4226033575 hasConcept C41008148 @default.
- W4226033575 hasConcept C73000952 @default.
- W4226033575 hasConcept C94625758 @default.
- W4226033575 hasConcept C97931131 @default.
- W4226033575 hasConceptScore W4226033575C134306372 @default.
- W4226033575 hasConceptScore W4226033575C137293760 @default.
- W4226033575 hasConceptScore W4226033575C154945302 @default.
- W4226033575 hasConceptScore W4226033575C162324750 @default.
- W4226033575 hasConceptScore W4226033575C169903167 @default.
- W4226033575 hasConceptScore W4226033575C177264268 @default.
- W4226033575 hasConceptScore W4226033575C17744445 @default.
- W4226033575 hasConceptScore W4226033575C187736073 @default.
- W4226033575 hasConceptScore W4226033575C199360897 @default.
- W4226033575 hasConceptScore W4226033575C199539241 @default.
- W4226033575 hasConceptScore W4226033575C204321447 @default.
- W4226033575 hasConceptScore W4226033575C2776359362 @default.
- W4226033575 hasConceptScore W4226033575C2780451532 @default.
- W4226033575 hasConceptScore W4226033575C28490314 @default.
- W4226033575 hasConceptScore W4226033575C33923547 @default.
- W4226033575 hasConceptScore W4226033575C41008148 @default.
- W4226033575 hasConceptScore W4226033575C73000952 @default.
- W4226033575 hasConceptScore W4226033575C94625758 @default.
- W4226033575 hasConceptScore W4226033575C97931131 @default.
- W4226033575 hasLocation W42260335751 @default.
- W4226033575 hasLocation W42260335752 @default.
- W4226033575 hasOpenAccess W4226033575 @default.
- W4226033575 hasPrimaryLocation W42260335751 @default.
- W4226033575 hasRelatedWork W1772447446 @default.
- W4226033575 hasRelatedWork W184209194 @default.
- W4226033575 hasRelatedWork W2359001871 @default.
- W4226033575 hasRelatedWork W2543116847 @default.
- W4226033575 hasRelatedWork W2608096034 @default.
- W4226033575 hasRelatedWork W2799124825 @default.
- W4226033575 hasRelatedWork W2963277000 @default.
- W4226033575 hasRelatedWork W2980745533 @default.