Matches in SemOpenAlex for { <https://semopenalex.org/work/W4221161145> ?p ?o ?g. }
Showing items 1 to 64 of 64 with 100 items per page.
- W4221161145 abstract "Audio-based automatic speech recognition (ASR) degrades significantly in noisy environments and is particularly vulnerable to interfering speech, as the model cannot determine which speaker to transcribe. Audio-visual speech recognition (AVSR) systems improve robustness by complementing the audio stream with the visual information that is invariant to noise and helps the model focus on the desired speaker. However, previous AVSR work focused solely on the supervised learning setup; hence the progress was hindered by the amount of labeled data available. In this work, we present a self-supervised AVSR framework built upon Audio-Visual HuBERT (AV-HuBERT), a state-of-the-art audio-visual speech representation learning model. On the largest available AVSR benchmark dataset LRS3, our approach outperforms prior state-of-the-art by ~50% (28.0% vs. 14.1%) using less than 10% of labeled data (433hr vs. 30hr) in the presence of babble noise, while reducing the WER of an audio-based model by over 75% (25.8% vs. 5.8%) on average." @default.
- W4221161145 created "2022-04-03" @default.
- W4221161145 creator A5035574012 @default.
- W4221161145 creator A5051950818 @default.
- W4221161145 creator A5085086690 @default.
- W4221161145 date "2022-01-05" @default.
- W4221161145 modified "2023-09-23" @default.
- W4221161145 title "Robust Self-Supervised Audio-Visual Speech Recognition" @default.
- W4221161145 doi "https://doi.org/10.48550/arxiv.2201.01763" @default.
- W4221161145 hasPublicationYear "2022" @default.
- W4221161145 type Work @default.
- W4221161145 citedByCount "1" @default.
- W4221161145 countsByYear W42211611452023 @default.
- W4221161145 crossrefType "posted-content" @default.
- W4221161145 hasAuthorship W4221161145A5035574012 @default.
- W4221161145 hasAuthorship W4221161145A5051950818 @default.
- W4221161145 hasAuthorship W4221161145A5085086690 @default.
- W4221161145 hasBestOaLocation W42211611451 @default.
- W4221161145 hasConcept C104317684 @default.
- W4221161145 hasConcept C13280743 @default.
- W4221161145 hasConcept C154945302 @default.
- W4221161145 hasConcept C155635449 @default.
- W4221161145 hasConcept C185592680 @default.
- W4221161145 hasConcept C185798385 @default.
- W4221161145 hasConcept C204201278 @default.
- W4221161145 hasConcept C205649164 @default.
- W4221161145 hasConcept C28490314 @default.
- W4221161145 hasConcept C3017588708 @default.
- W4221161145 hasConcept C41008148 @default.
- W4221161145 hasConcept C49774154 @default.
- W4221161145 hasConcept C55493867 @default.
- W4221161145 hasConcept C61328038 @default.
- W4221161145 hasConcept C63479239 @default.
- W4221161145 hasConceptScore W4221161145C104317684 @default.
- W4221161145 hasConceptScore W4221161145C13280743 @default.
- W4221161145 hasConceptScore W4221161145C154945302 @default.
- W4221161145 hasConceptScore W4221161145C155635449 @default.
- W4221161145 hasConceptScore W4221161145C185592680 @default.
- W4221161145 hasConceptScore W4221161145C185798385 @default.
- W4221161145 hasConceptScore W4221161145C204201278 @default.
- W4221161145 hasConceptScore W4221161145C205649164 @default.
- W4221161145 hasConceptScore W4221161145C28490314 @default.
- W4221161145 hasConceptScore W4221161145C3017588708 @default.
- W4221161145 hasConceptScore W4221161145C41008148 @default.
- W4221161145 hasConceptScore W4221161145C49774154 @default.
- W4221161145 hasConceptScore W4221161145C55493867 @default.
- W4221161145 hasConceptScore W4221161145C61328038 @default.
- W4221161145 hasConceptScore W4221161145C63479239 @default.
- W4221161145 hasLocation W42211611451 @default.
- W4221161145 hasOpenAccess W4221161145 @default.
- W4221161145 hasPrimaryLocation W42211611451 @default.
- W4221161145 hasRelatedWork W1566941554 @default.
- W4221161145 hasRelatedWork W1587401114 @default.
- W4221161145 hasRelatedWork W2116497041 @default.
- W4221161145 hasRelatedWork W2121486117 @default.
- W4221161145 hasRelatedWork W2122924390 @default.
- W4221161145 hasRelatedWork W2403424637 @default.
- W4221161145 hasRelatedWork W2794873916 @default.
- W4221161145 hasRelatedWork W4213448838 @default.
- W4221161145 hasRelatedWork W4300529166 @default.
- W4221161145 hasRelatedWork W2889596223 @default.
- W4221161145 isParatext "false" @default.
- W4221161145 isRetracted "false" @default.
- W4221161145 workType "article" @default.
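
The abstract above quotes three headline figures as approximate percentages ("~50%", "over 75%", "less than 10%"). The short sketch below recomputes them from the raw numbers given in parentheses. It assumes that in each pair the first value is the baseline (prior state of the art, or the audio-only model) and the second is the proposed approach; that reading comes from the abstract's wording and is not stated in the metadata. The function and variable names are illustrative only.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction from baseline to improved, in percent."""
    return (baseline - improved) / baseline * 100.0


# WER under babble noise: 28.0% (assumed prior SOTA) vs. 14.1% (assumed this work).
babble_gain = relative_reduction(28.0, 14.1)

# Average WER: 25.8% (assumed audio-only model) vs. 5.8% (assumed audio-visual model).
average_gain = relative_reduction(25.8, 5.8)

# Labeled data: 30 hours used here vs. 433 hours for the prior system (assumed order).
label_fraction = 30 / 433 * 100.0

print(f"Relative WER reduction under babble noise: {babble_gain:.1f}%")
print(f"Relative WER reduction vs. audio-only:     {average_gain:.1f}%")
print(f"Share of labeled data used:                {label_fraction:.1f}%")
```

Under these assumptions the three values come out at roughly 49.6%, 77.5%, and 6.9%, which is consistent with the rounded claims in the abstract.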