Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387158707> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4387158707 abstract "Advancements in deep neural networks have allowed automatic speech recognition (ASR) systems to attain human parity on several publicly available clean speech datasets. However, even state-of-the-art ASR systems experience performance degradation when confronted with adverse conditions, as a well-trained acoustic model is sensitive to variations in the speech domain, e.g., background noise. Intuitively, humans address this issue by relying on their linguistic knowledge: the meaning of ambiguous spoken terms is usually inferred from contextual cues thereby reducing the dependency on the auditory system. Inspired by this observation, we introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction, where N-best decoding hypotheses provide informative elements for true transcription prediction. This approach is a paradigm shift from the traditional language model rescoring strategy that can only select one candidate hypothesis as the output transcription. The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses and corresponding accurate transcriptions across prevalent speech domains. Given this dataset, we examine three types of error correction techniques based on LLMs with varying amounts of labeled hypotheses-transcription pairs, which gains a significant word error rate (WER) reduction. Experimental evidence demonstrates the proposed technique achieves a breakthrough by surpassing the upper bound of traditional re-ranking based methods. More surprisingly, LLM with reasonable prompt and its generative capability can even correct those tokens that are missing in N-best list. We make our results publicly accessible for reproducible pipelines with released pre-trained models, thus providing a new evaluation paradigm for ASR error correction with LLMs." @default.
- W4387158707 created "2023-09-30" @default.
- W4387158707 creator A5013624090 @default.
- W4387158707 creator A5020376803 @default.
- W4387158707 creator A5050344371 @default.
- W4387158707 creator A5063253432 @default.
- W4387158707 creator A5070872826 @default.
- W4387158707 creator A5079659476 @default.
- W4387158707 date "2023-09-27" @default.
- W4387158707 modified "2023-10-14" @default.
- W4387158707 title "HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models" @default.
- W4387158707 doi "https://doi.org/10.48550/arxiv.2309.15701" @default.
- W4387158707 hasPublicationYear "2023" @default.
- W4387158707 type Work @default.
- W4387158707 citedByCount "0" @default.
- W4387158707 crossrefType "posted-content" @default.
- W4387158707 hasAuthorship W4387158707A5013624090 @default.
- W4387158707 hasAuthorship W4387158707A5020376803 @default.
- W4387158707 hasAuthorship W4387158707A5050344371 @default.
- W4387158707 hasAuthorship W4387158707A5063253432 @default.
- W4387158707 hasAuthorship W4387158707A5070872826 @default.
- W4387158707 hasAuthorship W4387158707A5079659476 @default.
- W4387158707 hasBestOaLocation W43871587071 @default.
- W4387158707 hasConcept C13280743 @default.
- W4387158707 hasConcept C137293760 @default.
- W4387158707 hasConcept C138885662 @default.
- W4387158707 hasConcept C154945302 @default.
- W4387158707 hasConcept C167966045 @default.
- W4387158707 hasConcept C179926584 @default.
- W4387158707 hasConcept C185798385 @default.
- W4387158707 hasConcept C204321447 @default.
- W4387158707 hasConcept C205649164 @default.
- W4387158707 hasConcept C23224414 @default.
- W4387158707 hasConcept C2776230583 @default.
- W4387158707 hasConcept C28490314 @default.
- W4387158707 hasConcept C39890363 @default.
- W4387158707 hasConcept C40969351 @default.
- W4387158707 hasConcept C41008148 @default.
- W4387158707 hasConcept C41895202 @default.
- W4387158707 hasConcept C57273362 @default.
- W4387158707 hasConcept C76155785 @default.
- W4387158707 hasConceptScore W4387158707C13280743 @default.
- W4387158707 hasConceptScore W4387158707C137293760 @default.
- W4387158707 hasConceptScore W4387158707C138885662 @default.
- W4387158707 hasConceptScore W4387158707C154945302 @default.
- W4387158707 hasConceptScore W4387158707C167966045 @default.
- W4387158707 hasConceptScore W4387158707C179926584 @default.
- W4387158707 hasConceptScore W4387158707C185798385 @default.
- W4387158707 hasConceptScore W4387158707C204321447 @default.
- W4387158707 hasConceptScore W4387158707C205649164 @default.
- W4387158707 hasConceptScore W4387158707C23224414 @default.
- W4387158707 hasConceptScore W4387158707C2776230583 @default.
- W4387158707 hasConceptScore W4387158707C28490314 @default.
- W4387158707 hasConceptScore W4387158707C39890363 @default.
- W4387158707 hasConceptScore W4387158707C40969351 @default.
- W4387158707 hasConceptScore W4387158707C41008148 @default.
- W4387158707 hasConceptScore W4387158707C41895202 @default.
- W4387158707 hasConceptScore W4387158707C57273362 @default.
- W4387158707 hasConceptScore W4387158707C76155785 @default.
- W4387158707 hasLocation W43871587071 @default.
- W4387158707 hasOpenAccess W4387158707 @default.
- W4387158707 hasPrimaryLocation W43871587071 @default.
- W4387158707 hasRelatedWork W1822699154 @default.
- W4387158707 hasRelatedWork W1966737826 @default.
- W4387158707 hasRelatedWork W2001732961 @default.
- W4387158707 hasRelatedWork W2026408911 @default.
- W4387158707 hasRelatedWork W2143620265 @default.
- W4387158707 hasRelatedWork W2217717732 @default.
- W4387158707 hasRelatedWork W2917344756 @default.
- W4387158707 hasRelatedWork W3011988934 @default.
- W4387158707 hasRelatedWork W4205868073 @default.
- W4387158707 hasRelatedWork W62743518 @default.
- W4387158707 isParatext "false" @default.
- W4387158707 isRetracted "false" @default.
- W4387158707 workType "article" @default.
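
The abstract of W4387158707 contrasts generative error correction with re-ranking, which can only return one of the N-best hypotheses. As an illustration only, the Python sketch below computes word error rate (WER) and the oracle re-ranking choice for a single utterance. It is not the paper's code: the function names (`word_edit_distance`, `wer`, `oracle_rerank`) are hypothetical, and WER is assumed to take its usual form, word-level edit distance divided by the reference length.

```python
from typing import List, Tuple


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over words (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # drop a reference word (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate of one hypothesis against one reference transcript."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def oracle_rerank(reference: str, nbest: List[str]) -> Tuple[str, float]:
    """Return the N-best entry with the lowest WER and that WER.

    This is the best result any method that only selects from the list
    could reach; ties go to the earliest entry.
    """
    scored = ((hyp, wer(reference, hyp)) for hyp in nbest)
    return min(scored, key=lambda pair: pair[1])
```

For a made-up reference "turn the volume down" and the made-up list ["turn the volume town", "turn a volume down", "turn the volume done"], every entry has one word error, so `oracle_rerank` cannot do better than a WER of 0.25. The correct words "the" and "down" nevertheless both occur somewhere in the list. That gives a method that generates a transcript, rather than selecting one, room to go below the selection bound.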