SemOpenAlex

Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387158707> ?p ?o ?g. }
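
The pattern above lists every predicate and object attached to the work. As a rough sketch, the same lookup could be phrased as a SPARQL SELECT query assembled in Python. The helper name `build_work_query` is hypothetical, the graph column (`?g`) is left out for simplicity, and the code only builds the query text without contacting any endpoint.

```python
def build_work_query(work_iri: str) -> str:
    """Return a SPARQL query listing all predicate/object pairs for one work.

    Hypothetical helper for illustration; it only builds the query string.
    """
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in work_iri for ch in '<>"{}|^`\\ '):
        raise ValueError(f"not a safe IRI: {work_iri!r}")
    # Doubled braces are literal braces inside an f-string.
    return f"SELECT ?p ?o WHERE {{ <{work_iri}> ?p ?o . }}"


query = build_work_query("https://semopenalex.org/work/W4387158707")
print(query)
```

Validating the IRI before interpolation keeps a malformed identifier from changing the shape of the query.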

W4387158707 abstract "Advancements in deep neural networks have allowed automatic speech recognition (ASR) systems to attain human parity on several publicly available clean speech datasets. However, even state-of-the-art ASR systems experience performance degradation when confronted with adverse conditions, as a well-trained acoustic model is sensitive to variations in the speech domain, e.g., background noise. Intuitively, humans address this issue by relying on their linguistic knowledge: the meaning of ambiguous spoken terms is usually inferred from contextual cues thereby reducing the dependency on the auditory system. Inspired by this observation, we introduce the first open-source benchmark to utilize external large language models (LLMs) for ASR error correction, where N-best decoding hypotheses provide informative elements for true transcription prediction. This approach is a paradigm shift from the traditional language model rescoring strategy that can only select one candidate hypothesis as the output transcription. The proposed benchmark contains a novel dataset, HyPoradise (HP), encompassing more than 334,000 pairs of N-best hypotheses and corresponding accurate transcriptions across prevalent speech domains. Given this dataset, we examine three types of error correction techniques based on LLMs with varying amounts of labeled hypotheses-transcription pairs, which gains a significant word error rate (WER) reduction. Experimental evidence demonstrates the proposed technique achieves a breakthrough by surpassing the upper bound of traditional re-ranking based methods. More surprisingly, LLM with reasonable prompt and its generative capability can even correct those tokens that are missing in N-best list. We make our results publicly accessible for reproducible pipelines with released pre-trained models, thus providing a new evaluation paradigm for ASR error correction with LLMs." @default.
W4387158707 created "2023-09-30" @default.
W4387158707 creator A5013624090 @default.
W4387158707 creator A5020376803 @default.
W4387158707 creator A5050344371 @default.
W4387158707 creator A5063253432 @default.
W4387158707 creator A5070872826 @default.
W4387158707 creator A5079659476 @default.
W4387158707 date "2023-09-27" @default.
W4387158707 modified "2023-10-14" @default.
W4387158707 title "HyPoradise: An Open Baseline for Generative Speech Recognition with Large Language Models" @default.
W4387158707 doi "https://doi.org/10.48550/arxiv.2309.15701" @default.
W4387158707 hasPublicationYear "2023" @default.
W4387158707 type Work @default.
W4387158707 citedByCount "0" @default.
W4387158707 crossrefType "posted-content" @default.
W4387158707 hasAuthorship W4387158707A5013624090 @default.
W4387158707 hasAuthorship W4387158707A5020376803 @default.
W4387158707 hasAuthorship W4387158707A5050344371 @default.
W4387158707 hasAuthorship W4387158707A5063253432 @default.
W4387158707 hasAuthorship W4387158707A5070872826 @default.
W4387158707 hasAuthorship W4387158707A5079659476 @default.
W4387158707 hasBestOaLocation W43871587071 @default.
W4387158707 hasConcept C13280743 @default.
W4387158707 hasConcept C137293760 @default.
W4387158707 hasConcept C138885662 @default.
W4387158707 hasConcept C154945302 @default.
W4387158707 hasConcept C167966045 @default.
W4387158707 hasConcept C179926584 @default.
W4387158707 hasConcept C185798385 @default.
W4387158707 hasConcept C204321447 @default.
W4387158707 hasConcept C205649164 @default.
W4387158707 hasConcept C23224414 @default.
W4387158707 hasConcept C2776230583 @default.
W4387158707 hasConcept C28490314 @default.
W4387158707 hasConcept C39890363 @default.
W4387158707 hasConcept C40969351 @default.
W4387158707 hasConcept C41008148 @default.
W4387158707 hasConcept C41895202 @default.
W4387158707 hasConcept C57273362 @default.
W4387158707 hasConcept C76155785 @default.
W4387158707 hasConceptScore W4387158707C13280743 @default.
W4387158707 hasConceptScore W4387158707C137293760 @default.
W4387158707 hasConceptScore W4387158707C138885662 @default.
W4387158707 hasConceptScore W4387158707C154945302 @default.
W4387158707 hasConceptScore W4387158707C167966045 @default.
W4387158707 hasConceptScore W4387158707C179926584 @default.
W4387158707 hasConceptScore W4387158707C185798385 @default.
W4387158707 hasConceptScore W4387158707C204321447 @default.
W4387158707 hasConceptScore W4387158707C205649164 @default.
W4387158707 hasConceptScore W4387158707C23224414 @default.
W4387158707 hasConceptScore W4387158707C2776230583 @default.
W4387158707 hasConceptScore W4387158707C28490314 @default.
W4387158707 hasConceptScore W4387158707C39890363 @default.
W4387158707 hasConceptScore W4387158707C40969351 @default.
W4387158707 hasConceptScore W4387158707C41008148 @default.
W4387158707 hasConceptScore W4387158707C41895202 @default.
W4387158707 hasConceptScore W4387158707C57273362 @default.
W4387158707 hasConceptScore W4387158707C76155785 @default.
W4387158707 hasLocation W43871587071 @default.
W4387158707 hasOpenAccess W4387158707 @default.
W4387158707 hasPrimaryLocation W43871587071 @default.
W4387158707 hasRelatedWork W1822699154 @default.
W4387158707 hasRelatedWork W1966737826 @default.
W4387158707 hasRelatedWork W2001732961 @default.
W4387158707 hasRelatedWork W2026408911 @default.
W4387158707 hasRelatedWork W2143620265 @default.
W4387158707 hasRelatedWork W2217717732 @default.
W4387158707 hasRelatedWork W2917344756 @default.
W4387158707 hasRelatedWork W3011988934 @default.
W4387158707 hasRelatedWork W4205868073 @default.
W4387158707 hasRelatedWork W62743518 @default.
W4387158707 isParatext "false" @default.
W4387158707 isRetracted "false" @default.
W4387158707 workType "article" @default.
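
The abstract in this record contrasts re-ranking, which can only select one of the N-best hypotheses, with generative correction, which may produce a transcription that is not in the list. A minimal sketch of why re-ranking has an upper bound is given below: the best a re-ranker can do is pick the hypothesis with the lowest word error rate (WER). The function names and the toy sentences are assumptions made for illustration; they are not taken from the HyPoradise dataset or its released code.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance divided by reference length."""
    ref = reference.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hypothesis.split()) / len(ref)


def oracle_wer(reference: str, nbest: list[str]) -> float:
    """Lowest WER among the N-best hypotheses (the re-ranking bound)."""
    return min(wer(reference, hyp) for hyp in nbest)


# Toy example (made up): every hypothesis contains one wrong word.
reference = "turn on the kitchen light"
nbest = [
    "turn on the kitchen like",
    "turn on a kitchen light",
    "turn on the chicken light",
]
bound = oracle_wer(reference, nbest)
print(bound)
```

In this toy list each hypothesis has a different single error, so no selection strategy can reach the reference, whereas a model that combines information across the hypotheses could in principle output it.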