Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386876109> ?p ?o ?g. }
Showing items 1 to 69 of 69 with 100 items per page.
- W4386876109 abstract "Conventional end-to-end Automatic Speech Recognition (ASR) models primarily focus on exact transcription tasks, lacking flexibility for nuanced user interactions. With the advent of Large Language Models (LLMs) in speech processing, more organic, text-prompt-based interactions have become possible. However, the mechanisms behind these models' speech understanding and reasoning capabilities remain underexplored. To study this question from the data perspective, we introduce instruction-following speech recognition, training a Listen-Attend-Spell model to understand and execute a diverse set of free-form text instructions. This enables a multitude of speech recognition tasks -- ranging from transcript manipulation to summarization -- without relying on predefined command sets. Remarkably, our model, trained from scratch on Librispeech, interprets and executes simple instructions without requiring LLMs or pre-trained speech modules. It also offers selective transcription options based on instructions like 'transcribe first half and then turn off listening', providing an additional layer of privacy and safety compared to existing LLMs. Our findings highlight the significant potential of instruction-following training to advance speech foundation models." @default.
- W4386876109 created "2023-09-20" @default.
- W4386876109 creator A5002729731 @default.
- W4386876109 creator A5010825170 @default.
- W4386876109 creator A5076426606 @default.
- W4386876109 creator A5089287882 @default.
- W4386876109 date "2023-09-18" @default.
- W4386876109 modified "2023-09-27" @default.
- W4386876109 title "Instruction-Following Speech Recognition" @default.
- W4386876109 doi "https://doi.org/10.48550/arxiv.2309.09843" @default.
- W4386876109 hasPublicationYear "2023" @default.
- W4386876109 type Work @default.
- W4386876109 citedByCount "0" @default.
- W4386876109 crossrefType "posted-content" @default.
- W4386876109 hasAuthorship W4386876109A5002729731 @default.
- W4386876109 hasAuthorship W4386876109A5010825170 @default.
- W4386876109 hasAuthorship W4386876109A5076426606 @default.
- W4386876109 hasAuthorship W4386876109A5089287882 @default.
- W4386876109 hasBestOaLocation W43868761091 @default.
- W4386876109 hasConcept C111472728 @default.
- W4386876109 hasConcept C12713177 @default.
- W4386876109 hasConcept C137293760 @default.
- W4386876109 hasConcept C138885662 @default.
- W4386876109 hasConcept C154945302 @default.
- W4386876109 hasConcept C15744967 @default.
- W4386876109 hasConcept C170858558 @default.
- W4386876109 hasConcept C177264268 @default.
- W4386876109 hasConcept C177291462 @default.
- W4386876109 hasConcept C179926584 @default.
- W4386876109 hasConcept C199360897 @default.
- W4386876109 hasConcept C204321447 @default.
- W4386876109 hasConcept C2780565519 @default.
- W4386876109 hasConcept C28490314 @default.
- W4386876109 hasConcept C41008148 @default.
- W4386876109 hasConcept C41895202 @default.
- W4386876109 hasConcept C46312422 @default.
- W4386876109 hasConceptScore W4386876109C111472728 @default.
- W4386876109 hasConceptScore W4386876109C12713177 @default.
- W4386876109 hasConceptScore W4386876109C137293760 @default.
- W4386876109 hasConceptScore W4386876109C138885662 @default.
- W4386876109 hasConceptScore W4386876109C154945302 @default.
- W4386876109 hasConceptScore W4386876109C15744967 @default.
- W4386876109 hasConceptScore W4386876109C170858558 @default.
- W4386876109 hasConceptScore W4386876109C177264268 @default.
- W4386876109 hasConceptScore W4386876109C177291462 @default.
- W4386876109 hasConceptScore W4386876109C179926584 @default.
- W4386876109 hasConceptScore W4386876109C199360897 @default.
- W4386876109 hasConceptScore W4386876109C204321447 @default.
- W4386876109 hasConceptScore W4386876109C2780565519 @default.
- W4386876109 hasConceptScore W4386876109C28490314 @default.
- W4386876109 hasConceptScore W4386876109C41008148 @default.
- W4386876109 hasConceptScore W4386876109C41895202 @default.
- W4386876109 hasConceptScore W4386876109C46312422 @default.
- W4386876109 hasLocation W43868761091 @default.
- W4386876109 hasOpenAccess W4386876109 @default.
- W4386876109 hasPrimaryLocation W43868761091 @default.
- W4386876109 hasRelatedWork W2359001871 @default.
- W4386876109 hasRelatedWork W3103417625 @default.
- W4386876109 hasRelatedWork W3207693618 @default.
- W4386876109 hasRelatedWork W3212242574 @default.
- W4386876109 hasRelatedWork W4226126415 @default.
- W4386876109 hasRelatedWork W4286233511 @default.
- W4386876109 hasRelatedWork W4287887250 @default.
- W4386876109 hasRelatedWork W4316012698 @default.
- W4386876109 hasRelatedWork W4361193143 @default.
- W4386876109 hasRelatedWork W4362598702 @default.
- W4386876109 isParatext "false" @default.
- W4386876109 isRetracted "false" @default.
- W4386876109 workType "article" @default.
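
The items above all follow one visible shape: a leading dash, a subject, a predicate, an object, and a graph label such as `@default.`. As a minimal sketch, assuming that shape holds for every item, the listing could be read into tuples as follows. The names `Quad` and `parse_item` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One listing item (hypothetical helper type, not a SemOpenAlex API)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed item shape, inferred only from the listing itself:
#   - SUBJECT PREDICATE OBJECT @GRAPH.
_ITEM = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\S+?)\.$', re.DOTALL)


def parse_item(line: str) -> Optional[Quad]:
    """Parse one listing line; return None for lines that are not items."""
    match = _ITEM.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    obj = obj.strip()
    # Literal values are wrapped in double quotes; identifiers are bare.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return Quad(subject, predicate, obj, graph)


if __name__ == "__main__":
    sample = [
        '- W4386876109 created "2023-09-20" @default.',
        '- W4386876109 hasRelatedWork W2359001871 @default.',
    ]
    for raw in sample:
        print(parse_item(raw))
```

Header lines such as the pagination summary do not match the pattern and come back as `None`, so they can be skipped when the whole page is read line by line.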