Matches in SemOpenAlex for { <https://semopenalex.org/work/W2949779614> ?p ?o ?g. }
Showing items 1 to 73 of 73 with 100 items per page.
- W2949779614 abstract "In today's society, speech recognition systems have reached a mass audience, especially in the field of personal assistants such as Amazon Alexa or Google Home. Yet this does not mean that speech recognition has been solved. On the contrary, for many domains, tasks, and languages such systems do not exist. Subword-based automatic speech recognition has been studied in the past for many reasons, often to overcome limitations on the size of the vocabulary. Specifically for agglutinative languages, where new words can be created on the fly, handling these limitations is possible using a subword-based automatic speech recognition (ASR) system. Over time, however, subword-based systems lost some popularity as system resources increased and word-based models with large vocabularies became possible. Still, subword-based models in modern ASR systems can predict words that have never been seen before and make better use of the available language modeling resources. Furthermore, subword models have smaller vocabularies, which makes neural network language models (NNLMs) easier to train and use. Hence, in this thesis, we study subword models for ASR and make two major contributions. First, this thesis reintroduces subword-based modeling in a modern framework based on weighted finite-state transducers (WFSTs) and describes the necessary tools for making a sound and effective system. It does this through careful modification of the lexicon FST part of a WFST-based recognizer. Second, extensive experiments are done using subwords, with different types of language models including n-gram models and NNLMs. These experiments are performed on six different languages, setting new best-published results on these datasets. Overall, we show that subword-based models can outperform word-based models in terms of ASR performance for many different types of languages.
This thesis also details design choices needed when building modern subword ASR systems, including the choice of segmentation algorithm, vocabulary size, and subword marking style. In addition, it includes techniques to combine speech recognition models trained on different units through system combination. Lastly, it evaluates the use of the smallest possible subword unit, characters, and shows that these models can be smaller and yet competitive with word-based models." @default.
- W2949779614 created "2019-06-27" @default.
- W2949779614 creator A5074707861 @default.
- W2949779614 date "2019-01-01" @default.
- W2949779614 modified "2023-10-18" @default.
- W2949779614 title "Modern subword-based models for automatic speech recognition" @default.
- W2949779614 cites W2070737455 @default.
- W2949779614 cites W2479014774 @default.
- W2949779614 cites W2963226322 @default.
- W2949779614 hasPublicationYear "2019" @default.
- W2949779614 type Work @default.
- W2949779614 sameAs 2949779614 @default.
- W2949779614 citedByCount "1" @default.
- W2949779614 countsByYear W29497796142020 @default.
- W2949779614 crossrefType "journal-article" @default.
- W2949779614 hasAuthorship W2949779614A5074707861 @default.
- W2949779614 hasConcept C137293760 @default.
- W2949779614 hasConcept C138885662 @default.
- W2949779614 hasConcept C154945302 @default.
- W2949779614 hasConcept C186644900 @default.
- W2949779614 hasConcept C202444582 @default.
- W2949779614 hasConcept C204321447 @default.
- W2949779614 hasConcept C2777601683 @default.
- W2949779614 hasConcept C2778121359 @default.
- W2949779614 hasConcept C28490314 @default.
- W2949779614 hasConcept C33923547 @default.
- W2949779614 hasConcept C41008148 @default.
- W2949779614 hasConcept C41895202 @default.
- W2949779614 hasConcept C80875076 @default.
- W2949779614 hasConcept C90805587 @default.
- W2949779614 hasConcept C9652623 @default.
- W2949779614 hasConceptScore W2949779614C137293760 @default.
- W2949779614 hasConceptScore W2949779614C138885662 @default.
- W2949779614 hasConceptScore W2949779614C154945302 @default.
- W2949779614 hasConceptScore W2949779614C186644900 @default.
- W2949779614 hasConceptScore W2949779614C202444582 @default.
- W2949779614 hasConceptScore W2949779614C204321447 @default.
- W2949779614 hasConceptScore W2949779614C2777601683 @default.
- W2949779614 hasConceptScore W2949779614C2778121359 @default.
- W2949779614 hasConceptScore W2949779614C28490314 @default.
- W2949779614 hasConceptScore W2949779614C33923547 @default.
- W2949779614 hasConceptScore W2949779614C41008148 @default.
- W2949779614 hasConceptScore W2949779614C41895202 @default.
- W2949779614 hasConceptScore W2949779614C80875076 @default.
- W2949779614 hasConceptScore W2949779614C90805587 @default.
- W2949779614 hasConceptScore W2949779614C9652623 @default.
- W2949779614 hasLocation W29497796141 @default.
- W2949779614 hasOpenAccess W2949779614 @default.
- W2949779614 hasPrimaryLocation W29497796141 @default.
- W2949779614 hasRelatedWork W1501139663 @default.
- W2949779614 hasRelatedWork W1553541619 @default.
- W2949779614 hasRelatedWork W2077450519 @default.
- W2949779614 hasRelatedWork W2187526818 @default.
- W2949779614 hasRelatedWork W2299662909 @default.
- W2949779614 hasRelatedWork W2614744999 @default.
- W2949779614 hasRelatedWork W2735949438 @default.
- W2949779614 hasRelatedWork W2763690830 @default.
- W2949779614 hasRelatedWork W2791647162 @default.
- W2949779614 hasRelatedWork W2914864687 @default.
- W2949779614 hasRelatedWork W3003127021 @default.
- W2949779614 hasRelatedWork W3043056204 @default.
- W2949779614 hasRelatedWork W3045592404 @default.
- W2949779614 hasRelatedWork W3082735059 @default.
- W2949779614 hasRelatedWork W3089901025 @default.
- W2949779614 hasRelatedWork W3094871513 @default.
- W2949779614 hasRelatedWork W3105242324 @default.
- W2949779614 hasRelatedWork W3119042470 @default.
- W2949779614 hasRelatedWork W3161539047 @default.
- W2949779614 hasRelatedWork W4419257 @default.
- W2949779614 isParatext "false" @default.
- W2949779614 isRetracted "false" @default.
- W2949779614 magId "2949779614" @default.
- W2949779614 workType "article" @default.