Matches in SemOpenAlex for { <https://semopenalex.org/work/W3048704486> ?p ?o ?g. }
Showing items 1 to 96 of 96 with 100 items per page.
- W3048704486 abstract "Transformer has achieved competitive performance against state-of-the-art end-to-end models in automatic speech recognition (ASR), and requires significantly less training time than RNN-based models. The original Transformer, with encoder-decoder architecture, is only suitable for offline ASR. It relies on an attention mechanism to learn alignments, and encodes input audio bidirectionally. The high computation cost of Transformer decoding also limits its use in production streaming systems. To make Transformer suitable for streaming ASR, we explore Transducer framework as a streamable way to learn alignments. For audio encoding, we apply unidirectional Transformer with interleaved convolution layers. The interleaved convolution layers are used for modeling future context which is important to performance. To reduce computation cost, we gradually downsample acoustic input, also with the interleaved convolution layers. Moreover, we limit the length of history context in self-attention to maintain constant computation cost for each decoding step. We show that this architecture, named Conv-Transformer Transducer, achieves competitive performance on LibriSpeech dataset (3.6% WER on test-clean) without external language models. The performance is comparable to previously published streamable Transformer Transducer and strong hybrid streaming ASR systems, and is achieved with smaller look-ahead window (140 ms), fewer parameters and lower frame rate." @default.
- W3048704486 created "2020-08-18" @default.
- W3048704486 creator A5023202964 @default.
- W3048704486 creator A5032912499 @default.
- W3048704486 creator A5034654778 @default.
- W3048704486 creator A5083831776 @default.
- W3048704486 date "2020-08-13" @default.
- W3048704486 modified "2023-09-27" @default.
- W3048704486 title "Conv-Transformer Transducer: Low Latency, Low Frame Rate, Streamable End-to-End Speech Recognition" @default.
- W3048704486 cites W1494198834 @default.
- W3048704486 cites W1524333225 @default.
- W3048704486 cites W1828163288 @default.
- W3048704486 cites W2127141656 @default.
- W3048704486 cites W2407080277 @default.
- W3048704486 cites W2514741789 @default.
- W3048704486 cites W2515439472 @default.
- W3048704486 cites W2729190387 @default.
- W3048704486 cites W2787214294 @default.
- W3048704486 cites W2892009249 @default.
- W3048704486 cites W2936123380 @default.
- W3048704486 cites W2945697643 @default.
- W3048704486 cites W2962760690 @default.
- W3048704486 cites W2963088785 @default.
- W3048704486 cites W2963250244 @default.
- W3048704486 cites W2963341956 @default.
- W3048704486 cites W2963403868 @default.
- W3048704486 cites W2963925437 @default.
- W3048704486 cites W2964110616 @default.
- W3048704486 cites W2964272710 @default.
- W3048704486 cites W2964308564 @default.
- W3048704486 cites W2972389417 @default.
- W3048704486 cites W2972818416 @default.
- W3048704486 cites W2976556660 @default.
- W3048704486 cites W2981857663 @default.
- W3048704486 cites W2982413405 @default.
- W3048704486 cites W3008191852 @default.
- W3048704486 cites W3015194534 @default.
- W3048704486 cites W3015995734 @default.
- W3048704486 cites W3016010032 @default.
- W3048704486 doi "https://doi.org/10.48550/arxiv.2008.05750" @default.
- W3048704486 hasPublicationYear "2020" @default.
- W3048704486 type Work @default.
- W3048704486 sameAs 3048704486 @default.
- W3048704486 citedByCount "1" @default.
- W3048704486 countsByYear W30487044862021 @default.
- W3048704486 crossrefType "posted-content" @default.
- W3048704486 hasAuthorship W3048704486A5023202964 @default.
- W3048704486 hasAuthorship W3048704486A5032912499 @default.
- W3048704486 hasAuthorship W3048704486A5034654778 @default.
- W3048704486 hasAuthorship W3048704486A5083831776 @default.
- W3048704486 hasBestOaLocation W30487044861 @default.
- W3048704486 hasConcept C111919701 @default.
- W3048704486 hasConcept C11413529 @default.
- W3048704486 hasConcept C118505674 @default.
- W3048704486 hasConcept C119599485 @default.
- W3048704486 hasConcept C127413603 @default.
- W3048704486 hasConcept C154945302 @default.
- W3048704486 hasConcept C165801399 @default.
- W3048704486 hasConcept C28490314 @default.
- W3048704486 hasConcept C3261483 @default.
- W3048704486 hasConcept C41008148 @default.
- W3048704486 hasConcept C45374587 @default.
- W3048704486 hasConcept C56318395 @default.
- W3048704486 hasConcept C57273362 @default.
- W3048704486 hasConcept C66322947 @default.
- W3048704486 hasConceptScore W3048704486C111919701 @default.
- W3048704486 hasConceptScore W3048704486C11413529 @default.
- W3048704486 hasConceptScore W3048704486C118505674 @default.
- W3048704486 hasConceptScore W3048704486C119599485 @default.
- W3048704486 hasConceptScore W3048704486C127413603 @default.
- W3048704486 hasConceptScore W3048704486C154945302 @default.
- W3048704486 hasConceptScore W3048704486C165801399 @default.
- W3048704486 hasConceptScore W3048704486C28490314 @default.
- W3048704486 hasConceptScore W3048704486C3261483 @default.
- W3048704486 hasConceptScore W3048704486C41008148 @default.
- W3048704486 hasConceptScore W3048704486C45374587 @default.
- W3048704486 hasConceptScore W3048704486C56318395 @default.
- W3048704486 hasConceptScore W3048704486C57273362 @default.
- W3048704486 hasConceptScore W3048704486C66322947 @default.
- W3048704486 hasLocation W30487044861 @default.
- W3048704486 hasOpenAccess W3048704486 @default.
- W3048704486 hasPrimaryLocation W30487044861 @default.
- W3048704486 hasRelatedWork W1950712214 @default.
- W3048704486 hasRelatedWork W2892009249 @default.
- W3048704486 hasRelatedWork W2992696780 @default.
- W3048704486 hasRelatedWork W3015671919 @default.
- W3048704486 hasRelatedWork W3156915121 @default.
- W3048704486 hasRelatedWork W3197898596 @default.
- W3048704486 hasRelatedWork W3198654230 @default.
- W3048704486 hasRelatedWork W4281621826 @default.
- W3048704486 hasRelatedWork W4287186213 @default.
- W3048704486 hasRelatedWork W4312120773 @default.
- W3048704486 isParatext "false" @default.
- W3048704486 isRetracted "false" @default.
- W3048704486 magId "3048704486" @default.
- W3048704486 workType "article" @default.