Matches in SemOpenAlex for { <https://semopenalex.org/work/W2943845043> ?p ?o ?g. }
- W2943845043 abstract "We explore deep autoregressive Transformer models in language modeling for speech recognition. We focus on two aspects. First, we revisit Transformer model configurations specifically for language modeling. We show that well-configured Transformer models outperform our baseline models based on a shallow stack of LSTM recurrent neural network layers. We carry out experiments on the open-source LibriSpeech 960hr task, for both 200K vocabulary word-level and 10K byte-pair encoding subword-level language modeling. We apply our word-level models to conventional hybrid speech recognition by lattice rescoring, and the subword-level models to attention-based encoder-decoder models by shallow fusion. Second, we show that deep Transformer language models do not require positional encoding. The positional encoding is an essential augmentation for the self-attention mechanism, which is invariant to sequence ordering. However, in an autoregressive setup, as is the case for language modeling, the amount of information increases along the position dimension, which is a positional signal on its own. An analysis of attention weights shows that deep autoregressive self-attention models can automatically make use of such positional information. We find that removing the positional encoding even slightly improves the performance of these models." @default.
- W2943845043 created "2019-05-16" @default.
- W2943845043 creator A5002810304 @default.
- W2943845043 creator A5059521407 @default.
- W2943845043 creator A5087367411 @default.
- W2943845043 creator A5088968292 @default.
- W2943845043 date "2019-09-15" @default.
- W2943845043 modified "2023-09-29" @default.
- W2943845043 title "Language Modeling with Deep Transformers" @default.
- W2943845043 cites W1494198834 @default.
- W2943845043 cites W1665214252 @default.
- W2943845043 cites W1689711448 @default.
- W2943845043 cites W1915251500 @default.
- W2943845043 cites W2064675550 @default.
- W2943845043 cites W2100664567 @default.
- W2943845043 cites W2194775991 @default.
- W2943845043 cites W2302255633 @default.
- W2943845043 cites W2327501763 @default.
- W2943845043 cites W2402144811 @default.
- W2943845043 cites W2402268235 @default.
- W2943845043 cites W2404974730 @default.
- W2943845043 cites W2413794162 @default.
- W2943845043 cites W2462831000 @default.
- W2943845043 cites W2471933213 @default.
- W2943845043 cites W2626778328 @default.
- W2943845043 cites W2795138957 @default.
- W2943845043 cites W2799800213 @default.
- W2943845043 cites W2799923439 @default.
- W2943845043 cites W2899663614 @default.
- W2943845043 cites W2904818793 @default.
- W2943845043 cites W2911291251 @default.
- W2943845043 cites W2912492482 @default.
- W2943845043 cites W2926063217 @default.
- W2943845043 cites W2936774411 @default.
- W2943845043 cites W2962784628 @default.
- W2943845043 cites W2963045354 @default.
- W2943845043 cites W2963088785 @default.
- W2943845043 cites W2963341956 @default.
- W2943845043 cites W2963362078 @default.
- W2943845043 cites W2963382687 @default.
- W2943845043 cites W2963386218 @default.
- W2943845043 cites W2963537482 @default.
- W2943845043 cites W2963631907 @default.
- W2943845043 cites W2963925437 @default.
- W2943845043 cites W2963970792 @default.
- W2943845043 cites W2964045208 @default.
- W2943845043 cites W2964110616 @default.
- W2943845043 cites W2964265128 @default.
- W2943845043 cites W3103005696 @default.
- W2943845043 cites W2890012642 @default.
- W2943845043 doi "https://doi.org/10.21437/interspeech.2019-2225" @default.
- W2943845043 hasPublicationYear "2019" @default.
- W2943845043 type Work @default.
- W2943845043 sameAs 2943845043 @default.
- W2943845043 citedByCount "118" @default.
- W2943845043 countsByYear W29438450432019 @default.
- W2943845043 countsByYear W29438450432020 @default.
- W2943845043 countsByYear W29438450432021 @default.
- W2943845043 countsByYear W29438450432022 @default.
- W2943845043 countsByYear W29438450432023 @default.
- W2943845043 crossrefType "proceedings-article" @default.
- W2943845043 hasAuthorship W2943845043A5002810304 @default.
- W2943845043 hasAuthorship W2943845043A5059521407 @default.
- W2943845043 hasAuthorship W2943845043A5087367411 @default.
- W2943845043 hasAuthorship W2943845043A5088968292 @default.
- W2943845043 hasBestOaLocation W29438450432 @default.
- W2943845043 hasConcept C111919701 @default.
- W2943845043 hasConcept C118505674 @default.
- W2943845043 hasConcept C121332964 @default.
- W2943845043 hasConcept C137293760 @default.
- W2943845043 hasConcept C138885662 @default.
- W2943845043 hasConcept C147168706 @default.
- W2943845043 hasConcept C149782125 @default.
- W2943845043 hasConcept C154945302 @default.
- W2943845043 hasConcept C159877910 @default.
- W2943845043 hasConcept C162324750 @default.
- W2943845043 hasConcept C165801399 @default.
- W2943845043 hasConcept C204321447 @default.
- W2943845043 hasConcept C2777601683 @default.
- W2943845043 hasConcept C28490314 @default.
- W2943845043 hasConcept C41008148 @default.
- W2943845043 hasConcept C41895202 @default.
- W2943845043 hasConcept C50644808 @default.
- W2943845043 hasConcept C62520636 @default.
- W2943845043 hasConcept C66322947 @default.
- W2943845043 hasConceptScore W2943845043C111919701 @default.
- W2943845043 hasConceptScore W2943845043C118505674 @default.
- W2943845043 hasConceptScore W2943845043C121332964 @default.
- W2943845043 hasConceptScore W2943845043C137293760 @default.
- W2943845043 hasConceptScore W2943845043C138885662 @default.
- W2943845043 hasConceptScore W2943845043C147168706 @default.
- W2943845043 hasConceptScore W2943845043C149782125 @default.
- W2943845043 hasConceptScore W2943845043C154945302 @default.
- W2943845043 hasConceptScore W2943845043C159877910 @default.
- W2943845043 hasConceptScore W2943845043C162324750 @default.
- W2943845043 hasConceptScore W2943845043C165801399 @default.
- W2943845043 hasConceptScore W2943845043C204321447 @default.
- W2943845043 hasConceptScore W2943845043C2777601683 @default.
- W2943845043 hasConceptScore W2943845043C28490314 @default.
- W2943845043 hasConceptScore W2943845043C41008148 @default.