Matches in SemOpenAlex for { <https://semopenalex.org/work/W3217016993> ?p ?o ?g. }
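The rows listed below can be reproduced programmatically. A minimal sketch follows, assuming SemOpenAlex exposes a standard SPARQL-protocol endpoint at https://semopenalex.org/sparql (endpoint URL and result handling are illustrative, not taken from this dump):

```python
import requests

# Triple pattern from the match header above: all predicates/objects of the work.
QUERY = """
SELECT ?p ?o WHERE {
  <https://semopenalex.org/work/W3217016993> ?p ?o .
}
"""

resp = requests.get(
    "https://semopenalex.org/sparql",          # assumed public endpoint
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["p"]["value"], row["o"]["value"])
```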
- W3217016993 endingPage "3693" @default.
- W3217016993 startingPage "3679" @default.
- W3217016993 abstract "State-of-the-art language models (LMs), represented by long short-term memory recurrent neural networks (LSTM-RNNs) and Transformers, are becoming increasingly complex and expensive for practical applications. Low-bit neural network quantization provides a powerful solution to dramatically reduce their model size. Current quantization methods are based on uniform precision and fail to account for the varying performance sensitivity of different parts of LMs to quantization errors. To this end, novel mixed precision neural network LM quantization methods are proposed in this paper. The optimal local precision choices for LSTM-RNN and Transformer based neural LMs are automatically learned using three techniques. The first two approaches are based on quantization sensitivity metrics in the form of either the KL-divergence measured between full precision and quantized LMs, or Hessian trace weighted quantization perturbation that can be approximated efficiently using matrix-free techniques. The third approach is based on mixed precision neural architecture search. To overcome the difficulty of directly estimating discrete quantized weights using gradient descent methods, the alternating direction method of multipliers (ADMM) is used to efficiently train quantized LMs. Experiments were conducted on state-of-the-art LF-MMI CNN-TDNN systems featuring speed perturbation, i-Vector and learning hidden unit contribution (LHUC) based speaker adaptation on two tasks: Switchboard telephone speech and AMI meeting transcription. The proposed mixed precision quantization techniques achieved lossless quantization on both tasks, producing model size compression ratios of up to approximately 16 times over the full precision LSTM and Transformer baseline LMs while incurring no statistically significant word error rate increase." @default. (code sketches of the matrix-free Hessian-trace metric and the ADMM update mentioned here follow the listing below)
- W3217016993 created "2021-12-06" @default.
- W3217016993 creator A5004643540 @default.
- W3217016993 creator A5011790590 @default.
- W3217016993 creator A5019458385 @default.
- W3217016993 creator A5037109470 @default.
- W3217016993 creator A5045355404 @default.
- W3217016993 date "2021-01-01" @default.
- W3217016993 modified "2023-10-14" @default.
- W3217016993 title "Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition" @default.
- W3217016993 cites W1495076553 @default.
- W3217016993 cites W1735150698 @default.
- W3217016993 cites W179875071 @default.
- W3217016993 cites W1903115690 @default.
- W3217016993 cites W1970689298 @default.
- W3217016993 cites W1985258458 @default.
- W3217016993 cites W1996901117 @default.
- W3217016993 cites W2057450058 @default.
- W3217016993 cites W2058641082 @default.
- W3217016993 cites W2064675550 @default.
- W3217016993 cites W2076094076 @default.
- W3217016993 cites W2094147890 @default.
- W3217016993 cites W2111935653 @default.
- W3217016993 cites W2134237567 @default.
- W3217016993 cites W2158195707 @default.
- W3217016993 cites W2166637769 @default.
- W3217016993 cites W2183341477 @default.
- W3217016993 cites W2194775991 @default.
- W3217016993 cites W2267186426 @default.
- W3217016993 cites W2327501763 @default.
- W3217016993 cites W2408021097 @default.
- W3217016993 cites W2413794162 @default.
- W3217016993 cites W2496955520 @default.
- W3217016993 cites W2508418541 @default.
- W3217016993 cites W2514741789 @default.
- W3217016993 cites W2796265726 @default.
- W3217016993 cites W2803431233 @default.
- W3217016993 cites W2943845043 @default.
- W3217016993 cites W2952265507 @default.
- W3217016993 cites W2952613254 @default.
- W3217016993 cites W2962760690 @default.
- W3217016993 cites W2963125010 @default.
- W3217016993 cites W2963821229 @default.
- W3217016993 cites W2972736078 @default.
- W3217016993 cites W2972818416 @default.
- W3217016993 cites W2973051376 @default.
- W3217016993 cites W2981857663 @default.
- W3217016993 cites W2982041622 @default.
- W3217016993 cites W2982479999 @default.
- W3217016993 cites W3008011454 @default.
- W3217016993 cites W3008191852 @default.
- W3217016993 cites W3015484572 @default.
- W3217016993 cites W3015720739 @default.
- W3217016993 cites W3016010032 @default.
- W3217016993 cites W3016230677 @default.
- W3217016993 cites W3034309359 @default.
- W3217016993 cites W3034887213 @default.
- W3217016993 cites W3048889251 @default.
- W3217016993 cites W3094852745 @default.
- W3217016993 cites W3095311338 @default.
- W3217016993 cites W3095714920 @default.
- W3217016993 cites W3113244915 @default.
- W3217016993 cites W3163368926 @default.
- W3217016993 cites W3196364802 @default.
- W3217016993 doi "https://doi.org/10.1109/taslp.2021.3129357" @default.
- W3217016993 hasPublicationYear "2021" @default.
- W3217016993 type Work @default.
- W3217016993 sameAs 3217016993 @default.
- W3217016993 citedByCount "3" @default.
- W3217016993 countsByYear W32170169932022 @default.
- W3217016993 countsByYear W32170169932023 @default.
- W3217016993 crossrefType "journal-article" @default.
- W3217016993 hasAuthorship W3217016993A5004643540 @default.
- W3217016993 hasAuthorship W3217016993A5011790590 @default.
- W3217016993 hasAuthorship W3217016993A5019458385 @default.
- W3217016993 hasAuthorship W3217016993A5037109470 @default.
- W3217016993 hasAuthorship W3217016993A5045355404 @default.
- W3217016993 hasBestOaLocation W32170169932 @default.
- W3217016993 hasConcept C11413529 @default.
- W3217016993 hasConcept C119599485 @default.
- W3217016993 hasConcept C127413603 @default.
- W3217016993 hasConcept C137293760 @default.
- W3217016993 hasConcept C147168706 @default.
- W3217016993 hasConcept C154945302 @default.
- W3217016993 hasConcept C165801399 @default.
- W3217016993 hasConcept C28490314 @default.
- W3217016993 hasConcept C28855332 @default.
- W3217016993 hasConcept C41008148 @default.
- W3217016993 hasConcept C50644808 @default.
- W3217016993 hasConcept C66322947 @default.
- W3217016993 hasConceptScore W3217016993C11413529 @default.
- W3217016993 hasConceptScore W3217016993C119599485 @default.
- W3217016993 hasConceptScore W3217016993C127413603 @default.
- W3217016993 hasConceptScore W3217016993C137293760 @default.
- W3217016993 hasConceptScore W3217016993C147168706 @default.
- W3217016993 hasConceptScore W3217016993C154945302 @default.
- W3217016993 hasConceptScore W3217016993C165801399 @default.
- W3217016993 hasConceptScore W3217016993C28490314 @default.
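
The abstract above mentions two concrete algorithmic ingredients. First, a Hessian-trace-based quantization sensitivity metric "approximated efficiently using matrix free techniques": the standard matrix-free tool for this is Hutchinson's estimator, tr(H) ≈ E[vᵀHv] with random ±1 probe vectors. A minimal sketch follows; it is not the paper's implementation, and `n_samples` and the parameter grouping are illustrative assumptions:

```python
import torch

def hessian_trace(loss: torch.Tensor, params, n_samples: int = 16) -> float:
    """Matrix-free Hutchinson estimate of the Hessian trace of a scalar loss.

    loss:   scalar tensor built from ops that support double backward
    params: list of parameter tensors (requires_grad=True) to trace over
    """
    grads = torch.autograd.grad(loss, params, create_graph=True)
    est = 0.0
    for _ in range(n_samples):
        # Rademacher probe vectors with entries in {-1, +1}.
        vs = [torch.randint_like(p, 2) * 2 - 1 for p in params]
        # Hessian-vector products Hv via a second backward pass (no explicit H).
        hvs = torch.autograd.grad(grads, params, grad_outputs=vs, retain_graph=True)
        est += sum((v * hv).sum().item() for v, hv in zip(vs, hvs))
    return est / n_samples
```

Calling this once per layer (passing each layer's parameters separately) yields per-layer trace estimates that can rank layers by quantization sensitivity, in the spirit of the Hessian-trace-weighted perturbation metric the abstract describes.

Second, ADMM training of quantized weights: the loss is minimized subject to the weights lying on a discrete grid, alternating a gradient-based update of the full-precision weights, a projection onto the quantization grid, and a dual update. Below is a minimal sketch for a single weight tensor, under assumed hyperparameters (`rho`, `lr`, `inner_steps`) and a hypothetical `loss_fn`; it illustrates the ADMM structure, not the authors' exact training recipe:

```python
import torch

def quantize(w: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Project onto a symmetric uniform n-bit grid scaled to the tensor's range."""
    levels = 2 ** (n_bits - 1) - 1
    scale = w.abs().max().clamp_min(1e-8) / levels
    return torch.round(w / scale).clamp(-levels, levels) * scale

def admm_round(loss_fn, w, q, u, rho=1e-3, lr=1e-2, inner_steps=10, n_bits=4):
    """One ADMM round for one weight tensor.

    loss_fn: callable mapping the full-precision leaf tensor w to a scalar loss
    q, u:    quantized variable and scaled dual variable, same shape as w
    """
    opt = torch.optim.SGD([w], lr=lr)
    # w-update: minimise loss plus the augmented-Lagrangian proximity penalty.
    for _ in range(inner_steps):
        opt.zero_grad()
        penalty = 0.5 * rho * (w - q + u).pow(2).sum()
        (loss_fn(w) + penalty).backward()
        opt.step()
    with torch.no_grad():
        q = quantize(w + u, n_bits)  # q-update: projection onto the grid
        u = u + w - q                # dual update
    return q, u
```

Repeating `admm_round` drives `w` toward the quantized point `q`, which sidesteps the non-differentiability of the rounding step that makes direct gradient descent on discrete weights difficult.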