Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385848788> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4385848788 abstract "Generative Language Models (GLMs) have shown impressive performance in tasks such as text generation, understanding, and reasoning. However, the large model size poses challenges for practical deployment. To solve this problem, Quantization-Aware Training (QAT) has become increasingly popular. However, current QAT methods for generative models have resulted in a noticeable loss of accuracy. To counteract this issue, we propose a novel knowledge distillation method specifically designed for GLMs. Our method, called token-scaled logit distillation, prevents overfitting and provides superior learning from the teacher model and ground truth. This research marks the first evaluation of ternary weight quantization-aware training of large-scale GLMs with less than 1.0 degradation in perplexity and no loss of accuracy in a reasoning task." @default.
- W4385848788 created "2023-08-16" @default.
- W4385848788 creator A5024679686 @default.
- W4385848788 creator A5026330969 @default.
- W4385848788 creator A5028070594 @default.
- W4385848788 creator A5036363860 @default.
- W4385848788 creator A5043250246 @default.
- W4385848788 creator A5066359770 @default.
- W4385848788 creator A5089720016 @default.
- W4385848788 date "2023-08-13" @default.
- W4385848788 modified "2023-10-18" @default.
- W4385848788 title "Token-Scaled Logit Distillation for Ternary Weight Generative Language Models" @default.
- W4385848788 doi "https://doi.org/10.48550/arxiv.2308.06744" @default.
- W4385848788 hasPublicationYear "2023" @default.
- W4385848788 type Work @default.
- W4385848788 citedByCount "0" @default.
- W4385848788 crossrefType "posted-content" @default.
- W4385848788 hasAuthorship W4385848788A5024679686 @default.
- W4385848788 hasAuthorship W4385848788A5026330969 @default.
- W4385848788 hasAuthorship W4385848788A5028070594 @default.
- W4385848788 hasAuthorship W4385848788A5036363860 @default.
- W4385848788 hasAuthorship W4385848788A5043250246 @default.
- W4385848788 hasAuthorship W4385848788A5066359770 @default.
- W4385848788 hasAuthorship W4385848788A5089720016 @default.
- W4385848788 hasBestOaLocation W43858487881 @default.
- W4385848788 hasConcept C100279451 @default.
- W4385848788 hasConcept C11413529 @default.
- W4385848788 hasConcept C119857082 @default.
- W4385848788 hasConcept C137293760 @default.
- W4385848788 hasConcept C140331021 @default.
- W4385848788 hasConcept C154945302 @default.
- W4385848788 hasConcept C167966045 @default.
- W4385848788 hasConcept C199360897 @default.
- W4385848788 hasConcept C22019652 @default.
- W4385848788 hasConcept C2776214188 @default.
- W4385848788 hasConcept C28855332 @default.
- W4385848788 hasConcept C38652104 @default.
- W4385848788 hasConcept C39890363 @default.
- W4385848788 hasConcept C41008148 @default.
- W4385848788 hasConcept C48145219 @default.
- W4385848788 hasConcept C50644808 @default.
- W4385848788 hasConcept C64452783 @default.
- W4385848788 hasConceptScore W4385848788C100279451 @default.
- W4385848788 hasConceptScore W4385848788C11413529 @default.
- W4385848788 hasConceptScore W4385848788C119857082 @default.
- W4385848788 hasConceptScore W4385848788C137293760 @default.
- W4385848788 hasConceptScore W4385848788C140331021 @default.
- W4385848788 hasConceptScore W4385848788C154945302 @default.
- W4385848788 hasConceptScore W4385848788C167966045 @default.
- W4385848788 hasConceptScore W4385848788C199360897 @default.
- W4385848788 hasConceptScore W4385848788C22019652 @default.
- W4385848788 hasConceptScore W4385848788C2776214188 @default.
- W4385848788 hasConceptScore W4385848788C28855332 @default.
- W4385848788 hasConceptScore W4385848788C38652104 @default.
- W4385848788 hasConceptScore W4385848788C39890363 @default.
- W4385848788 hasConceptScore W4385848788C41008148 @default.
- W4385848788 hasConceptScore W4385848788C48145219 @default.
- W4385848788 hasConceptScore W4385848788C50644808 @default.
- W4385848788 hasConceptScore W4385848788C64452783 @default.
- W4385848788 hasLocation W43858487881 @default.
- W4385848788 hasOpenAccess W4385848788 @default.
- W4385848788 hasPrimaryLocation W43858487881 @default.
- W4385848788 hasRelatedWork W2757136988 @default.
- W4385848788 hasRelatedWork W2804228899 @default.
- W4385848788 hasRelatedWork W2883697671 @default.
- W4385848788 hasRelatedWork W2952492946 @default.
- W4385848788 hasRelatedWork W3019417368 @default.
- W4385848788 hasRelatedWork W3115385615 @default.
- W4385848788 hasRelatedWork W3129157811 @default.
- W4385848788 hasRelatedWork W3165012362 @default.
- W4385848788 hasRelatedWork W4319653417 @default.
- W4385848788 hasRelatedWork W4368304427 @default.
- W4385848788 isParatext "false" @default.
- W4385848788 isRetracted "false" @default.
- W4385848788 workType "article" @default.