Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386841260> ?p ?o ?g. }
Showing items 1 to 91 of 91 with 100 items per page.
- W4386841260 abstract "We present a novel inference scheme, self-speculative decoding, for accelerating Large Language Models (LLMs) without the need for an auxiliary model. This approach is characterized by a two-stage process: drafting and verification. The drafting stage generates draft tokens at a slightly lower quality but more quickly, which is achieved by selectively skipping certain intermediate layers during drafting. Subsequently, the verification stage employs the original LLM to validate those draft output tokens in one forward pass. This process ensures the final output remains identical to that produced by the unaltered LLM, thereby maintaining output quality. The proposed method requires no additional neural network training and no extra memory footprint, making it a plug-and-play and cost-effective solution for inference acceleration. Benchmarks with LLaMA-2 and its fine-tuned models demonstrated a speedup up to 1.73$\times$." @default.
- W4386841260 created "2023-09-19" @default.
- W4386841260 creator A5001014826 @default.
- W4386841260 creator A5010114060 @default.
- W4386841260 creator A5027918340 @default.
- W4386841260 creator A5049458306 @default.
- W4386841260 creator A5053843140 @default.
- W4386841260 creator A5062991118 @default.
- W4386841260 creator A5076036593 @default.
- W4386841260 date "2023-09-15" @default.
- W4386841260 modified "2023-09-27" @default.
- W4386841260 title "Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding" @default.
- W4386841260 doi "https://doi.org/10.48550/arxiv.2309.08168" @default.
- W4386841260 hasPublicationYear "2023" @default.
- W4386841260 type Work @default.
- W4386841260 citedByCount "0" @default.
- W4386841260 crossrefType "posted-content" @default.
- W4386841260 hasAuthorship W4386841260A5001014826 @default.
- W4386841260 hasAuthorship W4386841260A5010114060 @default.
- W4386841260 hasAuthorship W4386841260A5027918340 @default.
- W4386841260 hasAuthorship W4386841260A5049458306 @default.
- W4386841260 hasAuthorship W4386841260A5053843140 @default.
- W4386841260 hasAuthorship W4386841260A5062991118 @default.
- W4386841260 hasAuthorship W4386841260A5076036593 @default.
- W4386841260 hasBestOaLocation W43868412601 @default.
- W4386841260 hasConcept C111472728 @default.
- W4386841260 hasConcept C113775141 @default.
- W4386841260 hasConcept C11413529 @default.
- W4386841260 hasConcept C117896860 @default.
- W4386841260 hasConcept C121332964 @default.
- W4386841260 hasConcept C132943942 @default.
- W4386841260 hasConcept C134306372 @default.
- W4386841260 hasConcept C138885662 @default.
- W4386841260 hasConcept C151730666 @default.
- W4386841260 hasConcept C154945302 @default.
- W4386841260 hasConcept C173608175 @default.
- W4386841260 hasConcept C199360897 @default.
- W4386841260 hasConcept C2776214188 @default.
- W4386841260 hasConcept C2779530757 @default.
- W4386841260 hasConcept C33923547 @default.
- W4386841260 hasConcept C41008148 @default.
- W4386841260 hasConcept C57273362 @default.
- W4386841260 hasConcept C68339613 @default.
- W4386841260 hasConcept C74650414 @default.
- W4386841260 hasConcept C74912251 @default.
- W4386841260 hasConcept C77618280 @default.
- W4386841260 hasConcept C78548338 @default.
- W4386841260 hasConcept C81081738 @default.
- W4386841260 hasConcept C86803240 @default.
- W4386841260 hasConcept C98045186 @default.
- W4386841260 hasConceptScore W4386841260C111472728 @default.
- W4386841260 hasConceptScore W4386841260C113775141 @default.
- W4386841260 hasConceptScore W4386841260C11413529 @default.
- W4386841260 hasConceptScore W4386841260C117896860 @default.
- W4386841260 hasConceptScore W4386841260C121332964 @default.
- W4386841260 hasConceptScore W4386841260C132943942 @default.
- W4386841260 hasConceptScore W4386841260C134306372 @default.
- W4386841260 hasConceptScore W4386841260C138885662 @default.
- W4386841260 hasConceptScore W4386841260C151730666 @default.
- W4386841260 hasConceptScore W4386841260C154945302 @default.
- W4386841260 hasConceptScore W4386841260C173608175 @default.
- W4386841260 hasConceptScore W4386841260C199360897 @default.
- W4386841260 hasConceptScore W4386841260C2776214188 @default.
- W4386841260 hasConceptScore W4386841260C2779530757 @default.
- W4386841260 hasConceptScore W4386841260C33923547 @default.
- W4386841260 hasConceptScore W4386841260C41008148 @default.
- W4386841260 hasConceptScore W4386841260C57273362 @default.
- W4386841260 hasConceptScore W4386841260C68339613 @default.
- W4386841260 hasConceptScore W4386841260C74650414 @default.
- W4386841260 hasConceptScore W4386841260C74912251 @default.
- W4386841260 hasConceptScore W4386841260C77618280 @default.
- W4386841260 hasConceptScore W4386841260C78548338 @default.
- W4386841260 hasConceptScore W4386841260C81081738 @default.
- W4386841260 hasConceptScore W4386841260C86803240 @default.
- W4386841260 hasConceptScore W4386841260C98045186 @default.
- W4386841260 hasLocation W43868412601 @default.
- W4386841260 hasOpenAccess W4386841260 @default.
- W4386841260 hasPrimaryLocation W43868412601 @default.
- W4386841260 hasRelatedWork W2727962976 @default.
- W4386841260 hasRelatedWork W2752721426 @default.
- W4386841260 hasRelatedWork W2981519481 @default.
- W4386841260 hasRelatedWork W3168991979 @default.
- W4386841260 hasRelatedWork W4225752358 @default.
- W4386841260 hasRelatedWork W4286233754 @default.
- W4386841260 hasRelatedWork W4307933444 @default.
- W4386841260 hasRelatedWork W4319995060 @default.
- W4386841260 hasRelatedWork W4378806073 @default.
- W4386841260 hasRelatedWork W593992998 @default.
- W4386841260 isParatext "false" @default.
- W4386841260 isRetracted "false" @default.
- W4386841260 workType "article" @default.
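
The draft-and-verify loop described in the abstract of W4386841260 can be illustrated with a toy sketch. Everything here is an assumption made for illustration and not the authors' implementation: the "model" is a stack of integer mixing functions, `next_token`, `greedy_decode`, and `self_speculative_decode` are hypothetical names, and the set of skipped layers and the draft length `k` are arbitrary choices. The verification step is written as a loop for readability; a real LLM would obtain the same per-position predictions from one forward pass over the drafted tokens.

```python
from typing import Callable, FrozenSet, List, Sequence

VOCAB = 50   # toy vocabulary size (assumption)
MOD = 1009   # modulus for the toy hidden state (assumption)

Layer = Callable[[int], int]

def make_layers(n: int = 8) -> List[Layer]:
    # Toy "layers": deterministic integer mixers, purely illustrative.
    return [lambda h, k=k: (h * (2 * k + 3) + k) % MOD for k in range(n)]

def next_token(tokens: Sequence[int], layers: List[Layer],
               skip: FrozenSet[int] = frozenset()) -> int:
    # Greedy next token. Passing a non-empty `skip` gives the cheaper draft model.
    h = sum((i + 1) * t for i, t in enumerate(tokens)) % MOD
    for idx, layer in enumerate(layers):
        if idx not in skip:
            h = layer(h)
    return h % VOCAB

def greedy_decode(prompt: Sequence[int], layers: List[Layer], n_new: int) -> List[int]:
    # Reference: plain greedy decoding with the full model.
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_token(out, layers))
    return out

def self_speculative_decode(prompt: Sequence[int], layers: List[Layer],
                            skip: FrozenSet[int], n_new: int, k: int = 4) -> List[int]:
    out = list(prompt)
    target = len(prompt) + n_new
    while len(out) < target:
        # Drafting: generate k tokens with some layers skipped.
        draft: List[int] = []
        for _ in range(k):
            draft.append(next_token(out + draft, layers, skip))
        # Verification: compare each draft token with the full model's choice.
        accepted: List[int] = []
        for d in draft:
            t = next_token(out + accepted, layers)
            accepted.append(t)          # the full model's token is always kept
            if t != d:                  # first mismatch: drop the rest of the draft
                break
        else:
            # Every draft token matched, so the full model yields one extra token.
            accepted.append(next_token(out + accepted, layers))
        out.extend(accepted)
    return out[:target]

layers = make_layers()
prompt = [1, 2, 3]
fast = self_speculative_decode(prompt, layers, frozenset({2, 5}), n_new=20)
reference = greedy_decode(prompt, layers, n_new=20)
print(fast == reference)
```

Because only tokens chosen by the full model are ever appended, the result matches ordinary greedy decoding no matter which layers the draft skips; the choice of skipped layers affects only how many draft tokens are accepted per round, and therefore the speed.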