Matches in SemOpenAlex for { <https://semopenalex.org/work/W3110271845> ?p ?o ?g. }
- W3110271845 abstract "Transformer-based language models such as BERT provide significant accuracy improvements to a multitude of natural language processing (NLP) tasks. However, their hefty computational and memory demands make them challenging to deploy to resource-constrained edge platforms with strict latency requirements. We present EdgeBERT, an in-depth and principled algorithm and hardware design methodology to achieve minimal latency and energy consumption on multi-task NLP inference. Compared to the ALBERT baseline, we achieve up to 2.4x and 13.4x inference latency and memory savings, respectively, with less than 1%-pt drop in accuracy on several GLUE benchmarks by employing a calibrated combination of 1) entropy-based early stopping, 2) adaptive attention span, 3) movement and magnitude pruning, and 4) floating-point quantization. Furthermore, in order to maximize the benefits of these algorithms in always-on and intermediate edge computing settings, we specialize a scalable hardware architecture wherein floating-point bit encodings of the shareable multi-task embedding parameters are stored in high-density non-volatile memory. Altogether, EdgeBERT enables fully on-chip inference acceleration of NLP workloads with 5.2x and 157x lower energy than that of an un-optimized accelerator and CUDA adaptations on an Nvidia Jetson Tegra X2 mobile GPU, respectively." @default.
- W3110271845 created "2020-12-07" @default.
- W3110271845 creator A5005762501 @default.
- W3110271845 creator A5007438902 @default.
- W3110271845 creator A5026496503 @default.
- W3110271845 creator A5043327132 @default.
- W3110271845 creator A5049805631 @default.
- W3110271845 creator A5085355324 @default.
- W3110271845 creator A5086251333 @default.
- W3110271845 creator A5088648435 @default.
- W3110271845 creator A5088786790 @default.
- W3110271845 date "2020-11-28" @default.
- W3110271845 modified "2023-09-23" @default.
- W3110271845 title "EdgeBERT: Optimizing On-Chip Inference for Multi-Task NLP." @default.
- W3110271845 cites W1966939297 @default.
- W3110271845 cites W1981321271 @default.
- W3110271845 cites W2000967104 @default.
- W3110271845 cites W2010202670 @default.
- W3110271845 cites W2020740707 @default.
- W3110271845 cites W2063799733 @default.
- W3110271845 cites W2152839228 @default.
- W3110271845 cites W2285660444 @default.
- W3110271845 cites W2513554817 @default.
- W3110271845 cites W2515287984 @default.
- W3110271845 cites W2516141709 @default.
- W3110271845 cites W2562773490 @default.
- W3110271845 cites W2606722458 @default.
- W3110271845 cites W2625457103 @default.
- W3110271845 cites W2626991402 @default.
- W3110271845 cites W2809188712 @default.
- W3110271845 cites W2883030312 @default.
- W3110271845 cites W2883283076 @default.
- W3110271845 cites W2883542588 @default.
- W3110271845 cites W2883899390 @default.
- W3110271845 cites W2883920103 @default.
- W3110271845 cites W2945969196 @default.
- W3110271845 cites W2946567085 @default.
- W3110271845 cites W2949117887 @default.
- W3110271845 cites W2962677625 @default.
- W3110271845 cites W2963310665 @default.
- W3110271845 cites W2963341956 @default.
- W3110271845 cites W2963403868 @default.
- W3110271845 cites W2963594949 @default.
- W3110271845 cites W2964299589 @default.
- W3110271845 cites W2964308564 @default.
- W3110271845 cites W2965373594 @default.
- W3110271845 cites W2970565456 @default.
- W3110271845 cites W2977250065 @default.
- W3110271845 cites W2978017171 @default.
- W3110271845 cites W2980104813 @default.
- W3110271845 cites W2980282514 @default.
- W3110271845 cites W2980965328 @default.
- W3110271845 cites W2996428491 @default.
- W3110271845 cites W2997929983 @default.
- W3110271845 cites W2998183051 @default.
- W3110271845 cites W3006586535 @default.
- W3110271845 cites W3017024317 @default.
- W3110271845 cites W3034457371 @default.
- W3110271845 cites W3035038672 @default.
- W3110271845 cites W3035251378 @default.
- W3110271845 cites W3037132819 @default.
- W3110271845 cites W3042501387 @default.
- W3110271845 cites W3087738387 @default.
- W3110271845 cites W3177265267 @default.
- W3110271845 hasPublicationYear "2020" @default.
- W3110271845 type Work @default.
- W3110271845 sameAs 3110271845 @default.
- W3110271845 citedByCount "1" @default.
- W3110271845 countsByYear W31102718452021 @default.
- W3110271845 crossrefType "posted-content" @default.
- W3110271845 hasAuthorship W3110271845A5005762501 @default.
- W3110271845 hasAuthorship W3110271845A5007438902 @default.
- W3110271845 hasAuthorship W3110271845A5026496503 @default.
- W3110271845 hasAuthorship W3110271845A5043327132 @default.
- W3110271845 hasAuthorship W3110271845A5049805631 @default.
- W3110271845 hasAuthorship W3110271845A5085355324 @default.
- W3110271845 hasAuthorship W3110271845A5086251333 @default.
- W3110271845 hasAuthorship W3110271845A5088648435 @default.
- W3110271845 hasAuthorship W3110271845A5088786790 @default.
- W3110271845 hasConcept C111919701 @default.
- W3110271845 hasConcept C113775141 @default.
- W3110271845 hasConcept C121332964 @default.
- W3110271845 hasConcept C137293760 @default.
- W3110271845 hasConcept C154945302 @default.
- W3110271845 hasConcept C165801399 @default.
- W3110271845 hasConcept C173608175 @default.
- W3110271845 hasConcept C2776214188 @default.
- W3110271845 hasConcept C41008148 @default.
- W3110271845 hasConcept C48044578 @default.
- W3110271845 hasConcept C62520636 @default.
- W3110271845 hasConcept C66322947 @default.
- W3110271845 hasConceptScore W3110271845C111919701 @default.
- W3110271845 hasConceptScore W3110271845C113775141 @default.
- W3110271845 hasConceptScore W3110271845C121332964 @default.
- W3110271845 hasConceptScore W3110271845C137293760 @default.
- W3110271845 hasConceptScore W3110271845C154945302 @default.
- W3110271845 hasConceptScore W3110271845C165801399 @default.
- W3110271845 hasConceptScore W3110271845C173608175 @default.
- W3110271845 hasConceptScore W3110271845C2776214188 @default.
- W3110271845 hasConceptScore W3110271845C41008148 @default.