Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287632844> ?p ?o ?g. }
Showing items 1 to 61 of 61 with 100 items per page.
- W4287632844 abstract "In cloud ML inference systems, batching is an essential technique to increase throughput, which helps optimize total cost of ownership. Prior graph batching combines the individual DNN graphs into a single one, allowing multiple inputs to be executed concurrently. We observe that coarse-grained graph batching becomes suboptimal in handling dynamic inference request traffic, leaving significant performance on the table. This paper proposes LazyBatching, an SLA-aware batching system that considers both scheduling and batching at the granularity of individual graph nodes, rather than the entire graph, for flexible batching. We show that LazyBatching can intelligently determine the set of nodes that can be efficiently batched together, achieving an average 15x, 1.5x, and 5.5x improvement over graph batching in terms of average response time, throughput, and SLA satisfaction, respectively." @default.
- W4287632844 created "2022-07-25" @default.
- W4287632844 creator A5057737838 @default.
- W4287632844 creator A5070087229 @default.
- W4287632844 creator A5091648103 @default.
- W4287632844 date "2020-10-25" @default.
- W4287632844 modified "2023-09-28" @default.
- W4287632844 title "LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference" @default.
- W4287632844 doi "https://doi.org/10.48550/arxiv.2010.13103" @default.
- W4287632844 hasPublicationYear "2020" @default.
- W4287632844 type Work @default.
- W4287632844 citedByCount "0" @default.
- W4287632844 crossrefType "posted-content" @default.
- W4287632844 hasAuthorship W4287632844A5057737838 @default.
- W4287632844 hasAuthorship W4287632844A5070087229 @default.
- W4287632844 hasAuthorship W4287632844A5091648103 @default.
- W4287632844 hasBestOaLocation W42876328441 @default.
- W4287632844 hasConcept C111919701 @default.
- W4287632844 hasConcept C120314980 @default.
- W4287632844 hasConcept C126255220 @default.
- W4287632844 hasConcept C132525143 @default.
- W4287632844 hasConcept C154945302 @default.
- W4287632844 hasConcept C157764524 @default.
- W4287632844 hasConcept C177774035 @default.
- W4287632844 hasConcept C206729178 @default.
- W4287632844 hasConcept C2776214188 @default.
- W4287632844 hasConcept C33923547 @default.
- W4287632844 hasConcept C41008148 @default.
- W4287632844 hasConcept C555944384 @default.
- W4287632844 hasConcept C79974875 @default.
- W4287632844 hasConcept C80444323 @default.
- W4287632844 hasConceptScore W4287632844C111919701 @default.
- W4287632844 hasConceptScore W4287632844C120314980 @default.
- W4287632844 hasConceptScore W4287632844C126255220 @default.
- W4287632844 hasConceptScore W4287632844C132525143 @default.
- W4287632844 hasConceptScore W4287632844C154945302 @default.
- W4287632844 hasConceptScore W4287632844C157764524 @default.
- W4287632844 hasConceptScore W4287632844C177774035 @default.
- W4287632844 hasConceptScore W4287632844C206729178 @default.
- W4287632844 hasConceptScore W4287632844C2776214188 @default.
- W4287632844 hasConceptScore W4287632844C33923547 @default.
- W4287632844 hasConceptScore W4287632844C41008148 @default.
- W4287632844 hasConceptScore W4287632844C555944384 @default.
- W4287632844 hasConceptScore W4287632844C79974875 @default.
- W4287632844 hasConceptScore W4287632844C80444323 @default.
- W4287632844 hasLocation W42876328441 @default.
- W4287632844 hasOpenAccess W4287632844 @default.
- W4287632844 hasPrimaryLocation W42876328441 @default.
- W4287632844 hasRelatedWork W1593422682 @default.
- W4287632844 hasRelatedWork W1882733036 @default.
- W4287632844 hasRelatedWork W1995399085 @default.
- W4287632844 hasRelatedWork W2015013785 @default.
- W4287632844 hasRelatedWork W2039968861 @default.
- W4287632844 hasRelatedWork W2160425906 @default.
- W4287632844 hasRelatedWork W2380023786 @default.
- W4287632844 hasRelatedWork W2546696010 @default.
- W4287632844 hasRelatedWork W275032887 @default.
- W4287632844 hasRelatedWork W3176398502 @default.
- W4287632844 isParatext "false" @default.
- W4287632844 isRetracted "false" @default.
- W4287632844 workType "article" @default.