Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313584989> ?p ?o ?g. }
- W4313584989 endingPage "240" @default.
- W4313584989 startingPage "240" @default.
- W4313584989 abstract "To accelerate the inference of machine-learning (ML) model serving, clusters of machines require the use of expensive hardware accelerators (e.g., GPUs) to reduce execution time. Advanced inference serving systems are needed to satisfy latency service-level objectives (SLOs) in a cost-effective manner. Novel autoscaling mechanisms that greedily minimize the number of service instances while ensuring SLO compliance are helpful. However, we find that it is not adequate to guarantee cost effectiveness across heterogeneous GPU hardware, and this does not maximize resource utilization. In this paper, we propose HetSev to address these challenges by incorporating heterogeneity-aware autoscaling and resource-efficient scheduling to achieve cost effectiveness. We develop an autoscaling mechanism which accounts for SLO compliance and GPU heterogeneity, thus provisioning the appropriate type and number of instances to guarantee cost effectiveness. We leverage multi-tenant inference to improve GPU resource utilization, while alleviating inter-tenant interference by avoiding the co-location of identical ML instances on the same GPU during placement decisions. HetSev is integrated into Kubernetes and deployed onto a heterogeneous GPU cluster. We evaluated the performance of HetSev using several representative ML models. Compared with default Kubernetes, HetSev reduces resource cost by up to 2.15× while meeting SLO requirements." @default.
- W4313584989 created "2023-01-06" @default.
- W4313584989 creator A5017195907 @default.
- W4313584989 creator A5019301429 @default.
- W4313584989 creator A5042228098 @default.
- W4313584989 creator A5052255801 @default.
- W4313584989 creator A5072803445 @default.
- W4313584989 date "2023-01-03" @default.
- W4313584989 modified "2023-10-14" @default.
- W4313584989 title "HetSev: Exploiting Heterogeneity-Aware Autoscaling and Resource-Efficient Scheduling for Cost-Effective Machine-Learning Model Serving" @default.
- W4313584989 cites W1978502921 @default.
- W4313584989 cites W2006197381 @default.
- W4313584989 cites W2019659243 @default.
- W4313584989 cites W2040222007 @default.
- W4313584989 cites W2075233755 @default.
- W4313584989 cites W2604514113 @default.
- W4313584989 cites W2772948367 @default.
- W4313584989 cites W2903278032 @default.
- W4313584989 cites W2906810629 @default.
- W4313584989 cites W2962826786 @default.
- W4313584989 cites W2982157693 @default.
- W4313584989 cites W3016939927 @default.
- W4313584989 cites W3037377931 @default.
- W4313584989 cites W3039010666 @default.
- W4313584989 cites W3043433718 @default.
- W4313584989 cites W3043571714 @default.
- W4313584989 cites W3048441649 @default.
- W4313584989 cites W3092052838 @default.
- W4313584989 cites W3097411828 @default.
- W4313584989 cites W3101026687 @default.
- W4313584989 cites W3106250896 @default.
- W4313584989 cites W3165292244 @default.
- W4313584989 cites W3217445637 @default.
- W4313584989 cites W4206425530 @default.
- W4313584989 cites W4236491543 @default.
- W4313584989 cites W4253824360 @default.
- W4313584989 cites W4286307983 @default.
- W4313584989 cites W4292975105 @default.
- W4313584989 cites W4293731856 @default.
- W4313584989 cites W4313229743 @default.
- W4313584989 cites W2883315052 @default.
- W4313584989 doi "https://doi.org/10.3390/electronics12010240" @default.
- W4313584989 hasPublicationYear "2023" @default.
- W4313584989 type Work @default.
- W4313584989 citedByCount "0" @default.
- W4313584989 crossrefType "journal-article" @default.
- W4313584989 hasAuthorship W4313584989A5017195907 @default.
- W4313584989 hasAuthorship W4313584989A5019301429 @default.
- W4313584989 hasAuthorship W4313584989A5042228098 @default.
- W4313584989 hasAuthorship W4313584989A5052255801 @default.
- W4313584989 hasAuthorship W4313584989A5072803445 @default.
- W4313584989 hasBestOaLocation W43135849891 @default.
- W4313584989 hasConcept C111919701 @default.
- W4313584989 hasConcept C119857082 @default.
- W4313584989 hasConcept C120314980 @default.
- W4313584989 hasConcept C153083717 @default.
- W4313584989 hasConcept C154945302 @default.
- W4313584989 hasConcept C162324750 @default.
- W4313584989 hasConcept C172191483 @default.
- W4313584989 hasConcept C206729178 @default.
- W4313584989 hasConcept C21547014 @default.
- W4313584989 hasConcept C2776214188 @default.
- W4313584989 hasConcept C31258907 @default.
- W4313584989 hasConcept C41008148 @default.
- W4313584989 hasConcept C76155785 @default.
- W4313584989 hasConcept C79974875 @default.
- W4313584989 hasConcept C82876162 @default.
- W4313584989 hasConceptScore W4313584989C111919701 @default.
- W4313584989 hasConceptScore W4313584989C119857082 @default.
- W4313584989 hasConceptScore W4313584989C120314980 @default.
- W4313584989 hasConceptScore W4313584989C153083717 @default.
- W4313584989 hasConceptScore W4313584989C154945302 @default.
- W4313584989 hasConceptScore W4313584989C162324750 @default.
- W4313584989 hasConceptScore W4313584989C172191483 @default.
- W4313584989 hasConceptScore W4313584989C206729178 @default.
- W4313584989 hasConceptScore W4313584989C21547014 @default.
- W4313584989 hasConceptScore W4313584989C2776214188 @default.
- W4313584989 hasConceptScore W4313584989C31258907 @default.
- W4313584989 hasConceptScore W4313584989C41008148 @default.
- W4313584989 hasConceptScore W4313584989C76155785 @default.
- W4313584989 hasConceptScore W4313584989C79974875 @default.
- W4313584989 hasConceptScore W4313584989C82876162 @default.
- W4313584989 hasIssue "1" @default.
- W4313584989 hasLocation W43135849891 @default.
- W4313584989 hasOpenAccess W4313584989 @default.
- W4313584989 hasPrimaryLocation W43135849891 @default.
- W4313584989 hasRelatedWork W1982410425 @default.
- W4313584989 hasRelatedWork W2084667297 @default.
- W4313584989 hasRelatedWork W2157044008 @default.
- W4313584989 hasRelatedWork W2478702624 @default.
- W4313584989 hasRelatedWork W2550639320 @default.
- W4313584989 hasRelatedWork W2560090078 @default.
- W4313584989 hasRelatedWork W2905824599 @default.
- W4313584989 hasRelatedWork W2938610524 @default.
- W4313584989 hasRelatedWork W2949516016 @default.
- W4313584989 hasRelatedWork W2182913205 @default.
- W4313584989 hasVolume "12" @default.
- W4313584989 isParatext "false" @default.