Matches in SemOpenAlex for { <https://semopenalex.org/work/W4384705416> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W4384705416 abstract "Function-as-a-Service (FaaS) is emerging as an important cloud computing service model as it can improve the scalability and usability of a wide range of applications, especially Machine-Learning (ML) inference tasks that require scalable resources and complex software configurations. These inference tasks heavily rely on GPUs to achieve high performance; however, support for GPUs is currently lacking in the existing FaaS solutions. The unique event-triggered and short-lived nature of functions poses new challenges to enabling GPUs on FaaS, which must consider the overhead of transferring data (e.g., ML model parameters and inputs/outputs) between GPU and host memory. This paper proposes a novel GPU-enabled FaaS solution that enables ML inference functions to efficiently utilize GPUs to accelerate their computations. First, it extends existing FaaS frameworks such as OpenFaaS to support the scheduling and execution of functions across GPUs in a FaaS cluster. Second, it provides caching of ML models in GPU memory to improve the performance of model inference functions and global management of GPU memories to improve cache utilization. Third, it offers co-designed GPU function scheduling and cache management to optimize the performance of ML inference functions. Specifically, the paper proposes locality-aware scheduling, which maximizes the utilization of both GPU memory for cache hits and GPU cores for parallel processing. A thorough evaluation based on real-world traces and ML models shows that the proposed GPU-enabled FaaS works well for ML inference tasks, and the proposed locality-aware scheduler achieves a speedup of 48x compared to the default, load balancing only schedulers." @default.
- W4384705416 created "2023-07-20" @default.
- W4384705416 creator A5010952470 @default.
- W4384705416 creator A5053975355 @default.
- W4384705416 creator A5061858636 @default.
- W4384705416 date "2023-05-01" @default.
- W4384705416 modified "2023-10-16" @default.
- W4384705416 title "GPU-enabled Function-as-a-Service for Machine Learning Inference" @default.
- W4384705416 cites W2005574683 @default.
- W4384705416 cites W2805722953 @default.
- W4384705416 cites W2970828176 @default.
- W4384705416 cites W2979826702 @default.
- W4384705416 cites W3003011076 @default.
- W4384705416 cites W3037377931 @default.
- W4384705416 cites W3173555539 @default.
- W4384705416 doi "https://doi.org/10.1109/ipdps54959.2023.00096" @default.
- W4384705416 hasPublicationYear "2023" @default.
- W4384705416 type Work @default.
- W4384705416 citedByCount "0" @default.
- W4384705416 crossrefType "proceedings-article" @default.
- W4384705416 hasAuthorship W4384705416A5010952470 @default.
- W4384705416 hasAuthorship W4384705416A5053975355 @default.
- W4384705416 hasAuthorship W4384705416A5061858636 @default.
- W4384705416 hasConcept C111919701 @default.
- W4384705416 hasConcept C115537543 @default.
- W4384705416 hasConcept C120314980 @default.
- W4384705416 hasConcept C138885662 @default.
- W4384705416 hasConcept C154945302 @default.
- W4384705416 hasConcept C162324750 @default.
- W4384705416 hasConcept C173608175 @default.
- W4384705416 hasConcept C206729178 @default.
- W4384705416 hasConcept C21547014 @default.
- W4384705416 hasConcept C2776214188 @default.
- W4384705416 hasConcept C2778119891 @default.
- W4384705416 hasConcept C2779808786 @default.
- W4384705416 hasConcept C2781335571 @default.
- W4384705416 hasConcept C41008148 @default.
- W4384705416 hasConcept C41895202 @default.
- W4384705416 hasConcept C48044578 @default.
- W4384705416 hasConcept C68339613 @default.
- W4384705416 hasConceptScore W4384705416C111919701 @default.
- W4384705416 hasConceptScore W4384705416C115537543 @default.
- W4384705416 hasConceptScore W4384705416C120314980 @default.
- W4384705416 hasConceptScore W4384705416C138885662 @default.
- W4384705416 hasConceptScore W4384705416C154945302 @default.
- W4384705416 hasConceptScore W4384705416C162324750 @default.
- W4384705416 hasConceptScore W4384705416C173608175 @default.
- W4384705416 hasConceptScore W4384705416C206729178 @default.
- W4384705416 hasConceptScore W4384705416C21547014 @default.
- W4384705416 hasConceptScore W4384705416C2776214188 @default.
- W4384705416 hasConceptScore W4384705416C2778119891 @default.
- W4384705416 hasConceptScore W4384705416C2779808786 @default.
- W4384705416 hasConceptScore W4384705416C2781335571 @default.
- W4384705416 hasConceptScore W4384705416C41008148 @default.
- W4384705416 hasConceptScore W4384705416C41895202 @default.
- W4384705416 hasConceptScore W4384705416C48044578 @default.
- W4384705416 hasConceptScore W4384705416C68339613 @default.
- W4384705416 hasFunder F4320306076 @default.
- W4384705416 hasLocation W43847054161 @default.
- W4384705416 hasOpenAccess W4384705416 @default.
- W4384705416 hasPrimaryLocation W43847054161 @default.
- W4384705416 hasRelatedWork W108745714 @default.
- W4384705416 hasRelatedWork W1498943610 @default.
- W4384705416 hasRelatedWork W2116951845 @default.
- W4384705416 hasRelatedWork W2139061608 @default.
- W4384705416 hasRelatedWork W2268138364 @default.
- W4384705416 hasRelatedWork W2582456645 @default.
- W4384705416 hasRelatedWork W2883383827 @default.
- W4384705416 hasRelatedWork W2981024064 @default.
- W4384705416 hasRelatedWork W3207203625 @default.
- W4384705416 hasRelatedWork W4235985402 @default.
- W4384705416 isParatext "false" @default.
- W4384705416 isRetracted "false" @default.
- W4384705416 workType "article" @default.
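
The abstract above contrasts locality-aware scheduling with load-balancing-only scheduling. The record does not include the paper's algorithm, so the following is only a toy sketch of the general idea, not the authors' method. Every name in it (`Gpu`, `pick_locality_aware`, `max_extra_load`, the LRU eviction rule, and the request trace) is a hypothetical chosen for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass, field


@dataclass
class Gpu:
    """Hypothetical GPU state: an LRU cache of models plus a queue length."""
    name: str
    capacity: int                      # max number of models held in GPU memory
    load: int = 0                      # functions dispatched to this GPU so far
    cache: OrderedDict = field(default_factory=OrderedDict)

    def touch(self, model: str) -> bool:
        """Record a use of `model`; return True on a cache hit."""
        hit = model in self.cache
        if hit:
            self.cache.move_to_end(model)
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)   # evict least recently used
            self.cache[model] = True
        return hit


def pick_load_balanced(gpus, model):
    """Baseline: ignore the cache and choose the least-loaded GPU."""
    return min(gpus, key=lambda g: g.load)


def pick_locality_aware(gpus, model, max_extra_load=2):
    """Prefer a GPU that already caches `model`, unless it is too busy.

    `max_extra_load` is an assumed tolerance: how much more loaded than the
    idlest GPU a cache-holding GPU may be before we give up on locality.
    """
    least = min(gpus, key=lambda g: g.load)
    cached = [g for g in gpus if model in g.cache]
    if cached:
        best = min(cached, key=lambda g: g.load)
        if best.load - least.load <= max_extra_load:
            return best
    return least


def run(policy, requests, n_gpus=2, capacity=1):
    """Dispatch a burst of requests in order and count model-cache hits."""
    gpus = [Gpu(f"gpu{i}", capacity) for i in range(n_gpus)]
    hits = 0
    for model in requests:
        gpu = policy(gpus, model)
        hits += gpu.touch(model)
        gpu.load += 1                  # no completions modeled within the burst
    return hits


trace = ["A", "A", "B", "B", "A", "A", "B", "B"]
baseline_hits = run(pick_load_balanced, trace)
locality_hits = run(pick_locality_aware, trace)
```

On this made-up trace the baseline spreads each model across both GPUs and keeps evicting it (0 hits out of 8), while the locality-aware policy pins each model to one GPU (6 hits out of 8). The numbers only illustrate the trade-off between cache hits and load balance and say nothing about the 48x speedup reported in the abstract.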