Matches in SemOpenAlex for { <https://semopenalex.org/work/W3111036603> ?p ?o ?g. }
- W3111036603 abstract "Despite existing work in machine learning inference serving, ease-of-use and cost efficiency remain challenges at large scales. Developers must manually search through thousands of model-variants -- versions of already-trained models that differ in hardware, resource footprints, latencies, costs, and accuracies -- to meet the diverse application requirements. Since requirements, query load, and applications themselves evolve over time, these decisions need to be made dynamically for each inference query to avoid excessive costs through naive autoscaling. To avoid navigating through the large and complex trade-off space of model-variants, developers often fix a variant across queries, and replicate it when load increases. However, given the diversity across variants and hardware platforms in the cloud, a lack of understanding of the trade-off space can incur significant costs to developers. This paper introduces INFaaS, a managed and model-less system for distributed inference serving, where developers simply specify the performance and accuracy requirements for their applications without needing to specify a specific model-variant for each query. INFaaS generates model-variants, and efficiently navigates the large trade-off space of model-variants on behalf of developers to meet application-specific objectives: (a) for each query, it selects a model, hardware architecture, and model optimizations, (b) it combines VM-level horizontal autoscaling with model-level autoscaling, where multiple, different model-variants are used to serve queries within each machine. By leveraging diverse variants and sharing hardware resources across models, INFaaS achieves 1.3x higher throughput, violates latency objectives 1.6x less often, and saves up to 21.6x in cost (8.5x on average) compared to state-of-the-art inference serving systems on AWS EC2." @default.
- W3111036603 created "2020-12-21" @default.
- W3111036603 creator A5014942240 @default.
- W3111036603 creator A5031510918 @default.
- W3111036603 creator A5042148531 @default.
- W3111036603 creator A5091028590 @default.
- W3111036603 date "2019-05-30" @default.
- W3111036603 modified "2023-09-27" @default.
- W3111036603 title "INFaaS: A Model-less and Managed Inference Serving System" @default.
- W3111036603 cites W2025549137 @default.
- W3111036603 cites W2036790532 @default.
- W3111036603 cites W2037487174 @default.
- W3111036603 cites W2071228820 @default.
- W3111036603 cites W2075233755 @default.
- W3111036603 cites W2078659331 @default.
- W3111036603 cites W2100142570 @default.
- W3111036603 cites W2108598243 @default.
- W3111036603 cites W2125901106 @default.
- W3111036603 cites W2127748654 @default.
- W3111036603 cites W2156077332 @default.
- W3111036603 cites W2163961697 @default.
- W3111036603 cites W2194775991 @default.
- W3111036603 cites W2293393493 @default.
- W3111036603 cites W2309679942 @default.
- W3111036603 cites W2547386789 @default.
- W3111036603 cites W2591324491 @default.
- W3111036603 cites W2595653137 @default.
- W3111036603 cites W2599379624 @default.
- W3111036603 cites W2604856537 @default.
- W3111036603 cites W2606722458 @default.
- W3111036603 cites W2607255160 @default.
- W3111036603 cites W2613597870 @default.
- W3111036603 cites W2626211758 @default.
- W3111036603 cites W2626985503 @default.
- W3111036603 cites W2752236330 @default.
- W3111036603 cites W2765200655 @default.
- W3111036603 cites W2772948367 @default.
- W3111036603 cites W2794670651 @default.
- W3111036603 cites W2804032941 @default.
- W3111036603 cites W2807462012 @default.
- W3111036603 cites W2883929540 @default.
- W3111036603 cites W2887117815 @default.
- W3111036603 cites W2888975141 @default.
- W3111036603 cites W2893813411 @default.
- W3111036603 cites W2895511477 @default.
- W3111036603 cites W2895934479 @default.
- W3111036603 cites W2899044382 @default.
- W3111036603 cites W2899071864 @default.
- W3111036603 cites W2899160556 @default.
- W3111036603 cites W2901305343 @default.
- W3111036603 cites W2906773779 @default.
- W3111036603 cites W2912234349 @default.
- W3111036603 cites W2919594608 @default.
- W3111036603 cites W2928897890 @default.
- W3111036603 cites W2956461999 @default.
- W3111036603 cites W2963065629 @default.
- W3111036603 cites W2963510045 @default.
- W3111036603 cites W2963733194 @default.
- W3111036603 cites W2964108773 @default.
- W3111036603 cites W2982157693 @default.
- W3111036603 cites W2999012726 @default.
- W3111036603 cites W3010637564 @default.
- W3111036603 cites W3011348040 @default.
- W3111036603 cites W3011434423 @default.
- W3111036603 cites W3016842236 @default.
- W3111036603 cites W3043571714 @default.
- W3111036603 cites W3043807073 @default.
- W3111036603 cites W272982077 @default.
- W3111036603 hasPublicationYear "2019" @default.
- W3111036603 type Work @default.
- W3111036603 sameAs 3111036603 @default.
- W3111036603 citedByCount "3" @default.
- W3111036603 countsByYear W31110366032020 @default.
- W3111036603 countsByYear W31110366032021 @default.
- W3111036603 crossrefType "posted-content" @default.
- W3111036603 hasAuthorship W3111036603A5014942240 @default.
- W3111036603 hasAuthorship W3111036603A5031510918 @default.
- W3111036603 hasAuthorship W3111036603A5042148531 @default.
- W3111036603 hasAuthorship W3111036603A5091028590 @default.
- W3111036603 hasConcept C105795698 @default.
- W3111036603 hasConcept C111919701 @default.
- W3111036603 hasConcept C119857082 @default.
- W3111036603 hasConcept C120314980 @default.
- W3111036603 hasConcept C123657996 @default.
- W3111036603 hasConcept C124101348 @default.
- W3111036603 hasConcept C142362112 @default.
- W3111036603 hasConcept C153349607 @default.
- W3111036603 hasConcept C154945302 @default.
- W3111036603 hasConcept C157764524 @default.
- W3111036603 hasConcept C206345919 @default.
- W3111036603 hasConcept C2776214188 @default.
- W3111036603 hasConcept C2781162219 @default.
- W3111036603 hasConcept C31258907 @default.
- W3111036603 hasConcept C33923547 @default.
- W3111036603 hasConcept C41008148 @default.
- W3111036603 hasConcept C555944384 @default.
- W3111036603 hasConcept C76155785 @default.
- W3111036603 hasConcept C79974875 @default.
- W3111036603 hasConcept C82876162 @default.
- W3111036603 hasConceptScore W3111036603C105795698 @default.