Matches in SemOpenAlex for { <https://semopenalex.org/work/W2947883979> ?p ?o ?g. }
- W2947883979 abstract "The number of applications relying on inference from machine learning models is already large and expected to keep growing. For instance, Facebook applications issue tens-of-trillions of inference queries per day with varying performance, accuracy, and cost constraints. Unfortunately, existing inference serving systems are neither easy to use nor cost effective. Developers must manually match the performance, accuracy, and cost constraints of their applications to a large design space that includes decisions such as selecting the right model and model optimizations, selecting the right hardware architecture, selecting the right scale-out factor, and avoiding cold-start effects. These interacting decisions are difficult to make, especially when the application load varies over time, applications evolve over time, and the available resources vary over time. We present INFaaS, an inference-as-a-service system that abstracts resource management and model selection. Users simply specify their inference task along with any performance and accuracy requirements for queries. Given the currently available resources, INFaaS automatically selects and serves inference queries using a specific model that satisfies these requirements. INFaaS autoscales resources as model load changes both within and across inference workers. It also shares workers across users and models to increase utilization. We evaluate INFaaS using 44 model architectures and their 270 model variants against serving systems that rely on users for model selection and pre-load models, fix the scale policy, or use dedicated hardware resources. Our evaluation on realistic workloads shows that INFaaS achieves 2$\times$ higher throughput and violates latency SLO goals 3$\times$ less frequently, while maintaining high utilization and having overheads that are less than 12% of millisecond-scale queries." @default.
- W2947883979 created "2019-06-07" @default.
- W2947883979 creator A5014942240 @default.
- W2947883979 creator A5031510918 @default.
- W2947883979 creator A5042148531 @default.
- W2947883979 creator A5091028590 @default.
- W2947883979 date "2019-05-30" @default.
- W2947883979 modified "2023-09-24" @default.
- W2947883979 title "INFaaS: Managed & Model-less Inference Serving" @default.
- W2947883979 cites W2025549137 @default.
- W2947883979 cites W2075233755 @default.
- W2947883979 cites W2100142570 @default.
- W2947883979 cites W2108598243 @default.
- W2947883979 cites W2125901106 @default.
- W2947883979 cites W2127748654 @default.
- W2947883979 cites W2156077332 @default.
- W2947883979 cites W2194775991 @default.
- W2947883979 cites W2293393493 @default.
- W2947883979 cites W2309679942 @default.
- W2947883979 cites W2591324491 @default.
- W2947883979 cites W2606722458 @default.
- W2947883979 cites W2607255160 @default.
- W2947883979 cites W2613597870 @default.
- W2947883979 cites W2626985503 @default.
- W2947883979 cites W2752236330 @default.
- W2947883979 cites W2772948367 @default.
- W2947883979 cites W2794670651 @default.
- W2947883979 cites W2804032941 @default.
- W2947883979 cites W2807462012 @default.
- W2947883979 cites W2887117815 @default.
- W2947883979 cites W2888975141 @default.
- W2947883979 cites W2895511477 @default.
- W2947883979 cites W2899071864 @default.
- W2947883979 cites W2906773779 @default.
- W2947883979 cites W2919594608 @default.
- W2947883979 cites W2928897890 @default.
- W2947883979 cites W2941938531 @default.
- W2947883979 cites W2963065629 @default.
- W2947883979 cites W2964108773 @default.
- W2947883979 hasPublicationYear "2019" @default.
- W2947883979 type Work @default.
- W2947883979 sameAs 2947883979 @default.
- W2947883979 citedByCount "4" @default.
- W2947883979 countsByYear W29478839792020 @default.
- W2947883979 countsByYear W29478839792021 @default.
- W2947883979 crossrefType "posted-content" @default.
- W2947883979 hasAuthorship W2947883979A5014942240 @default.
- W2947883979 hasAuthorship W2947883979A5031510918 @default.
- W2947883979 hasAuthorship W2947883979A5042148531 @default.
- W2947883979 hasAuthorship W2947883979A5091028590 @default.
- W2947883979 hasConcept C119857082 @default.
- W2947883979 hasConcept C120314980 @default.
- W2947883979 hasConcept C124101348 @default.
- W2947883979 hasConcept C154945302 @default.
- W2947883979 hasConcept C157764524 @default.
- W2947883979 hasConcept C162324750 @default.
- W2947883979 hasConcept C187736073 @default.
- W2947883979 hasConcept C206345919 @default.
- W2947883979 hasConcept C2776214188 @default.
- W2947883979 hasConcept C2780451532 @default.
- W2947883979 hasConcept C31258907 @default.
- W2947883979 hasConcept C41008148 @default.
- W2947883979 hasConcept C555944384 @default.
- W2947883979 hasConcept C76155785 @default.
- W2947883979 hasConcept C81917197 @default.
- W2947883979 hasConcept C82876162 @default.
- W2947883979 hasConcept C93959086 @default.
- W2947883979 hasConceptScore W2947883979C119857082 @default.
- W2947883979 hasConceptScore W2947883979C120314980 @default.
- W2947883979 hasConceptScore W2947883979C124101348 @default.
- W2947883979 hasConceptScore W2947883979C154945302 @default.
- W2947883979 hasConceptScore W2947883979C157764524 @default.
- W2947883979 hasConceptScore W2947883979C162324750 @default.
- W2947883979 hasConceptScore W2947883979C187736073 @default.
- W2947883979 hasConceptScore W2947883979C206345919 @default.
- W2947883979 hasConceptScore W2947883979C2776214188 @default.
- W2947883979 hasConceptScore W2947883979C2780451532 @default.
- W2947883979 hasConceptScore W2947883979C31258907 @default.
- W2947883979 hasConceptScore W2947883979C41008148 @default.
- W2947883979 hasConceptScore W2947883979C555944384 @default.
- W2947883979 hasConceptScore W2947883979C76155785 @default.
- W2947883979 hasConceptScore W2947883979C81917197 @default.
- W2947883979 hasConceptScore W2947883979C82876162 @default.
- W2947883979 hasConceptScore W2947883979C93959086 @default.
- W2947883979 hasLocation W29478839791 @default.
- W2947883979 hasOpenAccess W2947883979 @default.
- W2947883979 hasPrimaryLocation W29478839791 @default.
- W2947883979 hasRelatedWork W143037660 @default.
- W2947883979 hasRelatedWork W1524181014 @default.
- W2947883979 hasRelatedWork W196117061 @default.
- W2947883979 hasRelatedWork W2008552727 @default.
- W2947883979 hasRelatedWork W2056213438 @default.
- W2947883979 hasRelatedWork W2057546223 @default.
- W2947883979 hasRelatedWork W2089347259 @default.
- W2947883979 hasRelatedWork W2521550930 @default.
- W2947883979 hasRelatedWork W2581432713 @default.
- W2947883979 hasRelatedWork W2740480782 @default.
- W2947883979 hasRelatedWork W2910172404 @default.
- W2947883979 hasRelatedWork W2949177968 @default.
- W2947883979 hasRelatedWork W2952144099 @default.