Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386707613> ?p ?o ?g. }
Showing items 1 to 72 of 72 with 100 items per page.
- W4386707613 abstract "Inference workload redistribution is a technique for evacuating inference requests from hot edges to idle edges in edge collaborative systems, thereby balancing the inference workload across different edges. However, with the continuous development of edge accelerators, the resource utilization of edge accelerators when executing inference requests in series is often low, and executing multiple inference requests in parallel faces uncertain execution delays, different response-time Service Level Objectives (SLOs), and the generality of inference workloads in heterogeneous edge collaborative systems. To address these issues, for the first time in the domain of inference workload redistribution, we propose a Batch-aware Inference workload Redistribution and Parallel execution scheme, called BIRP, to reduce the additional latency caused by waiting for a single inference task during serial execution, thereby improving the overall inference accuracy. BIRP uses the Multi-Armed Bandit (MAB) algorithm to adjust hyperparameters of the Throughput Improvement Ratio (TIR) function online for improving the overall inference accuracy. For nonlinear terms in the problem, BIRP uses a piecewise linear approximation to convert it into a Quadratic Programming (QP) problem, ensuring the effectiveness of BIRP in theory. We prototype BIRP on an edge collaborative system composed of three heterogeneous edges. Based on a real inference workload trace, we validate the superiority of our algorithm compared to the state-of-the-art model selection-based inference workload redistribution algorithm, with an overall inference loss reduction of at least 32.9% and an SLO failure rate reduced to 19.8% of that of the alternatives." @default.
- W4386707613 created "2023-09-14" @default.
- W4386707613 creator A5015547394 @default.
- W4386707613 creator A5038027117 @default.
- W4386707613 creator A5053632258 @default.
- W4386707613 creator A5075118448 @default.
- W4386707613 creator A5076659753 @default.
- W4386707613 creator A5080858807 @default.
- W4386707613 creator A5083803457 @default.
- W4386707613 creator A5088104709 @default.
- W4386707613 date "2023-08-07" @default.
- W4386707613 modified "2023-10-15" @default.
- W4386707613 title "BIRP: Batch-aware Inference Workload Redistribution and Parallel Scheme for Edge Collaboration" @default.
- W4386707613 cites W2889292885 @default.
- W4386707613 cites W2913740539 @default.
- W4386707613 cites W2950865323 @default.
- W4386707613 cites W2960833983 @default.
- W4386707613 cites W2970866953 @default.
- W4386707613 cites W2971544482 @default.
- W4386707613 cites W3047468380 @default.
- W4386707613 cites W3135486588 @default.
- W4386707613 cites W3165698711 @default.
- W4386707613 cites W3196226093 @default.
- W4386707613 cites W3201218664 @default.
- W4386707613 cites W3210635777 @default.
- W4386707613 cites W3210776666 @default.
- W4386707613 cites W4283732699 @default.
- W4386707613 doi "https://doi.org/10.1145/3605573.3605615" @default.
- W4386707613 hasPublicationYear "2023" @default.
- W4386707613 type Work @default.
- W4386707613 citedByCount "0" @default.
- W4386707613 crossrefType "proceedings-article" @default.
- W4386707613 hasAuthorship W4386707613A5015547394 @default.
- W4386707613 hasAuthorship W4386707613A5038027117 @default.
- W4386707613 hasAuthorship W4386707613A5053632258 @default.
- W4386707613 hasAuthorship W4386707613A5075118448 @default.
- W4386707613 hasAuthorship W4386707613A5076659753 @default.
- W4386707613 hasAuthorship W4386707613A5080858807 @default.
- W4386707613 hasAuthorship W4386707613A5083803457 @default.
- W4386707613 hasAuthorship W4386707613A5088104709 @default.
- W4386707613 hasConcept C111919701 @default.
- W4386707613 hasConcept C120314980 @default.
- W4386707613 hasConcept C154945302 @default.
- W4386707613 hasConcept C173608175 @default.
- W4386707613 hasConcept C198370458 @default.
- W4386707613 hasConcept C2776214188 @default.
- W4386707613 hasConcept C2778476105 @default.
- W4386707613 hasConcept C41008148 @default.
- W4386707613 hasConceptScore W4386707613C111919701 @default.
- W4386707613 hasConceptScore W4386707613C120314980 @default.
- W4386707613 hasConceptScore W4386707613C154945302 @default.
- W4386707613 hasConceptScore W4386707613C173608175 @default.
- W4386707613 hasConceptScore W4386707613C198370458 @default.
- W4386707613 hasConceptScore W4386707613C2776214188 @default.
- W4386707613 hasConceptScore W4386707613C2778476105 @default.
- W4386707613 hasConceptScore W4386707613C41008148 @default.
- W4386707613 hasLocation W43867076131 @default.
- W4386707613 hasOpenAccess W4386707613 @default.
- W4386707613 hasPrimaryLocation W43867076131 @default.
- W4386707613 hasRelatedWork W1941412300 @default.
- W4386707613 hasRelatedWork W2068383718 @default.
- W4386707613 hasRelatedWork W2352878646 @default.
- W4386707613 hasRelatedWork W2384410913 @default.
- W4386707613 hasRelatedWork W2804371217 @default.
- W4386707613 hasRelatedWork W2963764498 @default.
- W4386707613 hasRelatedWork W2990194547 @default.
- W4386707613 hasRelatedWork W4246881098 @default.
- W4386707613 hasRelatedWork W4297831890 @default.
- W4386707613 hasRelatedWork W986318368 @default.
- W4386707613 isParatext "false" @default.
- W4386707613 isRetracted "false" @default.
- W4386707613 workType "article" @default.