Matches in SemOpenAlex for { <https://semopenalex.org/work/W3116103263> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W3116103263 endingPage "1321" @default.
- W3116103263 startingPage "1307" @default.
- W3116103263 abstract "We aim to tackle existing problems about deep learning serving on GPUs in the view of the system. GPUs have been widely adopted to serve online deep learning-based services that have stringent QoS (Quality-of-Service) requirements. However, emerging deep learning serving systems often result in poor responsiveness and low throughput of the inferences that damage user experience and increase the number of GPUs required to host an online service. Our investigation shows that the poor batching operation and the lack of data transfer-computation overlap are the root causes of the poor responsiveness and low throughput. To this end, we propose E<sup>2</sup>bird, a deep learning serving system that is comprised of a GPU-resident memory pool, a multi-granularity inference engine, and an elastic batch scheduler. The memory pool eliminates the unnecessary waiting of the batching operation and enables data transfer-computation overlap. The inference engine enables concurrent execution of different batches, improving the GPU resource utilization. The batch scheduler organizes inferences elastically to guarantee the QoS. Our experimental results on an Nvidia Titan RTX GPU show that E<sup>2</sup>bird reduces the response latency of inferences by up to 82.4 percent and improves the throughput by up to 62.8 percent while guaranteeing the QoS target compared with TensorFlow Serving." @default.
- W3116103263 created "2021-01-05" @default.
- W3116103263 creator A5008837660 @default.
- W3116103263 creator A5013903721 @default.
- W3116103263 creator A5022827476 @default.
- W3116103263 creator A5026062552 @default.
- W3116103263 creator A5039318240 @default.
- W3116103263 creator A5088337560 @default.
- W3116103263 date "2021-06-01" @default.
- W3116103263 modified "2023-10-16" @default.
- W3116103263 title "E<sup>2</sup>bird: <u>E</u>nhanced <u>E</u>lastic <u>B</u>atch for <u>I</u>mproving <u>R</u>esponsiveness and Throughput of <u>D</u>eep Learning Services" @default.
- W3116103263 cites W1979527452 @default.
- W3116103263 cites W2076063813 @default.
- W3116103263 cites W2155893237 @default.
- W3116103263 cites W2172654076 @default.
- W3116103263 cites W2285660444 @default.
- W3116103263 cites W2542459869 @default.
- W3116103263 cites W2625231790 @default.
- W3116103263 cites W2767650207 @default.
- W3116103263 cites W2798291715 @default.
- W3116103263 cites W2929502194 @default.
- W3116103263 cites W2934853022 @default.
- W3116103263 cites W2982157693 @default.
- W3116103263 cites W2998218113 @default.
- W3116103263 cites W3005664618 @default.
- W3116103263 cites W4231332361 @default.
- W3116103263 cites W4233628754 @default.
- W3116103263 cites W4235357114 @default.
- W3116103263 cites W4244330903 @default.
- W3116103263 cites W4249935458 @default.
- W3116103263 cites W4301361180 @default.
- W3116103263 doi "https://doi.org/10.1109/tpds.2020.3047638" @default.
- W3116103263 hasPublicationYear "2021" @default.
- W3116103263 type Work @default.
- W3116103263 sameAs 3116103263 @default.
- W3116103263 citedByCount "11" @default.
- W3116103263 countsByYear W31161032632021 @default.
- W3116103263 countsByYear W31161032632022 @default.
- W3116103263 countsByYear W31161032632023 @default.
- W3116103263 crossrefType "journal-article" @default.
- W3116103263 hasAuthorship W3116103263A5008837660 @default.
- W3116103263 hasAuthorship W3116103263A5013903721 @default.
- W3116103263 hasAuthorship W3116103263A5022827476 @default.
- W3116103263 hasAuthorship W3116103263A5026062552 @default.
- W3116103263 hasAuthorship W3116103263A5039318240 @default.
- W3116103263 hasAuthorship W3116103263A5088337560 @default.
- W3116103263 hasConcept C108583219 @default.
- W3116103263 hasConcept C111919701 @default.
- W3116103263 hasConcept C11413529 @default.
- W3116103263 hasConcept C119857082 @default.
- W3116103263 hasConcept C154945302 @default.
- W3116103263 hasConcept C157764524 @default.
- W3116103263 hasConcept C173608175 @default.
- W3116103263 hasConcept C2776214188 @default.
- W3116103263 hasConcept C31258907 @default.
- W3116103263 hasConcept C41008148 @default.
- W3116103263 hasConcept C45374587 @default.
- W3116103263 hasConcept C5119721 @default.
- W3116103263 hasConcept C555944384 @default.
- W3116103263 hasConcept C76155785 @default.
- W3116103263 hasConcept C82876162 @default.
- W3116103263 hasConceptScore W3116103263C108583219 @default.
- W3116103263 hasConceptScore W3116103263C111919701 @default.
- W3116103263 hasConceptScore W3116103263C11413529 @default.
- W3116103263 hasConceptScore W3116103263C119857082 @default.
- W3116103263 hasConceptScore W3116103263C154945302 @default.
- W3116103263 hasConceptScore W3116103263C157764524 @default.
- W3116103263 hasConceptScore W3116103263C173608175 @default.
- W3116103263 hasConceptScore W3116103263C2776214188 @default.
- W3116103263 hasConceptScore W3116103263C31258907 @default.
- W3116103263 hasConceptScore W3116103263C41008148 @default.
- W3116103263 hasConceptScore W3116103263C45374587 @default.
- W3116103263 hasConceptScore W3116103263C5119721 @default.
- W3116103263 hasConceptScore W3116103263C555944384 @default.
- W3116103263 hasConceptScore W3116103263C76155785 @default.
- W3116103263 hasConceptScore W3116103263C82876162 @default.
- W3116103263 hasFunder F4320321001 @default.
- W3116103263 hasIssue "6" @default.
- W3116103263 hasLocation W31161032631 @default.
- W3116103263 hasOpenAccess W3116103263 @default.
- W3116103263 hasPrimaryLocation W31161032631 @default.
- W3116103263 hasRelatedWork W2055243143 @default.
- W3116103263 hasRelatedWork W2111238207 @default.
- W3116103263 hasRelatedWork W2136583354 @default.
- W3116103263 hasRelatedWork W2611989081 @default.
- W3116103263 hasRelatedWork W2760721665 @default.
- W3116103263 hasRelatedWork W3008625068 @default.
- W3116103263 hasRelatedWork W3035501883 @default.
- W3116103263 hasRelatedWork W3128807919 @default.
- W3116103263 hasRelatedWork W3176411177 @default.
- W3116103263 hasRelatedWork W4375867731 @default.
- W3116103263 hasVolume "32" @default.
- W3116103263 isParatext "false" @default.
- W3116103263 isRetracted "false" @default.
- W3116103263 magId "3116103263" @default.
- W3116103263 workType "article" @default.
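
The entries above all share one visible shape: a leading dash, the subject, a predicate, an object, and a graph marker such as `@default.`. As a minimal sketch, assuming that shape holds for every line, the listing could be grouped by predicate as shown below. The regular expression, the `parse_listing` helper, and the `sample` lines are illustrative assumptions and are not part of SemOpenAlex or any of its tooling.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the listing: "- SUBJECT PREDICATE OBJECT @GRAPH."
TRIPLE_LINE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\w+)\.$')


def parse_listing(lines):
    """Group objects by predicate for every line that matches the assumed shape."""
    by_predicate = defaultdict(list)
    for line in lines:
        match = TRIPLE_LINE.match(line.strip())
        if match is None:
            continue  # skip the header and anything else that is not a triple line
        _subject, predicate, obj, _graph = match.groups()
        # Literal values are quoted in the listing; strip one pair of surrounding quotes.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        by_predicate[predicate].append(obj)
    return dict(by_predicate)


# A few lines copied from the listing above, used as sample input.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W3116103263> ?p ?o ?g. }',
    '- W3116103263 startingPage "1307" @default.',
    '- W3116103263 endingPage "1321" @default.',
    '- W3116103263 creator A5008837660 @default.',
    '- W3116103263 creator A5013903721 @default.',
    '- W3116103263 hasVolume "32" @default.',
]

record = parse_listing(sample)
print(record["creator"])
print(record["startingPage"], record["endingPage"])
```

Repeated predicates such as `creator`, `cites`, and `hasConcept` end up as lists with one element per line, while single-valued predicates such as `hasVolume` become one-element lists, so callers can treat both cases uniformly.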