Matches in SemOpenAlex for { <https://semopenalex.org/work/W4362653492> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W4362653492 endingPage "1829" @default.
- W4362653492 startingPage "1816" @default.
- W4362653492 abstract "Despite the recent success of Deep Reinforcement Learning (DRL) in self-driving cars, robotics and surveillance, training DRL agents takes tremendous amount of time and computation resources. In this article, we aim to accelerate DRL with Prioritized Replay Buffer due to its state-of-the-art performance on various benchmarks. The computation primitives of DRL with Prioritized Replay Buffer include environment emulation, neural network inference, sampling from Prioritized Replay Buffer, updating Prioritized Replay Buffer and neural network training. The speed of running these primitives varies for various DRL algorithms such as Deep Q Network and Deep Deterministic Policy Gradient. This makes a fixed mapping of DRL algorithms inefficient. In this work, we propose a framework for mapping DRL algorithms onto heterogeneous platforms consisting of a multi-core CPU, a GPU and a FPGA. First, we develop specific accelerators for each primitive on CPU, FPGA and GPU. Second, we relax the data dependency between priority update and sampling performed in the Prioritized Replay Buffer. By doing so, the latency caused by data transfer between GPU, FPGA and CPU can be completely hidden without sacrificing the rewards achieved by agents learned using the target DRL algorithms. Finally, given a DRL algorithm specification, our design space exploration automatically chooses the optimal mapping of various primitives based on an analytical performance model. On widely used benchmark environments, our experimental results demonstrate up to 997.3× improvement in training throughput compared with baseline mappings on the same heterogeneous platform. Compared with the state-of-the-art distributed Reinforcement Learning framework RLlib, we achieve 1.06× ∼ 1005× improvement in training throughput." @default.
- W4362653492 created "2023-04-07" @default.
- W4362653492 creator A5020242643 @default.
- W4362653492 creator A5033166029 @default.
- W4362653492 creator A5050528589 @default.
- W4362653492 date "2023-06-01" @default.
- W4362653492 modified "2023-09-27" @default.
- W4362653492 title "A Framework for Mapping DRL Algorithms With Prioritized Replay Buffer Onto Heterogeneous Platforms" @default.
- W4362653492 cites W1994616650 @default.
- W4362653492 cites W2107726111 @default.
- W4362653492 cites W2152083440 @default.
- W4362653492 cites W2173213060 @default.
- W4362653492 cites W2257979135 @default.
- W4362653492 cites W2343695530 @default.
- W4362653492 cites W2604883922 @default.
- W4362653492 cites W2763502612 @default.
- W4362653492 cites W2889068523 @default.
- W4362653492 cites W2931767035 @default.
- W4362653492 cites W2971573826 @default.
- W4362653492 cites W3035681682 @default.
- W4362653492 cites W3108085245 @default.
- W4362653492 cites W4226369037 @default.
- W4362653492 cites W4229017035 @default.
- W4362653492 doi "https://doi.org/10.1109/tpds.2023.3264823" @default.
- W4362653492 hasPublicationYear "2023" @default.
- W4362653492 type Work @default.
- W4362653492 citedByCount "0" @default.
- W4362653492 crossrefType "journal-article" @default.
- W4362653492 hasAuthorship W4362653492A5020242643 @default.
- W4362653492 hasAuthorship W4362653492A5033166029 @default.
- W4362653492 hasAuthorship W4362653492A5050528589 @default.
- W4362653492 hasConcept C113775141 @default.
- W4362653492 hasConcept C11413529 @default.
- W4362653492 hasConcept C120314980 @default.
- W4362653492 hasConcept C13280743 @default.
- W4362653492 hasConcept C149635348 @default.
- W4362653492 hasConcept C154945302 @default.
- W4362653492 hasConcept C173608175 @default.
- W4362653492 hasConcept C185798385 @default.
- W4362653492 hasConcept C205649164 @default.
- W4362653492 hasConcept C41008148 @default.
- W4362653492 hasConcept C42935608 @default.
- W4362653492 hasConcept C45374587 @default.
- W4362653492 hasConcept C50644808 @default.
- W4362653492 hasConcept C97541855 @default.
- W4362653492 hasConceptScore W4362653492C113775141 @default.
- W4362653492 hasConceptScore W4362653492C11413529 @default.
- W4362653492 hasConceptScore W4362653492C120314980 @default.
- W4362653492 hasConceptScore W4362653492C13280743 @default.
- W4362653492 hasConceptScore W4362653492C149635348 @default.
- W4362653492 hasConceptScore W4362653492C154945302 @default.
- W4362653492 hasConceptScore W4362653492C173608175 @default.
- W4362653492 hasConceptScore W4362653492C185798385 @default.
- W4362653492 hasConceptScore W4362653492C205649164 @default.
- W4362653492 hasConceptScore W4362653492C41008148 @default.
- W4362653492 hasConceptScore W4362653492C42935608 @default.
- W4362653492 hasConceptScore W4362653492C45374587 @default.
- W4362653492 hasConceptScore W4362653492C50644808 @default.
- W4362653492 hasConceptScore W4362653492C97541855 @default.
- W4362653492 hasIssue "6" @default.
- W4362653492 hasLocation W43626534921 @default.
- W4362653492 hasOpenAccess W4362653492 @default.
- W4362653492 hasPrimaryLocation W43626534921 @default.
- W4362653492 hasRelatedWork W1485630101 @default.
- W4362653492 hasRelatedWork W1549325751 @default.
- W4362653492 hasRelatedWork W173521803 @default.
- W4362653492 hasRelatedWork W2020875658 @default.
- W4362653492 hasRelatedWork W3074294383 @default.
- W4362653492 hasRelatedWork W3153007185 @default.
- W4362653492 hasRelatedWork W4200614495 @default.
- W4362653492 hasRelatedWork W4252660273 @default.
- W4362653492 hasRelatedWork W4296474751 @default.
- W4362653492 hasRelatedWork W4308659767 @default.
- W4362653492 hasVolume "34" @default.
- W4362653492 isParatext "false" @default.
- W4362653492 isRetracted "false" @default.
- W4362653492 workType "article" @default.