Matches in SemOpenAlex for { <https://semopenalex.org/work/W4302012809> ?p ?o ?g. }
Showing items 1 to 66 of 66 with 100 items per page.
- W4302012809 abstract "Reinforcement learning (RL) trains many agents, which is resource-intensive and must scale to large GPU clusters. Different RL training algorithms offer different opportunities for distributing and parallelising the computation. Yet, current distributed RL systems tie the definition of RL algorithms to their distributed execution: they hard-code particular distribution strategies and only accelerate specific parts of the computation (e.g. policy network updates) on GPU workers. Fundamentally, current systems lack abstractions that decouple RL algorithms from their execution. We describe MindSpore Reinforcement Learning (MSRL), a distributed RL training system that supports distribution policies that govern how RL training computation is parallelised and distributed on cluster resources, without requiring changes to the algorithm implementation. MSRL introduces the new abstraction of a fragmented dataflow graph, which maps Python functions from an RL algorithm's training loop to parallel computational fragments. Fragments are executed on different devices by translating them to low-level dataflow representations, e.g. computational graphs as supported by deep learning engines, CUDA implementations or multi-threaded CPU processes. We show that MSRL subsumes the distribution strategies of existing systems, while scaling RL training to 64 GPUs." @default.
- W4302012809 created "2022-10-06" @default.
- W4302012809 creator A5010738468 @default.
- W4302012809 creator A5015924291 @default.
- W4302012809 creator A5045067652 @default.
- W4302012809 creator A5072678098 @default.
- W4302012809 creator A5073799726 @default.
- W4302012809 creator A5078842469 @default.
- W4302012809 creator A5085295350 @default.
- W4302012809 creator A5090203467 @default.
- W4302012809 date "2022-10-03" @default.
- W4302012809 modified "2023-10-18" @default.
- W4302012809 title "MSRL: Distributed Reinforcement Learning with Dataflow Fragments" @default.
- W4302012809 doi "https://doi.org/10.48550/arxiv.2210.00882" @default.
- W4302012809 hasPublicationYear "2022" @default.
- W4302012809 type Work @default.
- W4302012809 citedByCount "0" @default.
- W4302012809 crossrefType "posted-content" @default.
- W4302012809 hasAuthorship W4302012809A5010738468 @default.
- W4302012809 hasAuthorship W4302012809A5015924291 @default.
- W4302012809 hasAuthorship W4302012809A5045067652 @default.
- W4302012809 hasAuthorship W4302012809A5072678098 @default.
- W4302012809 hasAuthorship W4302012809A5073799726 @default.
- W4302012809 hasAuthorship W4302012809A5078842469 @default.
- W4302012809 hasAuthorship W4302012809A5085295350 @default.
- W4302012809 hasAuthorship W4302012809A5090203467 @default.
- W4302012809 hasBestOaLocation W43020128091 @default.
- W4302012809 hasConcept C120314980 @default.
- W4302012809 hasConcept C154945302 @default.
- W4302012809 hasConcept C173608175 @default.
- W4302012809 hasConcept C199360897 @default.
- W4302012809 hasConcept C26713055 @default.
- W4302012809 hasConcept C2778119891 @default.
- W4302012809 hasConcept C41008148 @default.
- W4302012809 hasConcept C45374587 @default.
- W4302012809 hasConcept C519991488 @default.
- W4302012809 hasConcept C96324660 @default.
- W4302012809 hasConcept C97541855 @default.
- W4302012809 hasConceptScore W4302012809C120314980 @default.
- W4302012809 hasConceptScore W4302012809C154945302 @default.
- W4302012809 hasConceptScore W4302012809C173608175 @default.
- W4302012809 hasConceptScore W4302012809C199360897 @default.
- W4302012809 hasConceptScore W4302012809C26713055 @default.
- W4302012809 hasConceptScore W4302012809C2778119891 @default.
- W4302012809 hasConceptScore W4302012809C41008148 @default.
- W4302012809 hasConceptScore W4302012809C45374587 @default.
- W4302012809 hasConceptScore W4302012809C519991488 @default.
- W4302012809 hasConceptScore W4302012809C96324660 @default.
- W4302012809 hasConceptScore W4302012809C97541855 @default.
- W4302012809 hasLocation W43020128091 @default.
- W4302012809 hasLocation W43020128092 @default.
- W4302012809 hasOpenAccess W4302012809 @default.
- W4302012809 hasPrimaryLocation W43020128091 @default.
- W4302012809 hasRelatedWork W1572523360 @default.
- W4302012809 hasRelatedWork W1587906417 @default.
- W4302012809 hasRelatedWork W184060744 @default.
- W4302012809 hasRelatedWork W1969934278 @default.
- W4302012809 hasRelatedWork W2017579069 @default.
- W4302012809 hasRelatedWork W2047588290 @default.
- W4302012809 hasRelatedWork W2100229967 @default.
- W4302012809 hasRelatedWork W2285697638 @default.
- W4302012809 hasRelatedWork W2295371547 @default.
- W4302012809 hasRelatedWork W2968111836 @default.
- W4302012809 isParatext "false" @default.
- W4302012809 isRetracted "false" @default.
- W4302012809 workType "article" @default.
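
Each entry above is one match for the pattern `<https://semopenalex.org/work/W4302012809> ?p ?o ?g`: a subject, a predicate (`?p`), an object (`?o`) and a graph (`?g`, here always `default`). The following is a minimal sketch of reading such lines back into quads, assuming the layout `- SUBJECT PREDICATE OBJECT @GRAPH.` seen in this listing. The names `Quad`, `LINE_RE` and `parse_line` are hypothetical helpers introduced for illustration; they are not part of any SemOpenAlex API.

```python
import re
from typing import NamedTuple


class Quad(NamedTuple):
    """One listing entry: subject, predicate, object and named graph."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed line layout (taken from this listing, not from a specification):
#   - SUBJECT PREDICATE OBJECT @GRAPH.
# The object may contain spaces (e.g. a quoted abstract), so it is matched
# greedily up to the last " @GRAPH" marker on the line.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\S+?)\.?\s*$', re.DOTALL)


def parse_line(line: str) -> Quad:
    """Parse a single listing line into a Quad, or raise ValueError."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a listing entry: {line!r}")
    subject, predicate, obj, graph = match.groups()
    # Literal values are wrapped in double quotes; identifiers are bare.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return Quad(subject, predicate, obj, graph)


if __name__ == "__main__":
    sample = [
        '- W4302012809 date "2022-10-03" @default.',
        '- W4302012809 type Work @default.',
        '- W4302012809 hasRelatedWork W1572523360 @default.',
    ]
    for entry in sample:
        print(parse_line(entry))
```

Grouping the parsed quads by `predicate` gives, for example, all `creator` or `hasRelatedWork` identifiers for the work in one pass.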