Matches in SemOpenAlex for { <https://semopenalex.org/work/W4312804952> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4312804952 abstract "In this paper we propose Multi-Agent Proxy Proximal Policy Optimization (MA3PO), a novel multi-agent deep reinforcement learning algorithm that tackles the challenge of cooperative continuous multi-agent control. Our method is driven by the observation that most existing multi-agent reinforcement learning algorithms mainly focus on discrete state/action spaces and are thus computationally infeasible when extended to environments with continuous state/action spaces. To address the issue of computational complexity and to better model intra-agent collaboration, we make use of the recently successful Proximal Policy Optimization algorithm that effectively explores continuous action spaces, and incorporate the notion of intrinsic motivation via meta-gradient methods so as to stimulate the behavior of individual agents in cooperative multi-agent settings. Towards these ends, we design proxy rewards to quantify the effect of individual agent-level intrinsic motivation on the team-level reward, and apply meta-gradient methods to leverage such an addition so that our algorithm can learn the team-level cumulative reward effectively. Experiments on various multi-agent reinforcement learning benchmark environments with continuous action spaces demonstrate that our algorithm is not only comparable with the existing state-of-the-art benchmarks, but also significantly reduces training time complexity." @default.
- W4312804952 created "2023-01-05" @default.
- W4312804952 creator A5001836351 @default.
- W4312804952 creator A5063127492 @default.
- W4312804952 creator A5073310872 @default.
- W4312804952 creator A5087474202 @default.
- W4312804952 date "2022-07-18" @default.
- W4312804952 modified "2023-09-27" @default.
- W4312804952 title "Meta Proximal Policy Optimization for Cooperative Multi-Agent Continuous Control" @default.
- W4312804952 cites W2026662445 @default.
- W4312804952 cites W2158782408 @default.
- W4312804952 cites W2617547828 @default.
- W4312804952 cites W2747213132 @default.
- W4312804952 cites W2963871073 @default.
- W4312804952 cites W3102824929 @default.
- W4312804952 doi "https://doi.org/10.1109/ijcnn55064.2022.9892004" @default.
- W4312804952 hasPublicationYear "2022" @default.
- W4312804952 type Work @default.
- W4312804952 citedByCount "0" @default.
- W4312804952 crossrefType "proceedings-article" @default.
- W4312804952 hasAuthorship W4312804952A5001836351 @default.
- W4312804952 hasAuthorship W4312804952A5063127492 @default.
- W4312804952 hasAuthorship W4312804952A5073310872 @default.
- W4312804952 hasAuthorship W4312804952A5087474202 @default.
- W4312804952 hasConcept C119857082 @default.
- W4312804952 hasConcept C121332964 @default.
- W4312804952 hasConcept C126255220 @default.
- W4312804952 hasConcept C13280743 @default.
- W4312804952 hasConcept C153083717 @default.
- W4312804952 hasConcept C154945302 @default.
- W4312804952 hasConcept C185798385 @default.
- W4312804952 hasConcept C205649164 @default.
- W4312804952 hasConcept C2780148112 @default.
- W4312804952 hasConcept C2780791683 @default.
- W4312804952 hasConcept C33923547 @default.
- W4312804952 hasConcept C41008148 @default.
- W4312804952 hasConcept C41550386 @default.
- W4312804952 hasConcept C62520636 @default.
- W4312804952 hasConcept C97541855 @default.
- W4312804952 hasConceptScore W4312804952C119857082 @default.
- W4312804952 hasConceptScore W4312804952C121332964 @default.
- W4312804952 hasConceptScore W4312804952C126255220 @default.
- W4312804952 hasConceptScore W4312804952C13280743 @default.
- W4312804952 hasConceptScore W4312804952C153083717 @default.
- W4312804952 hasConceptScore W4312804952C154945302 @default.
- W4312804952 hasConceptScore W4312804952C185798385 @default.
- W4312804952 hasConceptScore W4312804952C205649164 @default.
- W4312804952 hasConceptScore W4312804952C2780148112 @default.
- W4312804952 hasConceptScore W4312804952C2780791683 @default.
- W4312804952 hasConceptScore W4312804952C33923547 @default.
- W4312804952 hasConceptScore W4312804952C41008148 @default.
- W4312804952 hasConceptScore W4312804952C41550386 @default.
- W4312804952 hasConceptScore W4312804952C62520636 @default.
- W4312804952 hasConceptScore W4312804952C97541855 @default.
- W4312804952 hasFunder F4320306076 @default.
- W4312804952 hasLocation W43128049521 @default.
- W4312804952 hasOpenAccess W4312804952 @default.
- W4312804952 hasPrimaryLocation W43128049521 @default.
- W4312804952 hasRelatedWork W1565759249 @default.
- W4312804952 hasRelatedWork W3022038857 @default.
- W4312804952 hasRelatedWork W3095449511 @default.
- W4312804952 hasRelatedWork W3203000071 @default.
- W4312804952 hasRelatedWork W4307308173 @default.
- W4312804952 hasRelatedWork W4318621078 @default.
- W4312804952 hasRelatedWork W4318719223 @default.
- W4312804952 hasRelatedWork W4319083788 @default.
- W4312804952 hasRelatedWork W4360764167 @default.
- W4312804952 hasRelatedWork W4367191088 @default.
- W4312804952 isParatext "false" @default.
- W4312804952 isRetracted "false" @default.
- W4312804952 workType "article" @default.
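
The entries above all follow the shape `- subject predicate object @graph.`, so they can be grouped by predicate with a few lines of Python. This is only a minimal sketch: the regular expression and the `parse_listing` helper are hypothetical names introduced for illustration, not part of SemOpenAlex, and the sketch assumes every entry sits on a single line and ends in ` @<graph>.`.

```python
import re
from collections import defaultdict

# Assumed shape of one listing line: "- SUBJECT PREDICATE OBJECT @GRAPH."
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+?)\.$')

def parse_listing(lines):
    """Group objects by predicate; lines that do not match the shape are skipped."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # e.g. the query header or the pagination line
        _subject, predicate, obj, _graph = match.groups()
        # Drop the surrounding double quotes of literal values.
        grouped[predicate].append(obj.strip('"'))
    return dict(grouped)

sample = [
    '- W4312804952 date "2022-07-18" @default.',
    '- W4312804952 cites W2026662445 @default.',
    '- W4312804952 cites W2158782408 @default.',
    '- W4312804952 type Work @default.',
]
triples = parse_listing(sample)
print(triples["cites"])
```

Grouping by predicate keeps repeated properties such as `cites` or `hasRelatedWork` together as lists, while single-valued properties such as `date` end up as one-element lists.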