Matches in SemOpenAlex for { <https://semopenalex.org/work/W4324119062> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
- W4324119062 abstract "Multi-agent reinforcement learning (MARL) has been plagued by low sample efficiency. It needs far more samples than human learning to achieve convergence and learn successful strategies. And this situation is more serious in continuous state and policy space. Episodic memory (EM), as an effective method to improve the sample efficiency of reinforcement learning (RL) by imitating the ability of human rapid learning, has currently made little effort in continuous policy space and MARL. Therefore, we propose a continuous policy multi-agent reinforcement learning method with generalizable episodic memory (ECM). It establishes a centralized memory parameter network and memory buffer for each agent, and updates memory through implicit planning, so that the episodic memory model can use neural networks to learn successful strategies from the past successful experience. Thus, the model can adapt to the continuous policy space. Moreover, ECM combines MARL's idea of decentralized execution and centralized training (CTDE) with episodic memory model to make the model adapt to multi-agent task environment. Simulation results show that ECM method can effectively improve the sample efficiency of MARL algorithm, and the learned strategy has higher accuracy." @default.
- W4324119062 created "2023-03-15" @default.
- W4324119062 creator A5036419883 @default.
- W4324119062 creator A5038920560 @default.
- W4324119062 creator A5048052584 @default.
- W4324119062 creator A5056168495 @default.
- W4324119062 date "2022-11-25" @default.
- W4324119062 modified "2023-09-23" @default.
- W4324119062 title "Continuous Policy Multi-Agent Deep Reinforcement Learning with Generalizable Episodic Memory" @default.
- W4324119062 cites W1542941925 @default.
- W4324119062 cites W2145339207 @default.
- W4324119062 cites W2964082094 @default.
- W4324119062 cites W2972128755 @default.
- W4324119062 doi "https://doi.org/10.1109/cac57257.2022.10055953" @default.
- W4324119062 hasPublicationYear "2022" @default.
- W4324119062 type Work @default.
- W4324119062 citedByCount "0" @default.
- W4324119062 crossrefType "proceedings-article" @default.
- W4324119062 hasAuthorship W4324119062A5036419883 @default.
- W4324119062 hasAuthorship W4324119062A5038920560 @default.
- W4324119062 hasAuthorship W4324119062A5048052584 @default.
- W4324119062 hasAuthorship W4324119062A5056168495 @default.
- W4324119062 hasConcept C119857082 @default.
- W4324119062 hasConcept C127413603 @default.
- W4324119062 hasConcept C154945302 @default.
- W4324119062 hasConcept C15744967 @default.
- W4324119062 hasConcept C162324750 @default.
- W4324119062 hasConcept C169760540 @default.
- W4324119062 hasConcept C169900460 @default.
- W4324119062 hasConcept C201995342 @default.
- W4324119062 hasConcept C2777303404 @default.
- W4324119062 hasConcept C2779436431 @default.
- W4324119062 hasConcept C2780451532 @default.
- W4324119062 hasConcept C41008148 @default.
- W4324119062 hasConcept C50522688 @default.
- W4324119062 hasConcept C50644808 @default.
- W4324119062 hasConcept C88576662 @default.
- W4324119062 hasConcept C97541855 @default.
- W4324119062 hasConceptScore W4324119062C119857082 @default.
- W4324119062 hasConceptScore W4324119062C127413603 @default.
- W4324119062 hasConceptScore W4324119062C154945302 @default.
- W4324119062 hasConceptScore W4324119062C15744967 @default.
- W4324119062 hasConceptScore W4324119062C162324750 @default.
- W4324119062 hasConceptScore W4324119062C169760540 @default.
- W4324119062 hasConceptScore W4324119062C169900460 @default.
- W4324119062 hasConceptScore W4324119062C201995342 @default.
- W4324119062 hasConceptScore W4324119062C2777303404 @default.
- W4324119062 hasConceptScore W4324119062C2779436431 @default.
- W4324119062 hasConceptScore W4324119062C2780451532 @default.
- W4324119062 hasConceptScore W4324119062C41008148 @default.
- W4324119062 hasConceptScore W4324119062C50522688 @default.
- W4324119062 hasConceptScore W4324119062C50644808 @default.
- W4324119062 hasConceptScore W4324119062C88576662 @default.
- W4324119062 hasConceptScore W4324119062C97541855 @default.
- W4324119062 hasLocation W43241190621 @default.
- W4324119062 hasOpenAccess W4324119062 @default.
- W4324119062 hasPrimaryLocation W43241190621 @default.
- W4324119062 hasRelatedWork W2401692973 @default.
- W4324119062 hasRelatedWork W2605292759 @default.
- W4324119062 hasRelatedWork W2783543950 @default.
- W4324119062 hasRelatedWork W3022038857 @default.
- W4324119062 hasRelatedWork W3038067716 @default.
- W4324119062 hasRelatedWork W3044461295 @default.
- W4324119062 hasRelatedWork W3074656709 @default.
- W4324119062 hasRelatedWork W3106528330 @default.
- W4324119062 hasRelatedWork W4319083788 @default.
- W4324119062 hasRelatedWork W1629725936 @default.
- W4324119062 isParatext "false" @default.
- W4324119062 isRetracted "false" @default.
- W4324119062 workType "article" @default.