Matches in SemOpenAlex for { <https://semopenalex.org/work/W3031015890> ?p ?o ?g. }
- W3031015890 abstract "Nonstationarity is a fundamental problem in cooperative multi-agent reinforcement learning (MARL): each agent must relearn information about the other agents' policies because the other agents are also learning, causing information to ring between agents and convergence to be slow. The MAILP model, introduced by Terry and Grammel (2020), is a novel model of information transfer during multi-agent learning. We use the MAILP model to show that increasing training centralization arbitrarily mitigates the slowing of convergence due to nonstationarity. The most centralized case of learning is parameter sharing, an uncommonly used MARL method, specific to environments with homogeneous agents, that bootstraps a single-agent reinforcement learning (RL) method and learns an identical policy for each agent. We experimentally replicate the result of increased learning centralization leading to better performance on the MARL benchmark set from Gupta et al. (2017). We further apply parameter sharing to 8 modern single-agent deep RL (DRL) methods for the first time in the literature. With this, we achieved the best documented performance on a set of MARL benchmarks and achieved up to 44 times more average reward in as little as 16% as many episodes compared to the documented parameter sharing arrangement. We finally offer a formal proof of a set of methods that allow parameter sharing to serve in environments with heterogeneous agents." @default.
- W3031015890 created "2020-06-05" @default.
- W3031015890 creator A5004194238 @default.
- W3031015890 creator A5005482780 @default.
- W3031015890 creator A5031738594 @default.
- W3031015890 creator A5044358295 @default.
- W3031015890 creator A5046752681 @default.
- W3031015890 creator A5071777589 @default.
- W3031015890 date "2020-05-27" @default.
- W3031015890 modified "2023-09-24" @default.
- W3031015890 title "Parameter Sharing is Surprisingly Useful for Multi-Agent Deep Reinforcement Learning." @default.
- W3031015890 cites W1521003796 @default.
- W3031015890 cites W1757796397 @default.
- W3031015890 cites W201409579 @default.
- W3031015890 cites W2099618002 @default.
- W3031015890 cites W2173248099 @default.
- W3031015890 cites W2296073425 @default.
- W3031015890 cites W2575731723 @default.
- W3031015890 cites W2736601468 @default.
- W3031015890 cites W2740377041 @default.
- W3031015890 cites W2761873684 @default.
- W3031015890 cites W2763208138 @default.
- W3031015890 cites W2779040504 @default.
- W3031015890 cites W2781726626 @default.
- W3031015890 cites W2786036274 @default.
- W3031015890 cites W2786928559 @default.
- W3031015890 cites W2787938642 @default.
- W3031015890 cites W2949464762 @default.
- W3031015890 cites W2949608212 @default.
- W3031015890 cites W2951984055 @default.
- W3031015890 cites W2964043796 @default.
- W3031015890 cites W2973525135 @default.
- W3031015890 cites W3034870364 @default.
- W3031015890 cites W3054726674 @default.
- W3031015890 cites W3090386093 @default.
- W3031015890 cites W3093287223 @default.
- W3031015890 hasPublicationYear "2020" @default.
- W3031015890 type Work @default.
- W3031015890 sameAs 3031015890 @default.
- W3031015890 citedByCount "11" @default.
- W3031015890 countsByYear W30310158902019 @default.
- W3031015890 countsByYear W30310158902020 @default.
- W3031015890 countsByYear W30310158902021 @default.
- W3031015890 crossrefType "posted-content" @default.
- W3031015890 hasAuthorship W3031015890A5004194238 @default.
- W3031015890 hasAuthorship W3031015890A5005482780 @default.
- W3031015890 hasAuthorship W3031015890A5031738594 @default.
- W3031015890 hasAuthorship W3031015890A5044358295 @default.
- W3031015890 hasAuthorship W3031015890A5046752681 @default.
- W3031015890 hasAuthorship W3031015890A5071777589 @default.
- W3031015890 hasConcept C109007969 @default.
- W3031015890 hasConcept C114614502 @default.
- W3031015890 hasConcept C119857082 @default.
- W3031015890 hasConcept C13280743 @default.
- W3031015890 hasConcept C136764020 @default.
- W3031015890 hasConcept C151730666 @default.
- W3031015890 hasConcept C154945302 @default.
- W3031015890 hasConcept C15744967 @default.
- W3031015890 hasConcept C162324750 @default.
- W3031015890 hasConcept C177264268 @default.
- W3031015890 hasConcept C185798385 @default.
- W3031015890 hasConcept C199360897 @default.
- W3031015890 hasConcept C205649164 @default.
- W3031015890 hasConcept C2776854237 @default.
- W3031015890 hasConcept C2777303404 @default.
- W3031015890 hasConcept C2780873155 @default.
- W3031015890 hasConcept C33923547 @default.
- W3031015890 hasConcept C41008148 @default.
- W3031015890 hasConcept C50522688 @default.
- W3031015890 hasConcept C66882249 @default.
- W3031015890 hasConcept C67203356 @default.
- W3031015890 hasConcept C77805123 @default.
- W3031015890 hasConcept C86803240 @default.
- W3031015890 hasConcept C92927620 @default.
- W3031015890 hasConcept C97541855 @default.
- W3031015890 hasConceptScore W3031015890C109007969 @default.
- W3031015890 hasConceptScore W3031015890C114614502 @default.
- W3031015890 hasConceptScore W3031015890C119857082 @default.
- W3031015890 hasConceptScore W3031015890C13280743 @default.
- W3031015890 hasConceptScore W3031015890C136764020 @default.
- W3031015890 hasConceptScore W3031015890C151730666 @default.
- W3031015890 hasConceptScore W3031015890C154945302 @default.
- W3031015890 hasConceptScore W3031015890C15744967 @default.
- W3031015890 hasConceptScore W3031015890C162324750 @default.
- W3031015890 hasConceptScore W3031015890C177264268 @default.
- W3031015890 hasConceptScore W3031015890C185798385 @default.
- W3031015890 hasConceptScore W3031015890C199360897 @default.
- W3031015890 hasConceptScore W3031015890C205649164 @default.
- W3031015890 hasConceptScore W3031015890C2776854237 @default.
- W3031015890 hasConceptScore W3031015890C2777303404 @default.
- W3031015890 hasConceptScore W3031015890C2780873155 @default.
- W3031015890 hasConceptScore W3031015890C33923547 @default.
- W3031015890 hasConceptScore W3031015890C41008148 @default.
- W3031015890 hasConceptScore W3031015890C50522688 @default.
- W3031015890 hasConceptScore W3031015890C66882249 @default.
- W3031015890 hasConceptScore W3031015890C67203356 @default.
- W3031015890 hasConceptScore W3031015890C77805123 @default.
- W3031015890 hasConceptScore W3031015890C86803240 @default.
- W3031015890 hasConceptScore W3031015890C92927620 @default.
- W3031015890 hasConceptScore W3031015890C97541855 @default.