Matches in SemOpenAlex for { <https://semopenalex.org/work/W2891925283> ?p ?o ?g. }
- W2891925283 abstract "In Multi-Agent Reinforcement Learning (MA-RL), independent cooperative learners must overcome a number of pathologies to learn optimal joint policies. Addressing one pathology often leaves approaches vulnerable towards others. For instance, hysteretic Q-learning addresses miscoordination while leaving agents vulnerable towards misleading stochastic rewards. Other methods, such as leniency, have proven more robust when dealing with multiple pathologies simultaneously. However, leniency has predominantly been studied within the context of strategic form games (bimatrix games) and fully observable Markov games consisting of a small number of probabilistic state transitions. This raises the question of whether these findings scale to more complex domains. For this purpose we implement a temporally extended version of the Climb Game, within which agents must overcome multiple pathologies simultaneously, including relative overgeneralisation, stochasticity, the alter-exploration and moving target problems, while learning from a large observation space. We find that existing lenient and hysteretic approaches fail to consistently learn near-optimal joint policies in this environment. To address these pathologies we introduce Negative Update Intervals-DDQN (NUI-DDQN), a Deep MA-RL algorithm which discards episodes yielding cumulative rewards outside the range of expanding intervals. NUI-DDQN consistently gravitates towards optimal joint policies in our environment, overcoming the outlined pathologies." @default.
- W2891925283 created "2018-09-27" @default.
- W2891925283 creator A5008547992 @default.
- W2891925283 creator A5009933171 @default.
- W2891925283 creator A5084692264 @default.
- W2891925283 date "2018-09-13" @default.
- W2891925283 modified "2023-09-27" @default.
- W2891925283 title "Negative Update Intervals in Deep Multi-Agent Reinforcement Learning" @default.
- W2891925283 cites W137071854 @default.
- W2891925283 cites W1522301498 @default.
- W2891925283 cites W206679605 @default.
- W2891925283 cites W2096145798 @default.
- W2891925283 cites W2104602264 @default.
- W2891925283 cites W2108892923 @default.
- W2891925283 cites W2109910161 @default.
- W2891925283 cites W2113839990 @default.
- W2891925283 cites W2120327309 @default.
- W2891925283 cites W2138076440 @default.
- W2891925283 cites W2138757993 @default.
- W2891925283 cites W2141559645 @default.
- W2891925283 cites W2145339207 @default.
- W2891925283 cites W2155968351 @default.
- W2891925283 cites W2466211196 @default.
- W2891925283 cites W2523728418 @default.
- W2891925283 cites W2594794854 @default.
- W2891925283 cites W2623431351 @default.
- W2891925283 cites W2768629321 @default.
- W2891925283 cites W2788266634 @default.
- W2891925283 cites W2798511001 @default.
- W2891925283 cites W2949201811 @default.
- W2891925283 cites W2949561945 @default.
- W2891925283 cites W2951896791 @default.
- W2891925283 cites W2963041255 @default.
- W2891925283 cites W2963485523 @default.
- W2891925283 cites W2989068617 @default.
- W2891925283 cites W3093287223 @default.
- W2891925283 hasPublicationYear "2018" @default.
- W2891925283 type Work @default.
- W2891925283 sameAs 2891925283 @default.
- W2891925283 citedByCount "2" @default.
- W2891925283 countsByYear W28919252832018 @default.
- W2891925283 countsByYear W28919252832020 @default.
- W2891925283 crossrefType "posted-content" @default.
- W2891925283 hasAuthorship W2891925283A5008547992 @default.
- W2891925283 hasAuthorship W2891925283A5009933171 @default.
- W2891925283 hasAuthorship W2891925283A5084692264 @default.
- W2891925283 hasConcept C105795698 @default.
- W2891925283 hasConcept C106189395 @default.
- W2891925283 hasConcept C119857082 @default.
- W2891925283 hasConcept C127413603 @default.
- W2891925283 hasConcept C146978453 @default.
- W2891925283 hasConcept C151730666 @default.
- W2891925283 hasConcept C154945302 @default.
- W2891925283 hasConcept C159886148 @default.
- W2891925283 hasConcept C204323151 @default.
- W2891925283 hasConcept C2779343474 @default.
- W2891925283 hasConcept C33923547 @default.
- W2891925283 hasConcept C41008148 @default.
- W2891925283 hasConcept C49937458 @default.
- W2891925283 hasConcept C72434380 @default.
- W2891925283 hasConcept C86803240 @default.
- W2891925283 hasConcept C97541855 @default.
- W2891925283 hasConceptScore W2891925283C105795698 @default.
- W2891925283 hasConceptScore W2891925283C106189395 @default.
- W2891925283 hasConceptScore W2891925283C119857082 @default.
- W2891925283 hasConceptScore W2891925283C127413603 @default.
- W2891925283 hasConceptScore W2891925283C146978453 @default.
- W2891925283 hasConceptScore W2891925283C151730666 @default.
- W2891925283 hasConceptScore W2891925283C154945302 @default.
- W2891925283 hasConceptScore W2891925283C159886148 @default.
- W2891925283 hasConceptScore W2891925283C204323151 @default.
- W2891925283 hasConceptScore W2891925283C2779343474 @default.
- W2891925283 hasConceptScore W2891925283C33923547 @default.
- W2891925283 hasConceptScore W2891925283C41008148 @default.
- W2891925283 hasConceptScore W2891925283C49937458 @default.
- W2891925283 hasConceptScore W2891925283C72434380 @default.
- W2891925283 hasConceptScore W2891925283C86803240 @default.
- W2891925283 hasConceptScore W2891925283C97541855 @default.
- W2891925283 hasLocation W28919252831 @default.
- W2891925283 hasOpenAccess W2891925283 @default.
- W2891925283 hasPrimaryLocation W28919252831 @default.
- W2891925283 hasRelatedWork W103882264 @default.
- W2891925283 hasRelatedWork W1547532156 @default.
- W2891925283 hasRelatedWork W1809653203 @default.
- W2891925283 hasRelatedWork W2098992232 @default.
- W2891925283 hasRelatedWork W2148122001 @default.
- W2891925283 hasRelatedWork W2768498556 @default.
- W2891925283 hasRelatedWork W2775531040 @default.
- W2891925283 hasRelatedWork W2934523877 @default.
- W2891925283 hasRelatedWork W2945719718 @default.
- W2891925283 hasRelatedWork W2946845352 @default.
- W2891925283 hasRelatedWork W2950635145 @default.
- W2891925283 hasRelatedWork W2953749403 @default.
- W2891925283 hasRelatedWork W2986399309 @default.
- W2891925283 hasRelatedWork W3008974617 @default.
- W2891925283 hasRelatedWork W3022259484 @default.
- W2891925283 hasRelatedWork W3046275400 @default.
- W2891925283 hasRelatedWork W3101704277 @default.
- W2891925283 hasRelatedWork W3201286590 @default.
- W2891925283 hasRelatedWork W3209066245 @default.