Matches in SemOpenAlex for { <https://semopenalex.org/work/W4384519332> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W4384519332 endingPage "1361" @default.
- W4384519332 startingPage "1351" @default.
- W4384519332 abstract "Value function factorization via centralized training and decentralized execution is promising for solving cooperative multi-agent reinforcement tasks. One of the approaches in this area, QMIX, has become state-of-the-art and achieved the best performance on the StarCraft II micromanagement benchmark. However, the monotonic-mixing of per agent estimates in QMIX is known to restrict the joint action Q-values it can represent, as well as the insufficient global state information for single agent value function estimation, often resulting in suboptimality. To this end, we present LSF-SAC, a novel framework that features a variational inference-based information-sharing mechanism as extra state information to assist individual agents in the value function factorization. We demonstrate that such latent individual state information sharing can significantly expand the power of value function factorization, while fully decentralized execution can still be maintained in LSF-SAC through a soft-actor-critic design. We evaluate LSF-SAC on the StarCraft II micromanagement challenge and demonstrate that it outperforms several state-of-the-art methods in challenging collaborative tasks. We further set extensive ablation studies for locating the key factors accounting for its performance improvements. We believe that this new insight can lead to new local value estimation methods and variational deep learning algorithms. A demo video and code of implementation can be found at https://sites.google.com/view/sacmm." @default.
- W4384519332 created "2023-07-18" @default.
- W4384519332 creator A5001402729 @default.
- W4384519332 creator A5018464968 @default.
- W4384519332 creator A5064822688 @default.
- W4384519332 date "2023-10-01" @default.
- W4384519332 modified "2023-10-17" @default.
- W4384519332 title "Value Functions Factorization With Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients" @default.
- W4384519332 doi "https://doi.org/10.1109/tetci.2023.3293193" @default.
- W4384519332 hasPublicationYear "2023" @default.
- W4384519332 type Work @default.
- W4384519332 citedByCount "1" @default.
- W4384519332 crossrefType "journal-article" @default.
- W4384519332 hasAuthorship W4384519332A5001402729 @default.
- W4384519332 hasAuthorship W4384519332A5018464968 @default.
- W4384519332 hasAuthorship W4384519332A5064822688 @default.
- W4384519332 hasBestOaLocation W43845193322 @default.
- W4384519332 hasConcept C11413529 @default.
- W4384519332 hasConcept C119857082 @default.
- W4384519332 hasConcept C126255220 @default.
- W4384519332 hasConcept C13280743 @default.
- W4384519332 hasConcept C14036430 @default.
- W4384519332 hasConcept C14646407 @default.
- W4384519332 hasConcept C154945302 @default.
- W4384519332 hasConcept C177264268 @default.
- W4384519332 hasConcept C185798385 @default.
- W4384519332 hasConcept C187834632 @default.
- W4384519332 hasConcept C199360897 @default.
- W4384519332 hasConcept C205649164 @default.
- W4384519332 hasConcept C2776214188 @default.
- W4384519332 hasConcept C33923547 @default.
- W4384519332 hasConcept C41008148 @default.
- W4384519332 hasConcept C48103436 @default.
- W4384519332 hasConcept C78458016 @default.
- W4384519332 hasConcept C80444323 @default.
- W4384519332 hasConcept C86803240 @default.
- W4384519332 hasConcept C97541855 @default.
- W4384519332 hasConceptScore W4384519332C11413529 @default.
- W4384519332 hasConceptScore W4384519332C119857082 @default.
- W4384519332 hasConceptScore W4384519332C126255220 @default.
- W4384519332 hasConceptScore W4384519332C13280743 @default.
- W4384519332 hasConceptScore W4384519332C14036430 @default.
- W4384519332 hasConceptScore W4384519332C14646407 @default.
- W4384519332 hasConceptScore W4384519332C154945302 @default.
- W4384519332 hasConceptScore W4384519332C177264268 @default.
- W4384519332 hasConceptScore W4384519332C185798385 @default.
- W4384519332 hasConceptScore W4384519332C187834632 @default.
- W4384519332 hasConceptScore W4384519332C199360897 @default.
- W4384519332 hasConceptScore W4384519332C205649164 @default.
- W4384519332 hasConceptScore W4384519332C2776214188 @default.
- W4384519332 hasConceptScore W4384519332C33923547 @default.
- W4384519332 hasConceptScore W4384519332C41008148 @default.
- W4384519332 hasConceptScore W4384519332C48103436 @default.
- W4384519332 hasConceptScore W4384519332C78458016 @default.
- W4384519332 hasConceptScore W4384519332C80444323 @default.
- W4384519332 hasConceptScore W4384519332C86803240 @default.
- W4384519332 hasConceptScore W4384519332C97541855 @default.
- W4384519332 hasIssue "5" @default.
- W4384519332 hasLocation W43845193321 @default.
- W4384519332 hasLocation W43845193322 @default.
- W4384519332 hasOpenAccess W4384519332 @default.
- W4384519332 hasPrimaryLocation W43845193321 @default.
- W4384519332 hasRelatedWork W2729602312 @default.
- W4384519332 hasRelatedWork W2950892788 @default.
- W4384519332 hasRelatedWork W3035493430 @default.
- W4384519332 hasRelatedWork W3040891685 @default.
- W4384519332 hasRelatedWork W3206637819 @default.
- W4384519332 hasRelatedWork W3214094365 @default.
- W4384519332 hasRelatedWork W4226345898 @default.
- W4384519332 hasRelatedWork W4303494752 @default.
- W4384519332 hasRelatedWork W4307468682 @default.
- W4384519332 hasRelatedWork W4319083788 @default.
- W4384519332 hasVolume "7" @default.
- W4384519332 isParatext "false" @default.
- W4384519332 isRetracted "false" @default.
- W4384519332 workType "article" @default.