Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385627070> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W4385627070 endingPage "14" @default.
- W4385627070 startingPage "1" @default.
- W4385627070 abstract "State-of-the-art multi-agent policy gradient (MAPG) methods have demonstrated convincing capability in many cooperative games. However, the exponentially growing joint-action space severely challenges the critic's value evaluation and hinders the performance of MAPG methods. To address this issue, we augment Central-Q policy gradient with a joint-action embedding function and propose Mutual-information Maximization MAPG (M3APG). The joint-action embedding function makes joint-actions contain information about state transitions, which improves the critic's generalization over the joint-action space by allowing it to infer joint-actions' outcomes. We theoretically prove that with a fixed joint-action embedding function, the convergence of M3APG is guaranteed. Experimental results on the StarCraft Multi-Agent Challenge (SMAC) demonstrate that M3APG gives evaluation results with better accuracy and outperforms other MAPG basic models across various maps of multiple difficulty levels. We empirically show that our joint-action embedding model can be extended to value-based multi-agent reinforcement learning methods and state-of-the-art MAPG methods. Finally, we run an ablation study to show that the use of mutual information in our method is necessary and effective." @default.
- W4385627070 created "2023-08-08" @default.
- W4385627070 creator A5000704525 @default.
- W4385627070 creator A5011772578 @default.
- W4385627070 creator A5028693655 @default.
- W4385627070 creator A5057318866 @default.
- W4385627070 creator A5074681163 @default.
- W4385627070 creator A5086169822 @default.
- W4385627070 date "2023-01-01" @default.
- W4385627070 modified "2023-10-12" @default.
- W4385627070 title "Leveraging Joint-action Embedding in Multi-agent Reinforcement Learning for Cooperative Games" @default.
- W4385627070 doi "https://doi.org/10.1109/tg.2023.3302694" @default.
- W4385627070 hasPublicationYear "2023" @default.
- W4385627070 type Work @default.
- W4385627070 citedByCount "0" @default.
- W4385627070 crossrefType "journal-article" @default.
- W4385627070 hasAuthorship W4385627070A5000704525 @default.
- W4385627070 hasAuthorship W4385627070A5011772578 @default.
- W4385627070 hasAuthorship W4385627070A5028693655 @default.
- W4385627070 hasAuthorship W4385627070A5057318866 @default.
- W4385627070 hasAuthorship W4385627070A5074681163 @default.
- W4385627070 hasAuthorship W4385627070A5086169822 @default.
- W4385627070 hasConcept C119857082 @default.
- W4385627070 hasConcept C121332964 @default.
- W4385627070 hasConcept C126255220 @default.
- W4385627070 hasConcept C127413603 @default.
- W4385627070 hasConcept C134306372 @default.
- W4385627070 hasConcept C14036430 @default.
- W4385627070 hasConcept C154945302 @default.
- W4385627070 hasConcept C170154142 @default.
- W4385627070 hasConcept C177148314 @default.
- W4385627070 hasConcept C18555067 @default.
- W4385627070 hasConcept C2776330181 @default.
- W4385627070 hasConcept C2780791683 @default.
- W4385627070 hasConcept C33923547 @default.
- W4385627070 hasConcept C41008148 @default.
- W4385627070 hasConcept C41608201 @default.
- W4385627070 hasConcept C62520636 @default.
- W4385627070 hasConcept C78458016 @default.
- W4385627070 hasConcept C86803240 @default.
- W4385627070 hasConcept C97541855 @default.
- W4385627070 hasConceptScore W4385627070C119857082 @default.
- W4385627070 hasConceptScore W4385627070C121332964 @default.
- W4385627070 hasConceptScore W4385627070C126255220 @default.
- W4385627070 hasConceptScore W4385627070C127413603 @default.
- W4385627070 hasConceptScore W4385627070C134306372 @default.
- W4385627070 hasConceptScore W4385627070C14036430 @default.
- W4385627070 hasConceptScore W4385627070C154945302 @default.
- W4385627070 hasConceptScore W4385627070C170154142 @default.
- W4385627070 hasConceptScore W4385627070C177148314 @default.
- W4385627070 hasConceptScore W4385627070C18555067 @default.
- W4385627070 hasConceptScore W4385627070C2776330181 @default.
- W4385627070 hasConceptScore W4385627070C2780791683 @default.
- W4385627070 hasConceptScore W4385627070C33923547 @default.
- W4385627070 hasConceptScore W4385627070C41008148 @default.
- W4385627070 hasConceptScore W4385627070C41608201 @default.
- W4385627070 hasConceptScore W4385627070C62520636 @default.
- W4385627070 hasConceptScore W4385627070C78458016 @default.
- W4385627070 hasConceptScore W4385627070C86803240 @default.
- W4385627070 hasConceptScore W4385627070C97541855 @default.
- W4385627070 hasLocation W43856270701 @default.
- W4385627070 hasOpenAccess W4385627070 @default.
- W4385627070 hasPrimaryLocation W43856270701 @default.
- W4385627070 hasRelatedWork W260766989 @default.
- W4385627070 hasRelatedWork W2959276766 @default.
- W4385627070 hasRelatedWork W2961085424 @default.
- W4385627070 hasRelatedWork W3022183679 @default.
- W4385627070 hasRelatedWork W3037422413 @default.
- W4385627070 hasRelatedWork W3139193008 @default.
- W4385627070 hasRelatedWork W4206669594 @default.
- W4385627070 hasRelatedWork W4295941380 @default.
- W4385627070 hasRelatedWork W4319083788 @default.
- W4385627070 hasRelatedWork W4377293004 @default.
- W4385627070 isParatext "false" @default.
- W4385627070 isRetracted "false" @default.
- W4385627070 workType "article" @default.
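
The items above all follow the same shape: a leading "- ", a subject, a predicate, an object (either a quoted literal or a bare identifier), and a trailing graph marker such as "@default.". As a minimal sketch, assuming that shape holds for every line, the listing could be read into a predicate-to-values mapping as shown here. The names `parse_line` and `group_by_predicate` are hypothetical helpers introduced for illustration and are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Assumed line shape: "- <subject> <predicate> <object> @<graph>."
# The object may contain spaces (e.g. a quoted abstract), so it is matched greedily.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_line(line):
    """Split one listing line into its parts, or return None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Quoted objects are treated as literals; anything else as an identifier.
    is_literal = len(obj) >= 2 and obj.startswith('"') and obj.endswith('"')
    return {
        "subject": subject,
        "predicate": predicate,
        "object": obj[1:-1] if is_literal else obj,
        "literal": is_literal,
        "graph": graph,
    }


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        row = parse_line(line)
        if row is not None:
            grouped[row["predicate"]].append(row["object"])
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        '- W4385627070 startingPage "1" @default.',
        '- W4385627070 endingPage "14" @default.',
        "- W4385627070 creator A5000704525 @default.",
        "- W4385627070 creator A5011772578 @default.",
    ]
    for predicate, values in group_by_predicate(sample).items():
        print(predicate, values)
```

Grouping by predicate keeps repeated properties such as creator, hasConcept, and hasRelatedWork together as lists, while single-valued properties such as doi or title end up as one-element lists.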