Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386869541> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
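The pattern in the header fixes the subject and leaves predicate, object, and graph open. Because every match listed here sits in the `@default` graph, a plain triple pattern should return the same rows from a SPARQL endpoint. The sketch below only builds the query text; the function name `build_work_query` is hypothetical, and no endpoint URL is assumed.

```python
def build_work_query(work_iri: str) -> str:
    """Return a SPARQL SELECT listing every predicate/object of one subject."""
    # Reject characters that cannot appear inside an IRI reference <...>.
    if any(ch in work_iri for ch in '<>"{}|^`\\ '):
        raise ValueError(f"not a valid IRI: {work_iri!r}")
    # Doubled braces are literal braces in an f-string.
    return f"SELECT ?p ?o WHERE {{ <{work_iri}> ?p ?o . }}"


query = build_work_query("https://semopenalex.org/work/W4386869541")
print(query)
```

The resulting string can be passed to any SPARQL client. If the data used named graphs, the pattern would need a `GRAPH ?g { ... }` wrapper to recover the graph column.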
- W4386869541 endingPage "11" @default.
- W4386869541 startingPage "1" @default.
- W4386869541 abstract "Despite the potential of Multi-Agent Reinforcement Learning (MARL) in addressing numerous complex tasks, training a single team of MARL agents to handle multiple diverse team tasks remains a challenge. In this paper, we introduce a novel Multi-task method based on Knowledge Transfer in cooperative MARL (MKT-MARL). By learning from task-specific teachers, our approach empowers a single team of agents to attain expert-level performance in multiple tasks. MKT-MARL utilizes a knowledge distillation algorithm specifically designed for the multi-agent architecture, which rapidly learns a team control policy incorporating common coordinated knowledge from the experience of task-specific teachers. Additionally, we enhance this training with teacher annealing, gradually shifting the model's learning from distillation towards environmental rewards. This enhancement helps the multi-task model surpass its single-task teachers. We extensively evaluate our algorithm using two commonly-used benchmarks: StarCraft II micro-management and multi-agent particle environment. The experimental results demonstrate that our algorithm outperforms both the single-task teachers and a jointly-trained team of agents. Extensive ablation experiments illustrate the effectiveness of the supervised knowledge transfer and the teacher annealing strategy." @default.
- W4386869541 created "2023-09-20" @default.
- W4386869541 creator A5028693655 @default.
- W4386869541 creator A5049451354 @default.
- W4386869541 creator A5052043913 @default.
- W4386869541 creator A5068063235 @default.
- W4386869541 creator A5083652259 @default.
- W4386869541 date "2023-01-01" @default.
- W4386869541 modified "2023-09-27" @default.
- W4386869541 title "Deep Multi-Task Multi-Agent Reinforcement Learning With Knowledge Transfer" @default.
- W4386869541 doi "https://doi.org/10.1109/tg.2023.3316697" @default.
- W4386869541 hasPublicationYear "2023" @default.
- W4386869541 type Work @default.
- W4386869541 citedByCount "0" @default.
- W4386869541 crossrefType "journal-article" @default.
- W4386869541 hasAuthorship W4386869541A5028693655 @default.
- W4386869541 hasAuthorship W4386869541A5049451354 @default.
- W4386869541 hasAuthorship W4386869541A5052043913 @default.
- W4386869541 hasAuthorship W4386869541A5068063235 @default.
- W4386869541 hasAuthorship W4386869541A5083652259 @default.
- W4386869541 hasConcept C107457646 @default.
- W4386869541 hasConcept C109007969 @default.
- W4386869541 hasConcept C119857082 @default.
- W4386869541 hasConcept C126980161 @default.
- W4386869541 hasConcept C127413603 @default.
- W4386869541 hasConcept C150899416 @default.
- W4386869541 hasConcept C151730666 @default.
- W4386869541 hasConcept C154945302 @default.
- W4386869541 hasConcept C201995342 @default.
- W4386869541 hasConcept C2776960227 @default.
- W4386869541 hasConcept C2780451532 @default.
- W4386869541 hasConcept C41008148 @default.
- W4386869541 hasConcept C56739046 @default.
- W4386869541 hasConcept C86803240 @default.
- W4386869541 hasConcept C92927620 @default.
- W4386869541 hasConcept C97541855 @default.
- W4386869541 hasConceptScore W4386869541C107457646 @default.
- W4386869541 hasConceptScore W4386869541C109007969 @default.
- W4386869541 hasConceptScore W4386869541C119857082 @default.
- W4386869541 hasConceptScore W4386869541C126980161 @default.
- W4386869541 hasConceptScore W4386869541C127413603 @default.
- W4386869541 hasConceptScore W4386869541C150899416 @default.
- W4386869541 hasConceptScore W4386869541C151730666 @default.
- W4386869541 hasConceptScore W4386869541C154945302 @default.
- W4386869541 hasConceptScore W4386869541C201995342 @default.
- W4386869541 hasConceptScore W4386869541C2776960227 @default.
- W4386869541 hasConceptScore W4386869541C2780451532 @default.
- W4386869541 hasConceptScore W4386869541C41008148 @default.
- W4386869541 hasConceptScore W4386869541C56739046 @default.
- W4386869541 hasConceptScore W4386869541C86803240 @default.
- W4386869541 hasConceptScore W4386869541C92927620 @default.
- W4386869541 hasConceptScore W4386869541C97541855 @default.
- W4386869541 hasLocation W43868695411 @default.
- W4386869541 hasOpenAccess W4386869541 @default.
- W4386869541 hasPrimaryLocation W43868695411 @default.
- W4386869541 hasRelatedWork W2960456850 @default.
- W4386869541 hasRelatedWork W3021430260 @default.
- W4386869541 hasRelatedWork W4281645081 @default.
- W4386869541 hasRelatedWork W4308262314 @default.
- W4386869541 hasRelatedWork W4312200629 @default.
- W4386869541 hasRelatedWork W4319083788 @default.
- W4386869541 hasRelatedWork W4379662533 @default.
- W4386869541 hasRelatedWork W4379983844 @default.
- W4386869541 hasRelatedWork W4382286161 @default.
- W4386869541 hasRelatedWork W4386213806 @default.
- W4386869541 isParatext "false" @default.
- W4386869541 isRetracted "false" @default.
- W4386869541 workType "article" @default.
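The abstract above describes teacher annealing as gradually shifting learning from distillation towards environmental rewards. One common way to express that idea is a mixing weight that decays over training. The sketch below is an illustration only, not the paper's implementation: the linear schedule, the scalar losses, and the names `teacher_weight` and `annealed_loss` are all assumptions, since the abstract does not specify them.

```python
def teacher_weight(step: int, total_steps: int) -> float:
    """Distillation weight, decayed linearly from 1.0 to 0.0 (assumed schedule)."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so steps past the horizon keep weight 0.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - progress


def annealed_loss(distill_loss: float, rl_loss: float,
                  step: int, total_steps: int) -> float:
    """Blend a distillation loss with a reward-driven loss (hypothetical form)."""
    lam = teacher_weight(step, total_steps)
    # Early training follows the teachers; late training follows the environment.
    return lam * distill_loss + (1.0 - lam) * rl_loss


if __name__ == "__main__":
    for step in (0, 50, 100):
        print(step, annealed_loss(2.0, 4.0, step, total_steps=100))
```

With this form the objective equals the distillation loss at the start of training and the reward-driven loss at the end, which is what would let a student move beyond its teachers once their guidance fades.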