Matches in SemOpenAlex for { <https://semopenalex.org/work/W4298091911> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W4298091911 endingPage "260" @default.
- W4298091911 startingPage "213" @default.
- W4298091911 abstract "StarCraft II (SC2) poses a grand challenge for reinforcement learning (RL), of which the main difficulties include huge state space, varying action space, and a long time horizon. In this work, we investigate a set of RL techniques for the full-length game of StarCraft II. We investigate a hierarchical RL approach, where the hierarchy involves two. One is the extracted macro-actions from experts’ demonstration trajectories to reduce the action space in an order of magnitude. The other is a hierarchical architecture of neural networks, which is modular and facilitates scale. We investigate a curriculum transfer training procedure that trains the agent from the simplest level to the hardest level. We train the agent on a single machine with 4 GPUs and 48 CPU threads. On a 64x64 map and using restrictive units, we achieve a win rate of 99% against the difficulty level-1 built-in AI. Through the curriculum transfer learning algorithm and a mixture of combat models, we achieve a 93% win rate against the most difficult non-cheating level built-in AI (level-7). In this extended version of the paper, we improve our architecture to train the agent against the most difficult cheating level AIs (level-8, level-9, and level-10). We also test our method on different maps to evaluate the extensibility of our approach. By a final 3-layer hierarchical architecture and applying significant tricks to train SC2 agents, we increase the win rate against the level-8, level-9, and level-10 to 96%, 97%, and 94%, respectively. Our codes and models are all open-sourced now at https://github.com/liuruoze/HierNet-SC2. To provide a baseline referring the AlphaStar for our work as well as the research and open-source community, we reproduce a scaled-down version of it, mini-AlphaStar (mAS). The latest version of mAS is 1.07, which can be trained using supervised learning and reinforcement learning on the raw action space which has 564 actions. It is designed to run training on a single common machine, by making the hyper-parameters adjustable and some settings simplified. We then can compare our work with mAS using the same computing resources and training time. By experiment results, we show that our method is more effective when using limited resources. The inference and training codes of mini-AlphaStar are all open-sourced at https://github.com/liuruoze/mini-AlphaStar. We hope our study could shed some light on the future research of efficient reinforcement learning on SC2 and other large-scale games." @default.
- W4298091911 created "2022-10-01" @default.
- W4298091911 creator A5021058812 @default.
- W4298091911 creator A5039038726 @default.
- W4298091911 creator A5062687402 @default.
- W4298091911 creator A5073240013 @default.
- W4298091911 creator A5086664264 @default.
- W4298091911 creator A5088385429 @default.
- W4298091911 date "2022-09-29" @default.
- W4298091911 modified "2023-10-01" @default.
- W4298091911 title "On Efficient Reinforcement Learning for Full-length Game of StarCraft II" @default.
- W4298091911 doi "https://doi.org/10.1613/jair.1.13743" @default.
- W4298091911 hasPublicationYear "2022" @default.
- W4298091911 type Work @default.
- W4298091911 citedByCount "0" @default.
- W4298091911 crossrefType "journal-article" @default.
- W4298091911 hasAuthorship W4298091911A5021058812 @default.
- W4298091911 hasAuthorship W4298091911A5039038726 @default.
- W4298091911 hasAuthorship W4298091911A5062687402 @default.
- W4298091911 hasAuthorship W4298091911A5073240013 @default.
- W4298091911 hasAuthorship W4298091911A5086664264 @default.
- W4298091911 hasAuthorship W4298091911A5088385429 @default.
- W4298091911 hasBestOaLocation W42980919111 @default.
- W4298091911 hasConcept C101468663 @default.
- W4298091911 hasConcept C111919701 @default.
- W4298091911 hasConcept C119857082 @default.
- W4298091911 hasConcept C121332964 @default.
- W4298091911 hasConcept C123657996 @default.
- W4298091911 hasConcept C142362112 @default.
- W4298091911 hasConcept C153349607 @default.
- W4298091911 hasConcept C154945302 @default.
- W4298091911 hasConcept C162324750 @default.
- W4298091911 hasConcept C177264268 @default.
- W4298091911 hasConcept C199360897 @default.
- W4298091911 hasConcept C2780791683 @default.
- W4298091911 hasConcept C31170391 @default.
- W4298091911 hasConcept C34447519 @default.
- W4298091911 hasConcept C41008148 @default.
- W4298091911 hasConcept C62520636 @default.
- W4298091911 hasConcept C97541855 @default.
- W4298091911 hasConceptScore W4298091911C101468663 @default.
- W4298091911 hasConceptScore W4298091911C111919701 @default.
- W4298091911 hasConceptScore W4298091911C119857082 @default.
- W4298091911 hasConceptScore W4298091911C121332964 @default.
- W4298091911 hasConceptScore W4298091911C123657996 @default.
- W4298091911 hasConceptScore W4298091911C142362112 @default.
- W4298091911 hasConceptScore W4298091911C153349607 @default.
- W4298091911 hasConceptScore W4298091911C154945302 @default.
- W4298091911 hasConceptScore W4298091911C162324750 @default.
- W4298091911 hasConceptScore W4298091911C177264268 @default.
- W4298091911 hasConceptScore W4298091911C199360897 @default.
- W4298091911 hasConceptScore W4298091911C2780791683 @default.
- W4298091911 hasConceptScore W4298091911C31170391 @default.
- W4298091911 hasConceptScore W4298091911C34447519 @default.
- W4298091911 hasConceptScore W4298091911C41008148 @default.
- W4298091911 hasConceptScore W4298091911C62520636 @default.
- W4298091911 hasConceptScore W4298091911C97541855 @default.
- W4298091911 hasLocation W42980919111 @default.
- W4298091911 hasLocation W42980919112 @default.
- W4298091911 hasOpenAccess W4298091911 @default.
- W4298091911 hasPrimaryLocation W42980919111 @default.
- W4298091911 hasRelatedWork W2045236383 @default.
- W4298091911 hasRelatedWork W260766989 @default.
- W4298091911 hasRelatedWork W2959276766 @default.
- W4298091911 hasRelatedWork W2961085424 @default.
- W4298091911 hasRelatedWork W2976657239 @default.
- W4298091911 hasRelatedWork W3074294383 @default.
- W4298091911 hasRelatedWork W4206669594 @default.
- W4298091911 hasRelatedWork W4295941380 @default.
- W4298091911 hasRelatedWork W4319083788 @default.
- W4298091911 hasRelatedWork W2121778218 @default.
- W4298091911 hasVolume "75" @default.
- W4298091911 isParatext "false" @default.
- W4298091911 isRetracted "false" @default.
- W4298091911 workType "article" @default.