Matches in SemOpenAlex for { <https://semopenalex.org/work/W4367190914> ?p ?o ?g. }
- W4367190914 abstract "We investigate whether Deep Reinforcement Learning (Deep RL) is able to synthesize sophisticated and safe movement skills for a low-cost, miniature humanoid robot that can be composed into complex behavioral strategies in dynamic environments. We used Deep RL to train a humanoid robot with 20 actuated joints to play a simplified one-versus-one (1v1) soccer game. We first trained individual skills in isolation and then composed those skills end-to-end in a self-play setting. The resulting policy exhibits robust and dynamic movement skills such as rapid fall recovery, walking, turning, kicking and more; and transitions between them in a smooth, stable, and efficient manner - well beyond what is intuitively expected from the robot. The agents also developed a basic strategic understanding of the game, and learned, for instance, to anticipate ball movements and to block opponent shots. The full range of behaviors emerged from a small set of simple rewards. Our agents were trained in simulation and transferred to real robots zero-shot. We found that a combination of sufficiently high-frequency control, targeted dynamics randomization, and perturbations during training in simulation enabled good-quality transfer, despite significant unmodeled effects and variations across robot instances. Although the robots are inherently fragile, minor hardware modifications together with basic regularization of the behavior during training led the robots to learn safe and effective movements while still performing in a dynamic and agile way. Indeed, even though the agents were optimized for scoring, in experiments they walked 156% faster, took 63% less time to get up, and kicked 24% faster than a scripted baseline, while efficiently combining the skills to achieve the longer term objectives. Examples of the emergent behaviors and full 1v1 matches are available on the supplementary website." @default.
- W4367190914 created "2023-04-28" @default.
- W4367190914 creator A5000038545 @default.
- W4367190914 creator A5002747297 @default.
- W4367190914 creator A5004482443 @default.
- W4367190914 creator A5005912318 @default.
- W4367190914 creator A5006431582 @default.
- W4367190914 creator A5008259889 @default.
- W4367190914 creator A5010509017 @default.
- W4367190914 creator A5010818667 @default.
- W4367190914 creator A5014567358 @default.
- W4367190914 creator A5018196238 @default.
- W4367190914 creator A5028236088 @default.
- W4367190914 creator A5030073861 @default.
- W4367190914 creator A5031943811 @default.
- W4367190914 creator A5037305533 @default.
- W4367190914 creator A5039840350 @default.
- W4367190914 creator A5054501100 @default.
- W4367190914 creator A5058953211 @default.
- W4367190914 creator A5062886897 @default.
- W4367190914 creator A5062951341 @default.
- W4367190914 creator A5063532849 @default.
- W4367190914 creator A5065489996 @default.
- W4367190914 creator A5071434961 @default.
- W4367190914 creator A5074380892 @default.
- W4367190914 creator A5077653936 @default.
- W4367190914 creator A5078419629 @default.
- W4367190914 creator A5079415139 @default.
- W4367190914 creator A5082738872 @default.
- W4367190914 creator A5090567851 @default.
- W4367190914 date "2023-04-26" @default.
- W4367190914 modified "2023-10-14" @default.
- W4367190914 title "Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning" @default.
- W4367190914 doi "https://doi.org/10.48550/arxiv.2304.13653" @default.
- W4367190914 hasPublicationYear "2023" @default.
- W4367190914 type Work @default.
- W4367190914 citedByCount "0" @default.
- W4367190914 crossrefType "posted-content" @default.
- W4367190914 hasAuthorship W4367190914A5000038545 @default.
- W4367190914 hasAuthorship W4367190914A5002747297 @default.
- W4367190914 hasAuthorship W4367190914A5004482443 @default.
- W4367190914 hasAuthorship W4367190914A5005912318 @default.
- W4367190914 hasAuthorship W4367190914A5006431582 @default.
- W4367190914 hasAuthorship W4367190914A5008259889 @default.
- W4367190914 hasAuthorship W4367190914A5010509017 @default.
- W4367190914 hasAuthorship W4367190914A5010818667 @default.
- W4367190914 hasAuthorship W4367190914A5014567358 @default.
- W4367190914 hasAuthorship W4367190914A5018196238 @default.
- W4367190914 hasAuthorship W4367190914A5028236088 @default.
- W4367190914 hasAuthorship W4367190914A5030073861 @default.
- W4367190914 hasAuthorship W4367190914A5031943811 @default.
- W4367190914 hasAuthorship W4367190914A5037305533 @default.
- W4367190914 hasAuthorship W4367190914A5039840350 @default.
- W4367190914 hasAuthorship W4367190914A5054501100 @default.
- W4367190914 hasAuthorship W4367190914A5058953211 @default.
- W4367190914 hasAuthorship W4367190914A5062886897 @default.
- W4367190914 hasAuthorship W4367190914A5062951341 @default.
- W4367190914 hasAuthorship W4367190914A5063532849 @default.
- W4367190914 hasAuthorship W4367190914A5065489996 @default.
- W4367190914 hasAuthorship W4367190914A5071434961 @default.
- W4367190914 hasAuthorship W4367190914A5074380892 @default.
- W4367190914 hasAuthorship W4367190914A5077653936 @default.
- W4367190914 hasAuthorship W4367190914A5078419629 @default.
- W4367190914 hasAuthorship W4367190914A5079415139 @default.
- W4367190914 hasAuthorship W4367190914A5082738872 @default.
- W4367190914 hasAuthorship W4367190914A5090567851 @default.
- W4367190914 hasBestOaLocation W43671909141 @default.
- W4367190914 hasConcept C107457646 @default.
- W4367190914 hasConcept C115903868 @default.
- W4367190914 hasConcept C127413603 @default.
- W4367190914 hasConcept C14185376 @default.
- W4367190914 hasConcept C154945302 @default.
- W4367190914 hasConcept C207451115 @default.
- W4367190914 hasConcept C41008148 @default.
- W4367190914 hasConcept C44154836 @default.
- W4367190914 hasConcept C60692881 @default.
- W4367190914 hasConcept C78519656 @default.
- W4367190914 hasConcept C90509273 @default.
- W4367190914 hasConcept C97541855 @default.
- W4367190914 hasConceptScore W4367190914C107457646 @default.
- W4367190914 hasConceptScore W4367190914C115903868 @default.
- W4367190914 hasConceptScore W4367190914C127413603 @default.
- W4367190914 hasConceptScore W4367190914C14185376 @default.
- W4367190914 hasConceptScore W4367190914C154945302 @default.
- W4367190914 hasConceptScore W4367190914C207451115 @default.
- W4367190914 hasConceptScore W4367190914C41008148 @default.
- W4367190914 hasConceptScore W4367190914C44154836 @default.
- W4367190914 hasConceptScore W4367190914C60692881 @default.
- W4367190914 hasConceptScore W4367190914C78519656 @default.
- W4367190914 hasConceptScore W4367190914C90509273 @default.
- W4367190914 hasConceptScore W4367190914C97541855 @default.
- W4367190914 hasLocation W43671909141 @default.
- W4367190914 hasOpenAccess W4367190914 @default.
- W4367190914 hasPrimaryLocation W43671909141 @default.
- W4367190914 hasRelatedWork W1566684879 @default.
- W4367190914 hasRelatedWork W1804325819 @default.
- W4367190914 hasRelatedWork W2016588944 @default.
- W4367190914 hasRelatedWork W2031129854 @default.
- W4367190914 hasRelatedWork W2085492181 @default.
- W4367190914 hasRelatedWork W2113151339 @default.