Matches in SemOpenAlex for { <https://semopenalex.org/work/W2900926445> ?p ?o ?g. }
- W2900926445 abstract "In hierarchical reinforcement learning a major challenge is determining appropriate low-level policies. We propose an unsupervised learning scheme, based on asymmetric self-play from Sukhbaatar et al. (2018), that automatically learns a good representation of sub-goals in the environment and a low-level policy that can execute them. A high-level policy can then direct the lower one by generating a sequence of continuous sub-goal vectors. We evaluate our model using Mazebase and Mujoco environments, including the challenging AntGather task. Visualizations of the sub-goal embeddings reveal a logical decomposition of tasks within the environment. Quantitatively, our approach obtains compelling performance gains over non-hierarchical approaches." @default.
- W2900926445 created "2018-11-29" @default.
- W2900926445 creator A5045383831 @default.
- W2900926445 creator A5053145694 @default.
- W2900926445 creator A5060255128 @default.
- W2900926445 creator A5089960673 @default.
- W2900926445 date "2018-11-22" @default.
- W2900926445 modified "2023-09-27" @default.
- W2900926445 title "Learning Goal Embeddings via Self-Play for Hierarchical Reinforcement Learning." @default.
- W2900926445 cites W1591713425 @default.
- W2900926445 cites W1863227302 @default.
- W2900926445 cites W2000514530 @default.
- W2900926445 cites W2109910161 @default.
- W2900926445 cites W2119717200 @default.
- W2900926445 cites W2139612737 @default.
- W2900926445 cites W2158782408 @default.
- W2900926445 cites W2181068523 @default.
- W2900926445 cites W2342662072 @default.
- W2900926445 cites W2523728418 @default.
- W2900926445 cites W2736601468 @default.
- W2900926445 cites W2737215407 @default.
- W2900926445 cites W2785342287 @default.
- W2900926445 cites W2786917922 @default.
- W2900926445 cites W2796206912 @default.
- W2900926445 cites W2949267040 @default.
- W2900926445 cites W2963177395 @default.
- W2900926445 cites W2963577640 @default.
- W2900926445 hasPublicationYear "2018" @default.
- W2900926445 type Work @default.
- W2900926445 sameAs 2900926445 @default.
- W2900926445 citedByCount "16" @default.
- W2900926445 countsByYear W29009264452019 @default.
- W2900926445 countsByYear W29009264452020 @default.
- W2900926445 countsByYear W29009264452021 @default.
- W2900926445 countsByYear W29009264452022 @default.
- W2900926445 crossrefType "posted-content" @default.
- W2900926445 hasAuthorship W2900926445A5045383831 @default.
- W2900926445 hasAuthorship W2900926445A5053145694 @default.
- W2900926445 hasAuthorship W2900926445A5060255128 @default.
- W2900926445 hasAuthorship W2900926445A5089960673 @default.
- W2900926445 hasConcept C119857082 @default.
- W2900926445 hasConcept C124101348 @default.
- W2900926445 hasConcept C124681953 @default.
- W2900926445 hasConcept C127413603 @default.
- W2900926445 hasConcept C134306372 @default.
- W2900926445 hasConcept C144986985 @default.
- W2900926445 hasConcept C154945302 @default.
- W2900926445 hasConcept C17744445 @default.
- W2900926445 hasConcept C18903297 @default.
- W2900926445 hasConcept C199539241 @default.
- W2900926445 hasConcept C201995342 @default.
- W2900926445 hasConcept C2776359362 @default.
- W2900926445 hasConcept C2778112365 @default.
- W2900926445 hasConcept C2780451532 @default.
- W2900926445 hasConcept C33923547 @default.
- W2900926445 hasConcept C41008148 @default.
- W2900926445 hasConcept C54355233 @default.
- W2900926445 hasConcept C77618280 @default.
- W2900926445 hasConcept C86803240 @default.
- W2900926445 hasConcept C94625758 @default.
- W2900926445 hasConcept C97541855 @default.
- W2900926445 hasConceptScore W2900926445C119857082 @default.
- W2900926445 hasConceptScore W2900926445C124101348 @default.
- W2900926445 hasConceptScore W2900926445C124681953 @default.
- W2900926445 hasConceptScore W2900926445C127413603 @default.
- W2900926445 hasConceptScore W2900926445C134306372 @default.
- W2900926445 hasConceptScore W2900926445C144986985 @default.
- W2900926445 hasConceptScore W2900926445C154945302 @default.
- W2900926445 hasConceptScore W2900926445C17744445 @default.
- W2900926445 hasConceptScore W2900926445C18903297 @default.
- W2900926445 hasConceptScore W2900926445C199539241 @default.
- W2900926445 hasConceptScore W2900926445C201995342 @default.
- W2900926445 hasConceptScore W2900926445C2776359362 @default.
- W2900926445 hasConceptScore W2900926445C2778112365 @default.
- W2900926445 hasConceptScore W2900926445C2780451532 @default.
- W2900926445 hasConceptScore W2900926445C33923547 @default.
- W2900926445 hasConceptScore W2900926445C41008148 @default.
- W2900926445 hasConceptScore W2900926445C54355233 @default.
- W2900926445 hasConceptScore W2900926445C77618280 @default.
- W2900926445 hasConceptScore W2900926445C86803240 @default.
- W2900926445 hasConceptScore W2900926445C94625758 @default.
- W2900926445 hasConceptScore W2900926445C97541855 @default.
- W2900926445 hasLocation W29009264451 @default.
- W2900926445 hasOpenAccess W2900926445 @default.
- W2900926445 hasPrimaryLocation W29009264451 @default.
- W2900926445 hasRelatedWork W2121863487 @default.
- W2900926445 hasRelatedWork W2145339207 @default.
- W2900926445 hasRelatedWork W2296073425 @default.
- W2900926445 hasRelatedWork W2736601468 @default.
- W2900926445 hasRelatedWork W2751973545 @default.
- W2900926445 hasRelatedWork W2803281228 @default.
- W2900926445 hasRelatedWork W2895192407 @default.
- W2900926445 hasRelatedWork W2903048334 @default.
- W2900926445 hasRelatedWork W2903447353 @default.
- W2900926445 hasRelatedWork W2914261249 @default.
- W2900926445 hasRelatedWork W2920215304 @default.
- W2900926445 hasRelatedWork W2962823158 @default.
- W2900926445 hasRelatedWork W2963262099 @default.
- W2900926445 hasRelatedWork W2963293881 @default.
- W2900926445 hasRelatedWork W2963321092 @default.