Matches in SemOpenAlex for { <https://semopenalex.org/work/W3206746211> ?p ?o ?g. }
- W3206746211 abstract "We present a hierarchical planning and control framework that enables an agent to perform various tasks and adapt to a new task flexibly. Rather than learning an individual policy for each particular task, the proposed framework, DISH, distills a hierarchical policy from a set of tasks by representation and reinforcement learning. The framework is based on the idea of latent variable models that represent high-dimensional observations using low-dimensional latent variables. The resulting policy consists of two levels of hierarchy: (i) a planning module that reasons about a sequence of latent intentions that would lead to an optimistic future and (ii) a feedback control policy, shared across the tasks, that executes the inferred intention. Because the planning is performed in low-dimensional latent space, the learned policy can immediately be used to solve or adapt to new tasks without additional training. We demonstrate that the proposed framework can learn compact representations (3- and 1-dimensional latent states and commands for a humanoid with 197- and 36-dimensional state features and actions) while solving a small number of imitation tasks, and that the resulting policy is directly applicable to other types of tasks, e.g., navigation in cluttered environments." @default.
- W3206746211 created "2021-10-25" @default.
- W3206746211 creator A5003306942 @default.
- W3206746211 creator A5012265266 @default.
- W3206746211 creator A5048997069 @default.
- W3206746211 creator A5049291168 @default.
- W3206746211 creator A5090276849 @default.
- W3206746211 date "2021-05-30" @default.
- W3206746211 modified "2023-10-03" @default.
- W3206746211 title "Distilling a Hierarchical Policy for Planning and Control via Representation and Reinforcement Learning" @default.
- W3206746211 cites W1637456626 @default.
- W3206746211 cites W1752690783 @default.
- W3206746211 cites W1959608418 @default.
- W3206746211 cites W2018705428 @default.
- W3206746211 cites W2029776687 @default.
- W3206746211 cites W2098774185 @default.
- W3206746211 cites W2107464055 @default.
- W3206746211 cites W2145339207 @default.
- W3206746211 cites W2155772159 @default.
- W3206746211 cites W2188365844 @default.
- W3206746211 cites W2296360731 @default.
- W3206746211 cites W2736601468 @default.
- W3206746211 cites W2738778707 @default.
- W3206746211 cites W2766447205 @default.
- W3206746211 cites W2785342287 @default.
- W3206746211 cites W2788904251 @default.
- W3206746211 cites W2796290181 @default.
- W3206746211 cites W2799151646 @default.
- W3206746211 cites W2803281228 @default.
- W3206746211 cites W2890208753 @default.
- W3206746211 cites W2900152462 @default.
- W3206746211 cites W2902711054 @default.
- W3206746211 cites W2918645572 @default.
- W3206746211 cites W2920362155 @default.
- W3206746211 cites W2954142106 @default.
- W3206746211 cites W2963166838 @default.
- W3206746211 cites W2963286043 @default.
- W3206746211 cites W2963344681 @default.
- W3206746211 cites W2963414638 @default.
- W3206746211 cites W2963438456 @default.
- W3206746211 cites W2963722050 @default.
- W3206746211 cites W2963960193 @default.
- W3206746211 cites W2964077562 @default.
- W3206746211 cites W2964232608 @default.
- W3206746211 cites W2964338130 @default.
- W3206746211 cites W2970990801 @default.
- W3206746211 cites W3031840745 @default.
- W3206746211 cites W3034888459 @default.
- W3206746211 cites W3093010610 @default.
- W3206746211 cites W3103834977 @default.
- W3206746211 cites W2882052894 @default.
- W3206746211 doi "https://doi.org/10.1109/icra48506.2021.9561017" @default.
- W3206746211 hasPublicationYear "2021" @default.
- W3206746211 type Work @default.
- W3206746211 sameAs 3206746211 @default.
- W3206746211 citedByCount "0" @default.
- W3206746211 crossrefType "proceedings-article" @default.
- W3206746211 hasAuthorship W3206746211A5003306942 @default.
- W3206746211 hasAuthorship W3206746211A5012265266 @default.
- W3206746211 hasAuthorship W3206746211A5048997069 @default.
- W3206746211 hasAuthorship W3206746211A5049291168 @default.
- W3206746211 hasAuthorship W3206746211A5090276849 @default.
- W3206746211 hasBestOaLocation W32067462112 @default.
- W3206746211 hasConcept C105795698 @default.
- W3206746211 hasConcept C107457646 @default.
- W3206746211 hasConcept C119857082 @default.
- W3206746211 hasConcept C126388530 @default.
- W3206746211 hasConcept C127413603 @default.
- W3206746211 hasConcept C154945302 @default.
- W3206746211 hasConcept C15744967 @default.
- W3206746211 hasConcept C162324750 @default.
- W3206746211 hasConcept C177264268 @default.
- W3206746211 hasConcept C17744445 @default.
- W3206746211 hasConcept C199360897 @default.
- W3206746211 hasConcept C199539241 @default.
- W3206746211 hasConcept C201995342 @default.
- W3206746211 hasConcept C2775924081 @default.
- W3206746211 hasConcept C2776359362 @default.
- W3206746211 hasConcept C2778112365 @default.
- W3206746211 hasConcept C2780451532 @default.
- W3206746211 hasConcept C31170391 @default.
- W3206746211 hasConcept C33923547 @default.
- W3206746211 hasConcept C34447519 @default.
- W3206746211 hasConcept C41008148 @default.
- W3206746211 hasConcept C51167844 @default.
- W3206746211 hasConcept C54355233 @default.
- W3206746211 hasConcept C72434380 @default.
- W3206746211 hasConcept C77805123 @default.
- W3206746211 hasConcept C86803240 @default.
- W3206746211 hasConcept C94625758 @default.
- W3206746211 hasConcept C97541855 @default.
- W3206746211 hasConceptScore W3206746211C105795698 @default.
- W3206746211 hasConceptScore W3206746211C107457646 @default.
- W3206746211 hasConceptScore W3206746211C119857082 @default.
- W3206746211 hasConceptScore W3206746211C126388530 @default.
- W3206746211 hasConceptScore W3206746211C127413603 @default.
- W3206746211 hasConceptScore W3206746211C154945302 @default.
- W3206746211 hasConceptScore W3206746211C15744967 @default.
- W3206746211 hasConceptScore W3206746211C162324750 @default.
- W3206746211 hasConceptScore W3206746211C177264268 @default.