Matches in SemOpenAlex for { <https://semopenalex.org/work/W2991616621> ?p ?o ?g. }
- W2991616621 abstract "Planning methods can solve temporally extended sequential decision making problems by composing simple behaviors. However, planning requires suitable abstractions for the states and transitions, which typically need to be designed by hand. In contrast, model-free reinforcement learning (RL) can acquire behaviors from low-level inputs directly, but often struggles with temporally extended tasks. Can we utilize reinforcement learning to automatically form the abstractions needed for planning, thus obtaining the best of both approaches? We show that goal-conditioned policies learned with RL can be incorporated into planning, so that a planner can focus on which states to reach, rather than how those states are reached. However, with complex state observations such as images, not all inputs represent valid states. We therefore also propose using a latent variable model to compactly represent the set of valid states for the planner, so that the policies provide an abstraction of actions, and the latent variable model provides an abstraction of states. We compare our method with planning-based and model-free methods and find that our method significantly outperforms prior work when evaluated on image-based robot navigation and manipulation tasks that require non-greedy, multi-staged behavior." @default.
- W2991616621 created "2019-12-05" @default.
- W2991616621 creator A5001658549 @default.
- W2991616621 creator A5022444244 @default.
- W2991616621 creator A5026322200 @default.
- W2991616621 creator A5044471507 @default.
- W2991616621 date "2019-11-19" @default.
- W2991616621 modified "2023-09-27" @default.
- W2991616621 title "Planning with Goal-Conditioned Policies" @default.
- W2991616621 cites W118144931 @default.
- W2991616621 cites W1491843047 @default.
- W2991616621 cites W1507087299 @default.
- W2991616621 cites W1571530861 @default.
- W2991616621 cites W1594201624 @default.
- W2991616621 cites W1598748993 @default.
- W2991616621 cites W1756110333 @default.
- W2991616621 cites W1959608418 @default.
- W2991616621 cites W1977687214 @default.
- W2991616621 cites W2121517924 @default.
- W2991616621 cites W2132083787 @default.
- W2991616621 cites W2132622533 @default.
- W2991616621 cites W2153244676 @default.
- W2991616621 cites W2160371091 @default.
- W2991616621 cites W2258731934 @default.
- W2991616621 cites W2281096776 @default.
- W2991616621 cites W2594829461 @default.
- W2991616621 cites W2763676071 @default.
- W2991616621 cites W2789008106 @default.
- W2991616621 cites W2796303840 @default.
- W2991616621 cites W2803281228 @default.
- W2991616621 cites W2883471708 @default.
- W2991616621 cites W2886380293 @default.
- W2991616621 cites W2889347284 @default.
- W2991616621 cites W2894991981 @default.
- W2991616621 cites W2897007337 @default.
- W2991616621 cites W2900152462 @default.
- W2991616621 cites W2902125520 @default.
- W2991616621 cites W2913535645 @default.
- W2991616621 cites W2920215304 @default.
- W2991616621 cites W2922007426 @default.
- W2991616621 cites W2951557330 @default.
- W2991616621 cites W2953317238 @default.
- W2991616621 cites W2962820504 @default.
- W2991616621 cites W2962872206 @default.
- W2991616621 cites W2962897886 @default.
- W2991616621 cites W2963125871 @default.
- W2991616621 cites W2963293533 @default.
- W2991616621 cites W2963293881 @default.
- W2991616621 cites W2963321092 @default.
- W2991616621 cites W2963406904 @default.
- W2991616621 cites W2963430173 @default.
- W2991616621 cites W2963435596 @default.
- W2991616621 cites W2963629403 @default.
- W2991616621 cites W2963634205 @default.
- W2991616621 cites W2963794592 @default.
- W2991616621 cites W2963923407 @default.
- W2991616621 cites W2963960193 @default.
- W2991616621 cites W2964001908 @default.
- W2991616621 cites W2964006217 @default.
- W2991616621 cites W2964036701 @default.
- W2991616621 cites W2964220198 @default.
- W2991616621 cites W2964295739 @default.
- W2991616621 cites W2964342357 @default.
- W2991616621 cites W2965662123 @default.
- W2991616621 cites W3029645440 @default.
- W2991616621 cites W3038022805 @default.
- W2991616621 cites W567721252 @default.
- W2991616621 hasPublicationYear "2019" @default.
- W2991616621 type Work @default.
- W2991616621 sameAs 2991616621 @default.
- W2991616621 citedByCount "0" @default.
- W2991616621 crossrefType "posted-content" @default.
- W2991616621 hasAuthorship W2991616621A5001658549 @default.
- W2991616621 hasAuthorship W2991616621A5022444244 @default.
- W2991616621 hasAuthorship W2991616621A5026322200 @default.
- W2991616621 hasAuthorship W2991616621A5044471507 @default.
- W2991616621 hasConcept C111472728 @default.
- W2991616621 hasConcept C119857082 @default.
- W2991616621 hasConcept C120665830 @default.
- W2991616621 hasConcept C121332964 @default.
- W2991616621 hasConcept C124304363 @default.
- W2991616621 hasConcept C134306372 @default.
- W2991616621 hasConcept C138885662 @default.
- W2991616621 hasConcept C154945302 @default.
- W2991616621 hasConcept C177264268 @default.
- W2991616621 hasConcept C182365436 @default.
- W2991616621 hasConcept C192209626 @default.
- W2991616621 hasConcept C199360897 @default.
- W2991616621 hasConcept C2776502983 @default.
- W2991616621 hasConcept C2776999362 @default.
- W2991616621 hasConcept C2780586882 @default.
- W2991616621 hasConcept C33923547 @default.
- W2991616621 hasConcept C41008148 @default.
- W2991616621 hasConcept C51167844 @default.
- W2991616621 hasConcept C90509273 @default.
- W2991616621 hasConcept C97541855 @default.
- W2991616621 hasConceptScore W2991616621C111472728 @default.
- W2991616621 hasConceptScore W2991616621C119857082 @default.
- W2991616621 hasConceptScore W2991616621C120665830 @default.
- W2991616621 hasConceptScore W2991616621C121332964 @default.