Matches in SemOpenAlex for { <https://semopenalex.org/work/W2980016056> ?p ?o ?g. }
- W2980016056 abstract "Humans are masters at quickly learning many complex tasks, relying on an approximate understanding of the dynamics of their environments. In much the same way, we would like our learning agents to quickly adapt to new tasks. In this paper, we explore how model-based Reinforcement Learning (RL) can facilitate transfer to new tasks. We develop an algorithm that learns an action-conditional, predictive model of expected future observations, rewards and values from which a policy can be derived by following the gradient of the estimated value along imagined trajectories. We show how robust policy optimization can be achieved in robot manipulation tasks even with approximate models that are learned directly from vision and proprioception. We evaluate the efficacy of our approach in a transfer learning scenario, re-using previously learned models on tasks with different reward structures and visual distractors, and show a significant improvement in learning speed compared to strong off-policy baselines. Videos with results can be found at this https URL" @default.
- W2980016056 created "2019-10-18" @default.
- W2980016056 creator A5007133617 @default.
- W2980016056 creator A5017985443 @default.
- W2980016056 creator A5018196238 @default.
- W2980016056 creator A5037305533 @default.
- W2980016056 creator A5041323275 @default.
- W2980016056 creator A5053312475 @default.
- W2980016056 creator A5054636066 @default.
- W2980016056 creator A5062951341 @default.
- W2980016056 creator A5065489996 @default.
- W2980016056 date "2019-10-09" @default.
- W2980016056 modified "2023-09-27" @default.
- W2980016056 title "Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models" @default.
- W2980016056 cites W1491843047 @default.
- W2980016056 cites W155952036 @default.
- W2980016056 cites W1909320841 @default.
- W2980016056 cites W1957496711 @default.
- W2980016056 cites W1959608418 @default.
- W2980016056 cites W2012587148 @default.
- W2980016056 cites W2119717200 @default.
- W2980016056 cites W2140135625 @default.
- W2980016056 cites W2145339207 @default.
- W2980016056 cites W2173248099 @default.
- W2980016056 cites W2176412452 @default.
- W2980016056 cites W2290354866 @default.
- W2980016056 cites W2401231614 @default.
- W2980016056 cites W2604763608 @default.
- W2980016056 cites W2726187156 @default.
- W2980016056 cites W2736601468 @default.
- W2980016056 cites W2738669288 @default.
- W2980016056 cites W2738804062 @default.
- W2980016056 cites W2785342287 @default.
- W2980016056 cites W2785738552 @default.
- W2980016056 cites W2786036274 @default.
- W2980016056 cites W2789824229 @default.
- W2980016056 cites W2794752562 @default.
- W2980016056 cites W2885163910 @default.
- W2980016056 cites W2889347284 @default.
- W2980016056 cites W2889732123 @default.
- W2980016056 cites W2892230114 @default.
- W2980016056 cites W2900152462 @default.
- W2980016056 cites W2902125520 @default.
- W2980016056 cites W2902286283 @default.
- W2980016056 cites W2904246096 @default.
- W2980016056 cites W2905822515 @default.
- W2980016056 cites W2920362155 @default.
- W2980016056 cites W2921528247 @default.
- W2980016056 cites W2944892105 @default.
- W2980016056 cites W2962717849 @default.
- W2980016056 cites W2962804251 @default.
- W2980016056 cites W2962974944 @default.
- W2980016056 cites W2963286043 @default.
- W2980016056 cites W2963430173 @default.
- W2980016056 cites W2963484919 @default.
- W2980016056 cites W2963498534 @default.
- W2980016056 cites W2963521487 @default.
- W2980016056 cites W2964006217 @default.
- W2980016056 cites W2964043796 @default.
- W2980016056 cites W2964093801 @default.
- W2980016056 cites W2964118020 @default.
- W2980016056 cites W2964121744 @default.
- W2980016056 cites W2964161785 @default.
- W2980016056 cites W2964220198 @default.
- W2980016056 cites W2968340082 @default.
- W2980016056 hasPublicationYear "2019" @default.
- W2980016056 type Work @default.
- W2980016056 sameAs 2980016056 @default.
- W2980016056 citedByCount "4" @default.
- W2980016056 countsByYear W29800160562019 @default.
- W2980016056 countsByYear W29800160562020 @default.
- W2980016056 countsByYear W29800160562021 @default.
- W2980016056 crossrefType "posted-content" @default.
- W2980016056 hasAuthorship W2980016056A5007133617 @default.
- W2980016056 hasAuthorship W2980016056A5017985443 @default.
- W2980016056 hasAuthorship W2980016056A5018196238 @default.
- W2980016056 hasAuthorship W2980016056A5037305533 @default.
- W2980016056 hasAuthorship W2980016056A5041323275 @default.
- W2980016056 hasAuthorship W2980016056A5053312475 @default.
- W2980016056 hasAuthorship W2980016056A5054636066 @default.
- W2980016056 hasAuthorship W2980016056A5062951341 @default.
- W2980016056 hasAuthorship W2980016056A5065489996 @default.
- W2980016056 hasConcept C119857082 @default.
- W2980016056 hasConcept C121332964 @default.
- W2980016056 hasConcept C145912823 @default.
- W2980016056 hasConcept C150899416 @default.
- W2980016056 hasConcept C154945302 @default.
- W2980016056 hasConcept C15744967 @default.
- W2980016056 hasConcept C19417346 @default.
- W2980016056 hasConcept C2776291640 @default.
- W2980016056 hasConcept C2779436431 @default.
- W2980016056 hasConcept C2780791683 @default.
- W2980016056 hasConcept C41008148 @default.
- W2980016056 hasConcept C62520636 @default.
- W2980016056 hasConcept C97541855 @default.
- W2980016056 hasConceptScore W2980016056C119857082 @default.
- W2980016056 hasConceptScore W2980016056C121332964 @default.
- W2980016056 hasConceptScore W2980016056C145912823 @default.
- W2980016056 hasConceptScore W2980016056C150899416 @default.
- W2980016056 hasConceptScore W2980016056C154945302 @default.