Matches in SemOpenAlex for { <https://semopenalex.org/work/W2979605429> ?p ?o ?g. }
- W2979605429 abstract "Model-based reinforcement learning could enable sample-efficient learning by quickly acquiring rich knowledge about the world and using it to improve behaviour without additional data. Learned dynamics models can be directly used for planning actions but this has been challenging because of inaccuracies in the learned models. In this paper, we focus on planning with learned dynamics models and propose to regularize it using energy estimates of state transitions in the environment. We visually demonstrate the effectiveness of the proposed method and show that off-policy training of an energy estimator can be effectively used to regularize planning with pre-trained dynamics models. Further, we demonstrate that the proposed method enables sample-efficient learning to achieve competitive performance in challenging continuous control tasks such as Half-cheetah and Ant in just a few minutes of experience." @default.
- W2979605429 created "2019-10-18" @default.
- W2979605429 creator A5057931031 @default.
- W2979605429 creator A5061369039 @default.
- W2979605429 creator A5065850081 @default.
- W2979605429 date "2019-10-12" @default.
- W2979605429 modified "2023-09-25" @default.
- W2979605429 title "Regularizing Model-Based Planning with Energy-Based Models." @default.
- W2979605429 cites W1491843047 @default.
- W2979605429 cites W1590636096 @default.
- W2979605429 cites W1959608418 @default.
- W2979605429 cites W2013035813 @default.
- W2979605429 cites W2087617385 @default.
- W2979605429 cites W2104733512 @default.
- W2979605429 cites W2140135625 @default.
- W2979605429 cites W2145339207 @default.
- W2979605429 cites W2212271491 @default.
- W2979605429 cites W2416041116 @default.
- W2979605429 cites W2614634292 @default.
- W2979605429 cites W2614839826 @default.
- W2979605429 cites W2736601468 @default.
- W2979605429 cites W2766447205 @default.
- W2979605429 cites W2804596319 @default.
- W2979605429 cites W2804821933 @default.
- W2979605429 cites W2890208753 @default.
- W2979605429 cites W2892230114 @default.
- W2979605429 cites W2900152462 @default.
- W2979605429 cites W2922772346 @default.
- W2979605429 cites W2962872206 @default.
- W2979605429 cites W2962902376 @default.
- W2979605429 cites W2962974944 @default.
- W2979605429 cites W2963146015 @default.
- W2979605429 cites W2963355572 @default.
- W2979605429 cites W2963641140 @default.
- W2979605429 cites W2963846183 @default.
- W2979605429 cites W2963923407 @default.
- W2979605429 cites W2963960193 @default.
- W2979605429 cites W2971134205 @default.
- W2979605429 cites W2997570648 @default.
- W2979605429 cites W3030947604 @default.
- W2979605429 hasPublicationYear "2019" @default.
- W2979605429 type Work @default.
- W2979605429 sameAs 2979605429 @default.
- W2979605429 citedByCount "1" @default.
- W2979605429 countsByYear W29796054292022 @default.
- W2979605429 crossrefType "posted-content" @default.
- W2979605429 hasAuthorship W2979605429A5057931031 @default.
- W2979605429 hasAuthorship W2979605429A5061369039 @default.
- W2979605429 hasAuthorship W2979605429A5065850081 @default.
- W2979605429 hasConcept C105795698 @default.
- W2979605429 hasConcept C11413529 @default.
- W2979605429 hasConcept C119857082 @default.
- W2979605429 hasConcept C120665830 @default.
- W2979605429 hasConcept C121332964 @default.
- W2979605429 hasConcept C154945302 @default.
- W2979605429 hasConcept C185429906 @default.
- W2979605429 hasConcept C185592680 @default.
- W2979605429 hasConcept C186370098 @default.
- W2979605429 hasConcept C192209626 @default.
- W2979605429 hasConcept C198531522 @default.
- W2979605429 hasConcept C2775924081 @default.
- W2979605429 hasConcept C2984536560 @default.
- W2979605429 hasConcept C33923547 @default.
- W2979605429 hasConcept C41008148 @default.
- W2979605429 hasConcept C43617362 @default.
- W2979605429 hasConcept C48103436 @default.
- W2979605429 hasConcept C97541855 @default.
- W2979605429 hasConceptScore W2979605429C105795698 @default.
- W2979605429 hasConceptScore W2979605429C11413529 @default.
- W2979605429 hasConceptScore W2979605429C119857082 @default.
- W2979605429 hasConceptScore W2979605429C120665830 @default.
- W2979605429 hasConceptScore W2979605429C121332964 @default.
- W2979605429 hasConceptScore W2979605429C154945302 @default.
- W2979605429 hasConceptScore W2979605429C185429906 @default.
- W2979605429 hasConceptScore W2979605429C185592680 @default.
- W2979605429 hasConceptScore W2979605429C186370098 @default.
- W2979605429 hasConceptScore W2979605429C192209626 @default.
- W2979605429 hasConceptScore W2979605429C198531522 @default.
- W2979605429 hasConceptScore W2979605429C2775924081 @default.
- W2979605429 hasConceptScore W2979605429C2984536560 @default.
- W2979605429 hasConceptScore W2979605429C33923547 @default.
- W2979605429 hasConceptScore W2979605429C41008148 @default.
- W2979605429 hasConceptScore W2979605429C43617362 @default.
- W2979605429 hasConceptScore W2979605429C48103436 @default.
- W2979605429 hasConceptScore W2979605429C97541855 @default.
- W2979605429 hasLocation W29796054291 @default.
- W2979605429 hasOpenAccess W2979605429 @default.
- W2979605429 hasPrimaryLocation W29796054291 @default.
- W2979605429 hasRelatedWork W2124674171 @default.
- W2979605429 hasRelatedWork W2198225532 @default.
- W2979605429 hasRelatedWork W2295763688 @default.
- W2979605429 hasRelatedWork W2770014065 @default.
- W2979605429 hasRelatedWork W2796364651 @default.
- W2979605429 hasRelatedWork W2892978054 @default.
- W2979605429 hasRelatedWork W2947525798 @default.
- W2979605429 hasRelatedWork W2953084784 @default.
- W2979605429 hasRelatedWork W2968854004 @default.
- W2979605429 hasRelatedWork W3015442441 @default.
- W2979605429 hasRelatedWork W3026304494 @default.
- W2979605429 hasRelatedWork W3034552332 @default.