Matches in SemOpenAlex for { <https://semopenalex.org/work/W2949571867> ?p ?o ?g. }
- W2949571867 abstract "Model-based reinforcement learning approaches carry the promise of being data efficient. However, due to challenges in learning dynamics models that sufficiently match the real-world dynamics, they struggle to achieve the same asymptotic performance as model-free methods. We propose Model-Based Meta-Policy-Optimization (MB-MPO), an approach that foregoes the strong reliance on accurate learned dynamics models. Using an ensemble of learned dynamic models, MB-MPO meta-learns a policy that can quickly adapt to any model in the ensemble with one policy gradient step. This steers the meta-policy towards internalizing consistent dynamics predictions among the ensemble while shifting the burden of behaving optimally w.r.t. the model discrepancies towards the adaptation step. Our experiments show that MB-MPO is more robust to model imperfections than previous model-based approaches. Finally, we demonstrate that our approach is able to match the asymptotic performance of model-free methods while requiring significantly less experience." @default.
- W2949571867 created "2019-06-27" @default.
- W2949571867 creator A5009037058 @default.
- W2949571867 creator A5012730104 @default.
- W2949571867 creator A5049349154 @default.
- W2949571867 creator A5064529422 @default.
- W2949571867 creator A5073533020 @default.
- W2949571867 creator A5088717202 @default.
- W2949571867 date "2018-09-14" @default.
- W2949571867 modified "2023-09-27" @default.
- W2949571867 title "Model-Based Reinforcement Learning via Meta-Policy Optimization" @default.
- W2949571867 hasPublicationYear "2018" @default.
- W2949571867 type Work @default.
- W2949571867 sameAs 2949571867 @default.
- W2949571867 citedByCount "0" @default.
- W2949571867 crossrefType "posted-content" @default.
- W2949571867 hasAuthorship W2949571867A5009037058 @default.
- W2949571867 hasAuthorship W2949571867A5012730104 @default.
- W2949571867 hasAuthorship W2949571867A5049349154 @default.
- W2949571867 hasAuthorship W2949571867A5064529422 @default.
- W2949571867 hasAuthorship W2949571867A5073533020 @default.
- W2949571867 hasAuthorship W2949571867A5088717202 @default.
- W2949571867 hasConcept C119857082 @default.
- W2949571867 hasConcept C120665830 @default.
- W2949571867 hasConcept C121332964 @default.
- W2949571867 hasConcept C139807058 @default.
- W2949571867 hasConcept C145912823 @default.
- W2949571867 hasConcept C154945302 @default.
- W2949571867 hasConcept C162324750 @default.
- W2949571867 hasConcept C187736073 @default.
- W2949571867 hasConcept C24890656 @default.
- W2949571867 hasConcept C2780451532 @default.
- W2949571867 hasConcept C2781002164 @default.
- W2949571867 hasConcept C41008148 @default.
- W2949571867 hasConcept C97541855 @default.
- W2949571867 hasConceptScore W2949571867C119857082 @default.
- W2949571867 hasConceptScore W2949571867C120665830 @default.
- W2949571867 hasConceptScore W2949571867C121332964 @default.
- W2949571867 hasConceptScore W2949571867C139807058 @default.
- W2949571867 hasConceptScore W2949571867C145912823 @default.
- W2949571867 hasConceptScore W2949571867C154945302 @default.
- W2949571867 hasConceptScore W2949571867C162324750 @default.
- W2949571867 hasConceptScore W2949571867C187736073 @default.
- W2949571867 hasConceptScore W2949571867C24890656 @default.
- W2949571867 hasConceptScore W2949571867C2780451532 @default.
- W2949571867 hasConceptScore W2949571867C2781002164 @default.
- W2949571867 hasConceptScore W2949571867C41008148 @default.
- W2949571867 hasConceptScore W2949571867C97541855 @default.
- W2949571867 hasLocation W29495718671 @default.
- W2949571867 hasOpenAccess W2949571867 @default.
- W2949571867 hasPrimaryLocation W29495718671 @default.
- W2949571867 hasRelatedWork W2132602063 @default.
- W2949571867 hasRelatedWork W2145957964 @default.
- W2949571867 hasRelatedWork W2767328598 @default.
- W2949571867 hasRelatedWork W2789824229 @default.
- W2949571867 hasRelatedWork W2892230114 @default.
- W2949571867 hasRelatedWork W2953030299 @default.
- W2949571867 hasRelatedWork W2989988749 @default.
- W2949571867 hasRelatedWork W2995316301 @default.
- W2949571867 hasRelatedWork W3001528895 @default.
- W2949571867 hasRelatedWork W3024178066 @default.
- W2949571867 hasRelatedWork W3031182738 @default.
- W2949571867 hasRelatedWork W3035389468 @default.
- W2949571867 hasRelatedWork W3037828233 @default.
- W2949571867 hasRelatedWork W3096519373 @default.
- W2949571867 hasRelatedWork W3126916000 @default.
- W2949571867 hasRelatedWork W3130718980 @default.
- W2949571867 hasRelatedWork W3138900711 @default.
- W2949571867 hasRelatedWork W3169796273 @default.
- W2949571867 hasRelatedWork W3177100477 @default.
- W2949571867 hasRelatedWork W3196896000 @default.
- W2949571867 isParatext "false" @default.
- W2949571867 isRetracted "false" @default.
- W2949571867 magId "2949571867" @default.
- W2949571867 workType "article" @default.
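
Each line in the listing above follows the same shape: a subject, a predicate, an object (quoted when it is a literal), and a graph name after `@`. As a minimal sketch, assuming that layout holds for every line, the listing could be grouped by predicate as shown below. The names `parse_line` and `group_by_predicate` are hypothetical helpers introduced for illustration and are not part of any SemOpenAlex API; the regular expression is inferred from the lines shown here, not from a published format specification.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the listing: "- <subject> <predicate> <object> @<graph>."
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\w+)\.\s*$', re.DOTALL)


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line is not a quad."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Objects wrapped in double quotes are treated as literals; drop the quotes.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W2949571867 creator A5009037058 @default.',
    '- W2949571867 creator A5012730104 @default.',
    '- W2949571867 title "Model-Based Reinforcement Learning via Meta-Policy Optimization" @default.',
    '- W2949571867 hasPublicationYear "2018" @default.',
]
record = group_by_predicate(sample)
print(record["title"][0])
print(len(record["creator"]), "creators in the sample")
```

Grouping by predicate keeps repeated properties such as `creator`, `hasConcept`, and `hasRelatedWork` together as lists, while single-valued properties such as `title` end up as one-element lists.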