Matches in SemOpenAlex for { <https://semopenalex.org/work/W3008408456> ?p ?o ?g. }
- W3008408456 abstract "Gradient-based meta-learners such as Model-Agnostic Meta-Learning (MAML) have shown strong few-shot performance in supervised and reinforcement learning settings. However, specifically in the case of meta-reinforcement learning (meta-RL), we can show that gradient-based meta-learners are sensitive to task distributions. With the wrong curriculum, agents suffer the effects of meta-overfitting, shallow adaptation, and adaptation instability. In this work, we begin by highlighting intriguing failure cases of gradient-based meta-RL and show that task distributions can wildly affect algorithmic outputs, stability, and performance. To address this problem, we leverage insights from recent literature on domain randomization and propose meta Active Domain Randomization (meta-ADR), which learns a curriculum of tasks for gradient-based meta-RL in a similar manner as ADR does for sim2real transfer. We show that this approach induces more stable policies on a variety of simulated locomotion and navigation tasks. We assess in- and out-of-distribution generalization and find that the learned task distributions, even in an unstructured task space, greatly improve the adaptation performance of MAML. Finally, we motivate the need for better benchmarking in meta-RL that prioritizes generalization over single-task adaptation performance." @default.
- W3008408456 created "2020-03-06" @default.
- W3008408456 creator A5020127805 @default.
- W3008408456 creator A5037065865 @default.
- W3008408456 creator A5040523178 @default.
- W3008408456 creator A5074204532 @default.
- W3008408456 creator A5075885606 @default.
- W3008408456 date "2020-02-19" @default.
- W3008408456 modified "2023-09-27" @default.
- W3008408456 title "Curriculum in Gradient-Based Meta-Reinforcement Learning." @default.
- W3008408456 cites W2119717200 @default.
- W3008408456 cites W2121863487 @default.
- W3008408456 cites W2296073425 @default.
- W3008408456 cites W2472819217 @default.
- W3008408456 cites W2550182557 @default.
- W3008408456 cites W2601450892 @default.
- W3008408456 cites W2604763608 @default.
- W3008408456 cites W2605102758 @default.
- W3008408456 cites W2606757878 @default.
- W3008408456 cites W2726187156 @default.
- W3008408456 cites W2737215407 @default.
- W3008408456 cites W2753160622 @default.
- W3008408456 cites W2785397462 @default.
- W3008408456 cites W2787501667 @default.
- W3008408456 cites W2788904251 @default.
- W3008408456 cites W2896092084 @default.
- W3008408456 cites W2902723124 @default.
- W3008408456 cites W2908460759 @default.
- W3008408456 cites W2922007426 @default.
- W3008408456 cites W2923504512 @default.
- W3008408456 cites W2947469668 @default.
- W3008408456 cites W2952765942 @default.
- W3008408456 cites W2963341924 @default.
- W3008408456 cites W2964327384 @default.
- W3008408456 cites W2981030070 @default.
- W3008408456 cites W2981344907 @default.
- W3008408456 cites W2982795998 @default.
- W3008408456 cites W3030141984 @default.
- W3008408456 cites W3032377877 @default.
- W3008408456 cites W3037856073 @default.
- W3008408456 cites W3091905774 @default.
- W3008408456 cites W64088143 @default.
- W3008408456 hasPublicationYear "2020" @default.
- W3008408456 type Work @default.
- W3008408456 sameAs 3008408456 @default.
- W3008408456 citedByCount "2" @default.
- W3008408456 countsByYear W30084084562021 @default.
- W3008408456 crossrefType "posted-content" @default.
- W3008408456 hasAuthorship W3008408456A5020127805 @default.
- W3008408456 hasAuthorship W3008408456A5037065865 @default.
- W3008408456 hasAuthorship W3008408456A5040523178 @default.
- W3008408456 hasAuthorship W3008408456A5074204532 @default.
- W3008408456 hasAuthorship W3008408456A5075885606 @default.
- W3008408456 hasConcept C112972136 @default.
- W3008408456 hasConcept C119857082 @default.
- W3008408456 hasConcept C134306372 @default.
- W3008408456 hasConcept C139807058 @default.
- W3008408456 hasConcept C153083717 @default.
- W3008408456 hasConcept C154945302 @default.
- W3008408456 hasConcept C15744967 @default.
- W3008408456 hasConcept C162324750 @default.
- W3008408456 hasConcept C169760540 @default.
- W3008408456 hasConcept C177148314 @default.
- W3008408456 hasConcept C187736073 @default.
- W3008408456 hasConcept C22019652 @default.
- W3008408456 hasConcept C2780451532 @default.
- W3008408456 hasConcept C2781002164 @default.
- W3008408456 hasConcept C33923547 @default.
- W3008408456 hasConcept C41008148 @default.
- W3008408456 hasConcept C50644808 @default.
- W3008408456 hasConcept C97541855 @default.
- W3008408456 hasConceptScore W3008408456C112972136 @default.
- W3008408456 hasConceptScore W3008408456C119857082 @default.
- W3008408456 hasConceptScore W3008408456C134306372 @default.
- W3008408456 hasConceptScore W3008408456C139807058 @default.
- W3008408456 hasConceptScore W3008408456C153083717 @default.
- W3008408456 hasConceptScore W3008408456C154945302 @default.
- W3008408456 hasConceptScore W3008408456C15744967 @default.
- W3008408456 hasConceptScore W3008408456C162324750 @default.
- W3008408456 hasConceptScore W3008408456C169760540 @default.
- W3008408456 hasConceptScore W3008408456C177148314 @default.
- W3008408456 hasConceptScore W3008408456C187736073 @default.
- W3008408456 hasConceptScore W3008408456C22019652 @default.
- W3008408456 hasConceptScore W3008408456C2780451532 @default.
- W3008408456 hasConceptScore W3008408456C2781002164 @default.
- W3008408456 hasConceptScore W3008408456C33923547 @default.
- W3008408456 hasConceptScore W3008408456C41008148 @default.
- W3008408456 hasConceptScore W3008408456C50644808 @default.
- W3008408456 hasConceptScore W3008408456C97541855 @default.
- W3008408456 hasLocation W30084084561 @default.
- W3008408456 hasOpenAccess W3008408456 @default.
- W3008408456 hasPrimaryLocation W30084084561 @default.
- W3008408456 hasRelatedWork W2891227010 @default.
- W3008408456 hasRelatedWork W2896092084 @default.
- W3008408456 hasRelatedWork W2902723124 @default.
- W3008408456 hasRelatedWork W2953292661 @default.
- W3008408456 hasRelatedWork W2967645217 @default.
- W3008408456 hasRelatedWork W2971218263 @default.
- W3008408456 hasRelatedWork W2995444997 @default.
- W3008408456 hasRelatedWork W2996148148 @default.