Matches in SemOpenAlex for { <https://semopenalex.org/work/W3035216917> ?p ?o ?g. }
- W3035216917 abstract "Reinforcement learning algorithms can acquire policies for complex tasks autonomously. However, the number of samples required to learn a diverse set of skills can be prohibitively large. While meta-reinforcement learning methods have enabled agents to leverage prior experience to adapt quickly to new tasks, their performance depends crucially on how close the new task is to the previously experienced tasks. Current approaches are either not able to extrapolate well, or can do so at the expense of requiring extremely large amounts of data for on-policy meta-training. In this work, we present model identification and experience relabeling (MIER), a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time. Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data, more easily than policies and value functions. These dynamics models can then be used to continue training policies and value functions for out-of-distribution tasks without using meta-reinforcement learning at all, by generating synthetic experience for the new task." @default.
- W3035216917 created "2020-06-19" @default.
- W3035216917 creator A5005431772 @default.
- W3035216917 creator A5026322200 @default.
- W3035216917 creator A5075935964 @default.
- W3035216917 creator A5081645573 @default.
- W3035216917 date "2021-05-04" @default.
- W3035216917 modified "2023-09-25" @default.
- W3035216917 title "Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling" @default.
- W3035216917 cites W1542791059 @default.
- W3035216917 cites W1594201624 @default.
- W3035216917 cites W1980035368 @default.
- W3035216917 cites W2123399796 @default.
- W3035216917 cites W2123459560 @default.
- W3035216917 cites W2140804329 @default.
- W3035216917 cites W2145339207 @default.
- W3035216917 cites W2158782408 @default.
- W3035216917 cites W2550182557 @default.
- W3035216917 cites W2578206533 @default.
- W3035216917 cites W2604763608 @default.
- W3035216917 cites W2726717203 @default.
- W3035216917 cites W2781726626 @default.
- W3035216917 cites W2785397462 @default.
- W3035216917 cites W2787387965 @default.
- W3035216917 cites W2788904251 @default.
- W3035216917 cites W2790355818 @default.
- W3035216917 cites W2884901161 @default.
- W3035216917 cites W2906926620 @default.
- W3035216917 cites W2923504512 @default.
- W3035216917 cites W2944900455 @default.
- W3035216917 cites W2945020056 @default.
- W3035216917 cites W2949608212 @default.
- W3035216917 cites W2951881474 @default.
- W3035216917 cites W2952003143 @default.
- W3035216917 cites W2962974944 @default.
- W3035216917 cites W2963371846 @default.
- W3035216917 cites W2963581679 @default.
- W3035216917 cites W2964161785 @default.
- W3035216917 cites W2971014752 @default.
- W3035216917 cites W3102923851 @default.
- W3035216917 cites W3158726102 @default.
- W3035216917 cites W567721252 @default.
- W3035216917 hasPublicationYear "2021" @default.
- W3035216917 type Work @default.
- W3035216917 sameAs 3035216917 @default.
- W3035216917 citedByCount "9" @default.
- W3035216917 countsByYear W30352169172020 @default.
- W3035216917 countsByYear W30352169172021 @default.
- W3035216917 countsByYear W30352169172022 @default.
- W3035216917 crossrefType "journal-article" @default.
- W3035216917 hasAuthorship W3035216917A5005431772 @default.
- W3035216917 hasAuthorship W3035216917A5026322200 @default.
- W3035216917 hasAuthorship W3035216917A5075935964 @default.
- W3035216917 hasAuthorship W3035216917A5081645573 @default.
- W3035216917 hasConcept C116834253 @default.
- W3035216917 hasConcept C119857082 @default.
- W3035216917 hasConcept C127413603 @default.
- W3035216917 hasConcept C153083717 @default.
- W3035216917 hasConcept C154945302 @default.
- W3035216917 hasConcept C177264268 @default.
- W3035216917 hasConcept C199360897 @default.
- W3035216917 hasConcept C201995342 @default.
- W3035216917 hasConcept C2780451532 @default.
- W3035216917 hasConcept C41008148 @default.
- W3035216917 hasConcept C59822182 @default.
- W3035216917 hasConcept C66938386 @default.
- W3035216917 hasConcept C67203356 @default.
- W3035216917 hasConcept C86803240 @default.
- W3035216917 hasConcept C97541855 @default.
- W3035216917 hasConceptScore W3035216917C116834253 @default.
- W3035216917 hasConceptScore W3035216917C119857082 @default.
- W3035216917 hasConceptScore W3035216917C127413603 @default.
- W3035216917 hasConceptScore W3035216917C153083717 @default.
- W3035216917 hasConceptScore W3035216917C154945302 @default.
- W3035216917 hasConceptScore W3035216917C177264268 @default.
- W3035216917 hasConceptScore W3035216917C199360897 @default.
- W3035216917 hasConceptScore W3035216917C201995342 @default.
- W3035216917 hasConceptScore W3035216917C2780451532 @default.
- W3035216917 hasConceptScore W3035216917C41008148 @default.
- W3035216917 hasConceptScore W3035216917C59822182 @default.
- W3035216917 hasConceptScore W3035216917C66938386 @default.
- W3035216917 hasConceptScore W3035216917C67203356 @default.
- W3035216917 hasConceptScore W3035216917C86803240 @default.
- W3035216917 hasConceptScore W3035216917C97541855 @default.
- W3035216917 hasLocation W30352169171 @default.
- W3035216917 hasOpenAccess W3035216917 @default.
- W3035216917 hasPrimaryLocation W30352169171 @default.
- W3035216917 hasRelatedWork W2550182557 @default.
- W3035216917 hasRelatedWork W2604763608 @default.
- W3035216917 hasRelatedWork W2800367501 @default.
- W3035216917 hasRelatedWork W2950197980 @default.
- W3035216917 hasRelatedWork W2951881474 @default.
- W3035216917 hasRelatedWork W2952193948 @default.
- W3035216917 hasRelatedWork W2952526277 @default.
- W3035216917 hasRelatedWork W2963199420 @default.
- W3035216917 hasRelatedWork W2963581679 @default.
- W3035216917 hasRelatedWork W2971014752 @default.
- W3035216917 hasRelatedWork W2981344907 @default.
- W3035216917 hasRelatedWork W2999490157 @default.
- W3035216917 hasRelatedWork W3009245728 @default.
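
Each statement above follows the same shape: a leading dash, the subject, a predicate, an object (either a quoted literal or a bare identifier), and a graph marker such as `@default.`. As a minimal sketch, assuming that shape holds for every line, the listing could be read into tuples and grouped by predicate as below. The names `LINE_RE`, `parse_line`, and `group_by_predicate` are illustrative and are not part of SemOpenAlex or any of its tooling.

```python
import re
from collections import defaultdict

# Assumed line shape: "- <subject> <predicate> <object> @<graph>."
# The object is either a double-quoted literal or a bare token.
LINE_RE = re.compile(r'^- (\S+) (\S+) (?:"(.*)"|(\S+)) @(\S+)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, literal, token, graph = m.groups()
    obj = literal if literal is not None else token
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not match (e.g. the header)."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above, including the header.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W3035216917> ?p ?o ?g. }',
    '- W3035216917 created "2020-06-19" @default.',
    '- W3035216917 creator A5005431772 @default.',
    '- W3035216917 creator A5026322200 @default.',
    '- W3035216917 type Work @default.',
]
grouped = group_by_predicate(sample)
print(grouped["creator"])
```

Grouping by predicate keeps repeated predicates such as `creator`, `cites`, and `hasRelatedWork` together as lists, which suits a listing where most predicates occur many times for the same subject.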