Matches in SemOpenAlex for { <https://semopenalex.org/work/W2918894365> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W2918894365 endingPage "331" @default.
- W2918894365 startingPage "323" @default.
- W2918894365 abstract "Efficiently adapting to new environments and changes in dynamics is critical for agents to successfully operate in the real world. Reinforcement learning (RL) based approaches typically rely on external reward feedback for adaptation. However, in many scenarios this reward signal might not be readily available for the target task, or the difference between the environments can be implicit and only observable from the dynamics. To this end, we introduce a method that allows for self-adaptation of learned policies: No-Reward Meta Learning (NoRML). NoRML extends Model Agnostic Meta Learning (MAML) for RL and uses observable dynamics of the environment instead of an explicit reward function in MAML's finetune step. Our method has a more expressive update step than MAML, while maintaining MAML's gradient based foundation. Additionally, in order to allow more targeted exploration, we implement an extension to MAML that effectively disconnects the meta-policy parameters from the fine-tuned policies' parameters. We first study our method on a number of synthetic control problems and then validate our method on common benchmark environments, showing that NoRML outperforms MAML when the dynamics change between tasks." @default.
- W2918894365 created "2019-03-11" @default.
- W2918894365 creator A5005431772 @default.
- W2918894365 creator A5005904006 @default.
- W2918894365 creator A5015686392 @default.
- W2918894365 creator A5068453232 @default.
- W2918894365 creator A5084870242 @default.
- W2918894365 date "2019-05-08" @default.
- W2918894365 modified "2023-09-25" @default.
- W2918894365 title "NoRML: No-Reward Meta Learning" @default.
- W2918894365 hasPublicationYear "2019" @default.
- W2918894365 type Work @default.
- W2918894365 sameAs 2918894365 @default.
- W2918894365 citedByCount "15" @default.
- W2918894365 countsByYear W29188943652019 @default.
- W2918894365 countsByYear W29188943652020 @default.
- W2918894365 countsByYear W29188943652021 @default.
- W2918894365 countsByYear W29188943652022 @default.
- W2918894365 crossrefType "proceedings-article" @default.
- W2918894365 hasAuthorship W2918894365A5005431772 @default.
- W2918894365 hasAuthorship W2918894365A5005904006 @default.
- W2918894365 hasAuthorship W2918894365A5015686392 @default.
- W2918894365 hasAuthorship W2918894365A5068453232 @default.
- W2918894365 hasAuthorship W2918894365A5084870242 @default.
- W2918894365 hasConcept C119857082 @default.
- W2918894365 hasConcept C120665830 @default.
- W2918894365 hasConcept C121332964 @default.
- W2918894365 hasConcept C127413603 @default.
- W2918894365 hasConcept C13280743 @default.
- W2918894365 hasConcept C139807058 @default.
- W2918894365 hasConcept C14036430 @default.
- W2918894365 hasConcept C154945302 @default.
- W2918894365 hasConcept C185798385 @default.
- W2918894365 hasConcept C201995342 @default.
- W2918894365 hasConcept C205649164 @default.
- W2918894365 hasConcept C2780451532 @default.
- W2918894365 hasConcept C2781002164 @default.
- W2918894365 hasConcept C41008148 @default.
- W2918894365 hasConcept C78458016 @default.
- W2918894365 hasConcept C86803240 @default.
- W2918894365 hasConcept C97541855 @default.
- W2918894365 hasConceptScore W2918894365C119857082 @default.
- W2918894365 hasConceptScore W2918894365C120665830 @default.
- W2918894365 hasConceptScore W2918894365C121332964 @default.
- W2918894365 hasConceptScore W2918894365C127413603 @default.
- W2918894365 hasConceptScore W2918894365C13280743 @default.
- W2918894365 hasConceptScore W2918894365C139807058 @default.
- W2918894365 hasConceptScore W2918894365C14036430 @default.
- W2918894365 hasConceptScore W2918894365C154945302 @default.
- W2918894365 hasConceptScore W2918894365C185798385 @default.
- W2918894365 hasConceptScore W2918894365C201995342 @default.
- W2918894365 hasConceptScore W2918894365C205649164 @default.
- W2918894365 hasConceptScore W2918894365C2780451532 @default.
- W2918894365 hasConceptScore W2918894365C2781002164 @default.
- W2918894365 hasConceptScore W2918894365C41008148 @default.
- W2918894365 hasConceptScore W2918894365C78458016 @default.
- W2918894365 hasConceptScore W2918894365C86803240 @default.
- W2918894365 hasConceptScore W2918894365C97541855 @default.
- W2918894365 hasLocation W29188943651 @default.
- W2918894365 hasOpenAccess W2918894365 @default.
- W2918894365 hasPrimaryLocation W29188943651 @default.
- W2918894365 hasRelatedWork W2145339207 @default.
- W2918894365 hasRelatedWork W2550182557 @default.
- W2918894365 hasRelatedWork W2596367596 @default.
- W2918894365 hasRelatedWork W2601450892 @default.
- W2918894365 hasRelatedWork W2604763608 @default.
- W2918894365 hasRelatedWork W2736601468 @default.
- W2918894365 hasRelatedWork W2787501667 @default.
- W2918894365 hasRelatedWork W2789525339 @default.
- W2918894365 hasRelatedWork W2904167876 @default.
- W2918894365 hasRelatedWork W2915039521 @default.
- W2918894365 hasRelatedWork W2923504512 @default.
- W2918894365 hasRelatedWork W2949929714 @default.
- W2918894365 hasRelatedWork W2962974944 @default.
- W2918894365 hasRelatedWork W2963176272 @default.
- W2918894365 hasRelatedWork W2963371846 @default.
- W2918894365 hasRelatedWork W2963581679 @default.
- W2918894365 hasRelatedWork W2964227899 @default.
- W2918894365 hasRelatedWork W3035021992 @default.
- W2918894365 hasRelatedWork W3091905527 @default.
- W2918894365 hasRelatedWork W3184646938 @default.
- W2918894365 isParatext "false" @default.
- W2918894365 isRetracted "false" @default.
- W2918894365 magId "2918894365" @default.
- W2918894365 workType "article" @default.