Matches in SemOpenAlex for { <https://semopenalex.org/work/W4382402725> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4382402725 abstract "Reinforcement Learning (RL) has been successful in various domains like robotics, game playing, and simulation. While RL agents have shown impressive capabilities in their specific tasks, they insufficiently adapt to new tasks. In supervised learning, this adaptation problem is addressed by large-scale pre-training followed by fine-tuning to new down-stream tasks. Recently, pre-training on multiple tasks has been gaining traction in RL. However, fine-tuning a pre-trained model often suffers from catastrophic forgetting, that is, the performance on the pre-training tasks deteriorates when fine-tuning on new tasks. To investigate the catastrophic forgetting phenomenon, we first jointly pre-train a model on datasets from two benchmark suites, namely Meta-World and DMControl. Then, we evaluate and compare a variety of fine-tuning methods prevalent in natural language processing, both in terms of performance on new tasks, and how well performance on pre-training tasks is retained. Our study shows that with most fine-tuning approaches, the performance on pre-training tasks deteriorates significantly. Therefore, we propose a novel method, Learning-to-Modulate (L2M), that avoids the degradation of learned skills by modulating the information flow of the frozen pre-trained model via a learnable modulation pool. Our method achieves state-of-the-art performance on the Continual-World benchmark, while retaining performance on the pre-training tasks. Finally, to aid future research in this area, we release a dataset encompassing 50 Meta-World and 16 DMControl tasks." @default.
- W4382402725 created "2023-06-29" @default.
- W4382402725 creator A5034004608 @default.
- W4382402725 creator A5043910056 @default.
- W4382402725 creator A5052543224 @default.
- W4382402725 creator A5053148274 @default.
- W4382402725 creator A5070276097 @default.
- W4382402725 date "2023-06-26" @default.
- W4382402725 modified "2023-09-26" @default.
- W4382402725 title "Learning to Modulate pre-trained Models in RL" @default.
- W4382402725 doi "https://doi.org/10.48550/arxiv.2306.14884" @default.
- W4382402725 hasPublicationYear "2023" @default.
- W4382402725 type Work @default.
- W4382402725 citedByCount "0" @default.
- W4382402725 crossrefType "posted-content" @default.
- W4382402725 hasAuthorship W4382402725A5034004608 @default.
- W4382402725 hasAuthorship W4382402725A5043910056 @default.
- W4382402725 hasAuthorship W4382402725A5052543224 @default.
- W4382402725 hasAuthorship W4382402725A5053148274 @default.
- W4382402725 hasAuthorship W4382402725A5070276097 @default.
- W4382402725 hasBestOaLocation W43824027251 @default.
- W4382402725 hasConcept C119857082 @default.
- W4382402725 hasConcept C120665830 @default.
- W4382402725 hasConcept C121332964 @default.
- W4382402725 hasConcept C13280743 @default.
- W4382402725 hasConcept C138885662 @default.
- W4382402725 hasConcept C139807058 @default.
- W4382402725 hasConcept C154945302 @default.
- W4382402725 hasConcept C185798385 @default.
- W4382402725 hasConcept C205649164 @default.
- W4382402725 hasConcept C41008148 @default.
- W4382402725 hasConcept C41895202 @default.
- W4382402725 hasConcept C7149132 @default.
- W4382402725 hasConcept C97541855 @default.
- W4382402725 hasConceptScore W4382402725C119857082 @default.
- W4382402725 hasConceptScore W4382402725C120665830 @default.
- W4382402725 hasConceptScore W4382402725C121332964 @default.
- W4382402725 hasConceptScore W4382402725C13280743 @default.
- W4382402725 hasConceptScore W4382402725C138885662 @default.
- W4382402725 hasConceptScore W4382402725C139807058 @default.
- W4382402725 hasConceptScore W4382402725C154945302 @default.
- W4382402725 hasConceptScore W4382402725C185798385 @default.
- W4382402725 hasConceptScore W4382402725C205649164 @default.
- W4382402725 hasConceptScore W4382402725C41008148 @default.
- W4382402725 hasConceptScore W4382402725C41895202 @default.
- W4382402725 hasConceptScore W4382402725C7149132 @default.
- W4382402725 hasConceptScore W4382402725C97541855 @default.
- W4382402725 hasLocation W43824027251 @default.
- W4382402725 hasOpenAccess W4382402725 @default.
- W4382402725 hasPrimaryLocation W43824027251 @default.
- W4382402725 hasRelatedWork W2902332709 @default.
- W4382402725 hasRelatedWork W3021857098 @default.
- W4382402725 hasRelatedWork W3022038857 @default.
- W4382402725 hasRelatedWork W3095449511 @default.
- W4382402725 hasRelatedWork W3132110306 @default.
- W4382402725 hasRelatedWork W3206856753 @default.
- W4382402725 hasRelatedWork W4224329779 @default.
- W4382402725 hasRelatedWork W4295936673 @default.
- W4382402725 hasRelatedWork W4319083788 @default.
- W4382402725 hasRelatedWork W4286900340 @default.
- W4382402725 isParatext "false" @default.
- W4382402725 isRetracted "false" @default.
- W4382402725 workType "article" @default.