Matches in SemOpenAlex for { <https://semopenalex.org/work/W3213132814> ?p ?o ?g. }
- W3213132814 abstract "The study of generalisation in deep Reinforcement Learning (RL) aims to produce RL algorithms whose policies generalise well to novel unseen situations at deployment time, avoiding overfitting to their training environments. Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios, where the environment will be diverse, dynamic and unpredictable. This survey is an overview of this nascent field. We provide a unifying formalism and terminology for discussing different generalisation problems, building upon previous works. We go on to categorise existing benchmarks for generalisation, as well as current methods for tackling the generalisation problem. Finally, we provide a critical discussion of the current state of the field, including recommendations for future work. Among other conclusions, we argue that taking a purely procedural content generation approach to benchmark design is not conducive to progress in generalisation, we suggest fast online adaptation and tackling RL-specific problems as some areas for future work on methods for generalisation, and we recommend building benchmarks in underexplored problem settings such as offline RL generalisation and reward-function variation." @default.
- W3213132814 created "2021-11-22" @default.
- W3213132814 creator A5023508792 @default.
- W3213132814 creator A5052068951 @default.
- W3213132814 creator A5078999974 @default.
- W3213132814 creator A5079315903 @default.
- W3213132814 date "2021-11-18" @default.
- W3213132814 modified "2023-09-27" @default.
- W3213132814 title "A Survey of Generalisation in Deep Reinforcement Learning" @default.
- W3213132814 cites W1537180453 @default.
- W3213132814 cites W1576539603 @default.
- W3213132814 cites W195596278 @default.
- W3213132814 cites W1997477668 @default.
- W3213132814 cites W2061562262 @default.
- W3213132814 cites W2104228245 @default.
- W3213132814 cites W2105078254 @default.
- W3213132814 cites W2128957716 @default.
- W3213132814 cites W2151554678 @default.
- W3213132814 cites W2158782408 @default.
- W3213132814 cites W2480004914 @default.
- W3213132814 cites W2550182557 @default.
- W3213132814 cites W2578206533 @default.
- W3213132814 cites W2602963933 @default.
- W3213132814 cites W2605102758 @default.
- W3213132814 cites W2736601468 @default.
- W3213132814 cites W2746314669 @default.
- W3213132814 cites W2753738274 @default.
- W3213132814 cites W2769112066 @default.
- W3213132814 cites W2781585732 @default.
- W3213132814 cites W2797527950 @default.
- W3213132814 cites W2805516822 @default.
- W3213132814 cites W2807340089 @default.
- W3213132814 cites W2809461852 @default.
- W3213132814 cites W2809668646 @default.
- W3213132814 cites W2893662673 @default.
- W3213132814 cites W2898436992 @default.
- W3213132814 cites W2901796235 @default.
- W3213132814 cites W2907502844 @default.
- W3213132814 cites W2907923173 @default.
- W3213132814 cites W2908460759 @default.
- W3213132814 cites W2912399346 @default.
- W3213132814 cites W2916826721 @default.
- W3213132814 cites W2944362136 @default.
- W3213132814 cites W2947217676 @default.
- W3213132814 cites W2953494151 @default.
- W3213132814 cites W2954378135 @default.
- W3213132814 cites W2954579883 @default.
- W3213132814 cites W2954996726 @default.
- W3213132814 cites W2962754721 @default.
- W3213132814 cites W2962764591 @default.
- W3213132814 cites W2962957005 @default.
- W3213132814 cites W2963120839 @default.
- W3213132814 cites W2963369679 @default.
- W3213132814 cites W2963390419 @default.
- W3213132814 cites W2963399829 @default.
- W3213132814 cites W2963403143 @default.
- W3213132814 cites W2963438456 @default.
- W3213132814 cites W2963611966 @default.
- W3213132814 cites W2963680188 @default.
- W3213132814 cites W2963884015 @default.
- W3213132814 cites W2964184826 @default.
- W3213132814 cites W2964915587 @default.
- W3213132814 cites W2966556569 @default.
- W3213132814 cites W2966684444 @default.
- W3213132814 cites W2970214542 @default.
- W3213132814 cites W2971705906 @default.
- W3213132814 cites W2973821523 @default.
- W3213132814 cites W2975112661 @default.
- W3213132814 cites W2976046470 @default.
- W3213132814 cites W2981030070 @default.
- W3213132814 cites W2981344907 @default.
- W3213132814 cites W2982316857 @default.
- W3213132814 cites W2990408532 @default.
- W3213132814 cites W2995264919 @default.
- W3213132814 cites W2995574330 @default.
- W3213132814 cites W2995638039 @default.
- W3213132814 cites W2996037775 @default.
- W3213132814 cites W2996070979 @default.
- W3213132814 cites W2996094825 @default.
- W3213132814 cites W2996148148 @default.
- W3213132814 cites W2996283175 @default.
- W3213132814 cites W2998557583 @default.
- W3213132814 cites W2999617596 @default.
- W3213132814 cites W3006409413 @default.
- W3213132814 cites W3007769740 @default.
- W3213132814 cites W3013146177 @default.
- W3213132814 cites W3013821552 @default.
- W3213132814 cites W3016525976 @default.
- W3213132814 cites W3017374003 @default.
- W3213132814 cites W3020277389 @default.
- W3213132814 cites W3021796787 @default.
- W3213132814 cites W3022566517 @default.
- W3213132814 cites W3029947299 @default.
- W3213132814 cites W3034218175 @default.
- W3213132814 cites W3034453920 @default.
- W3213132814 cites W3034786558 @default.
- W3213132814 cites W3034848825 @default.
- W3213132814 cites W3034932139 @default.
- W3213132814 cites W3034946435 @default.
- W3213132814 cites W3035041226 @default.