Matches in SemOpenAlex for { <https://semopenalex.org/work/W3048681560> ?p ?o ?g. }
- W3048681560 abstract "Deep reinforcement learning has shown remarkable success in the past few years. Highly complex sequential decision making problems have been solved in tasks such as game playing and robotics. Unfortunately, the sample complexity of most deep reinforcement learning methods is high, precluding their use in some important applications. Model-based reinforcement learning creates an explicit model of the environment dynamics to reduce the need for environment samples. Current deep learning methods use high-capacity networks to solve high-dimensional problems. Unfortunately, high-capacity models typically require many samples, negating the potential benefit of lower sample complexity in model-based methods. A challenge for deep model-based methods is therefore to achieve high predictive power while maintaining low sample complexity. In recent years, many model-based methods have been introduced to address this challenge. In this paper, we survey the contemporary model-based landscape. First we discuss definitions and relations to other fields. We propose a taxonomy based on three approaches: using explicit planning on given transitions, using explicit planning on learned transitions, and end-to-end learning of both planning and transitions. We use these approaches to organize a comprehensive overview of important recent developments such as latent models. We describe methods and benchmarks, and we suggest directions for future work for each of the approaches. Among promising research directions are curriculum learning, uncertainty modeling, and use of latent models for transfer learning." @default.
- W3048681560 created "2020-08-18" @default.
- W3048681560 creator A5054569164 @default.
- W3048681560 creator A5062774048 @default.
- W3048681560 creator A5085542421 @default.
- W3048681560 date "2020-08-11" @default.
- W3048681560 modified "2023-09-27" @default.
- W3048681560 title "Model-Based Deep Reinforcement Learning for High-Dimensional Problems, a Survey." @default.
- W3048681560 cites W1485009520 @default.
- W3048681560 cites W1491843047 @default.
- W3048681560 cites W1500431805 @default.
- W3048681560 cites W1506806321 @default.
- W3048681560 cites W1555005026 @default.
- W3048681560 cites W1559956479 @default.
- W3048681560 cites W1570233100 @default.
- W3048681560 cites W1579853615 @default.
- W3048681560 cites W16011919 @default.
- W3048681560 cites W1636614024 @default.
- W3048681560 cites W1714211023 @default.
- W3048681560 cites W1757796397 @default.
- W3048681560 cites W1810943226 @default.
- W3048681560 cites W1882937458 @default.
- W3048681560 cites W1926428750 @default.
- W3048681560 cites W1959608418 @default.
- W3048681560 cites W1969483458 @default.
- W3048681560 cites W1971129545 @default.
- W3048681560 cites W1971439026 @default.
- W3048681560 cites W1977655452 @default.
- W3048681560 cites W1977989560 @default.
- W3048681560 cites W1980035368 @default.
- W3048681560 cites W1982678075 @default.
- W3048681560 cites W2018705428 @default.
- W3048681560 cites W2056568601 @default.
- W3048681560 cites W2084920657 @default.
- W3048681560 cites W2087617385 @default.
- W3048681560 cites W2102256448 @default.
- W3048681560 cites W2104733512 @default.
- W3048681560 cites W2107726111 @default.
- W3048681560 cites W2118688707 @default.
- W3048681560 cites W2120501001 @default.
- W3048681560 cites W2121103318 @default.
- W3048681560 cites W2121517924 @default.
- W3048681560 cites W2121863487 @default.
- W3048681560 cites W2123491406 @default.
- W3048681560 cites W2145339207 @default.
- W3048681560 cites W2153039919 @default.
- W3048681560 cites W2163605009 @default.
- W3048681560 cites W2165698076 @default.
- W3048681560 cites W2173248099 @default.
- W3048681560 cites W2194775991 @default.
- W3048681560 cites W2202140284 @default.
- W3048681560 cites W2258731934 @default.
- W3048681560 cites W2296073425 @default.
- W3048681560 cites W2335959470 @default.
- W3048681560 cites W2341171179 @default.
- W3048681560 cites W2528489519 @default.
- W3048681560 cites W2534314849 @default.
- W3048681560 cites W2549891446 @default.
- W3048681560 cites W2578206533 @default.
- W3048681560 cites W2596142952 @default.
- W3048681560 cites W2604763608 @default.
- W3048681560 cites W2606047872 @default.
- W3048681560 cites W2613603362 @default.
- W3048681560 cites W2618097077 @default.
- W3048681560 cites W2736601468 @default.
- W3048681560 cites W2738675347 @default.
- W3048681560 cites W2747402019 @default.
- W3048681560 cites W2749807327 @default.
- W3048681560 cites W2754517384 @default.
- W3048681560 cites W2761873684 @default.
- W3048681560 cites W2773381986 @default.
- W3048681560 cites W2774354230 @default.
- W3048681560 cites W2781585732 @default.
- W3048681560 cites W2781726626 @default.
- W3048681560 cites W2786019934 @default.
- W3048681560 cites W2787074649 @default.
- W3048681560 cites W2789824229 @default.
- W3048681560 cites W2795756076 @default.
- W3048681560 cites W2803928381 @default.
- W3048681560 cites W2807340089 @default.
- W3048681560 cites W2809824610 @default.
- W3048681560 cites W2890208753 @default.
- W3048681560 cites W2892230114 @default.
- W3048681560 cites W2900152462 @default.
- W3048681560 cites W2902125520 @default.
- W3048681560 cites W2903552445 @default.
- W3048681560 cites W2913201078 @default.
- W3048681560 cites W2914898814 @default.
- W3048681560 cites W2919115771 @default.
- W3048681560 cites W2920362155 @default.
- W3048681560 cites W2931903779 @default.
- W3048681560 cites W2932752459 @default.
- W3048681560 cites W2944895446 @default.
- W3048681560 cites W2952095743 @default.
- W3048681560 cites W2953708620 @default.
- W3048681560 cites W2954989419 @default.
- W3048681560 cites W2962712652 @default.
- W3048681560 cites W2962872206 @default.
- W3048681560 cites W2962991582 @default.
- W3048681560 cites W2963074410 @default.
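
The entries above all follow one shape: a subject, a predicate, an object, and a graph marker. The following is a minimal Python sketch for reading such lines back into structured records. It assumes each entry has the form `- <subject> <predicate> <object> @<graph>.` and that `@default` names the graph. Both are inferred from this listing rather than from any SemOpenAlex specification, and the names `Quad`, `parse_line`, and `cited_works` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Quad(NamedTuple):
    """One listing entry: subject, predicate, object, graph (hypothetical type)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed line shape, inferred from the listing itself:
#   - <subject> <predicate> <object> @<graph>.
# The object group is greedy so that a quoted literal containing " @" is kept whole.
_LINE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\S+?)\.\s*$', re.DOTALL)


def parse_line(line: str) -> Optional[Quad]:
    """Parse one listing line; return None for lines that are not entries."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Drop the surrounding double quotes of a literal value, if present.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return Quad(subject, predicate, obj, graph)


def cited_works(lines: Iterable[str]) -> List[str]:
    """Collect the objects of every `cites` entry, in listing order."""
    quads = (parse_line(line) for line in lines)
    return [q.obj for q in quads if q is not None and q.predicate == "cites"]


if __name__ == "__main__":
    sample = [
        "Matches in SemOpenAlex for { <https://semopenalex.org/work/W3048681560> ?p ?o ?g. }",
        '- W3048681560 date "2020-08-11" @default.',
        "- W3048681560 cites W1485009520 @default.",
        "- W3048681560 cites W1491843047 @default.",
    ]
    print(cited_works(sample))
```

The header line is skipped because it does not begin with `- `, so the whole listing can be passed to `cited_works` without pre-filtering.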