Matches in SemOpenAlex for { <https://semopenalex.org/work/W3107370812> ?p ?o ?g. }
- W3107370812 abstract "Deep reinforcement learning has shown remarkable success in the past few years. Highly complex sequential decision making problems have been solved in tasks such as game playing and robotics. Unfortunately, the sample complexity of most deep reinforcement learning methods is high, precluding their use in some important applications. Model-based reinforcement learning creates an explicit model of the environment dynamics to reduce the need for environment samples. Current deep learning methods use high-capacity networks to solve high-dimensional problems. Unfortunately, high-capacity models typically require many samples, negating the potential benefit of lower sample complexity in model-based methods. A challenge for deep model-based methods is therefore to achieve high predictive power while maintaining low sample complexity. In recent years, many model-based methods have been introduced to address this challenge. In this paper, we survey the contemporary model-based landscape. First we discuss definitions and relations to other fields. We propose a taxonomy based on three approaches: using explicit planning on given transitions, using explicit planning on learned transitions, and end-to-end learning of both planning and transitions. We use these approaches to organize a comprehensive overview of important recent developments such as latent models. We describe methods and benchmarks, and we suggest directions for future work for each of the approaches. Among promising research directions are curriculum learning, uncertainty modeling, and use of latent models for transfer learning." @default.
- W3107370812 created "2020-12-07" @default.
- W3107370812 creator A5054569164 @default.
- W3107370812 creator A5062774048 @default.
- W3107370812 creator A5085542421 @default.
- W3107370812 date "2020-08-11" @default.
- W3107370812 modified "2023-09-27" @default.
- W3107370812 title "Deep Model-Based Reinforcement Learning for High-Dimensional Problems, a Survey" @default.
- W3107370812 cites W1485009520 @default.
- W3107370812 cites W1500431805 @default.
- W3107370812 cites W1506806321 @default.
- W3107370812 cites W1515019551 @default.
- W3107370812 cites W1555915743 @default.
- W3107370812 cites W1559956479 @default.
- W3107370812 cites W1570233100 @default.
- W3107370812 cites W1579853615 @default.
- W3107370812 cites W16011919 @default.
- W3107370812 cites W1625390266 @default.
- W3107370812 cites W1636614024 @default.
- W3107370812 cites W1757796397 @default.
- W3107370812 cites W1810943226 @default.
- W3107370812 cites W1882937458 @default.
- W3107370812 cites W1959608418 @default.
- W3107370812 cites W1969483458 @default.
- W3107370812 cites W1971129545 @default.
- W3107370812 cites W1971439026 @default.
- W3107370812 cites W1977655452 @default.
- W3107370812 cites W1977989560 @default.
- W3107370812 cites W1980035368 @default.
- W3107370812 cites W1982678075 @default.
- W3107370812 cites W1987150989 @default.
- W3107370812 cites W2014932765 @default.
- W3107370812 cites W2035033474 @default.
- W3107370812 cites W2037090034 @default.
- W3107370812 cites W2084920657 @default.
- W3107370812 cites W2087617385 @default.
- W3107370812 cites W2102256448 @default.
- W3107370812 cites W2104733512 @default.
- W3107370812 cites W2107726111 @default.
- W3107370812 cites W2120501001 @default.
- W3107370812 cites W2121103318 @default.
- W3107370812 cites W2121863487 @default.
- W3107370812 cites W2122410182 @default.
- W3107370812 cites W2123491406 @default.
- W3107370812 cites W2126316555 @default.
- W3107370812 cites W2145339207 @default.
- W3107370812 cites W2150470619 @default.
- W3107370812 cites W2153039919 @default.
- W3107370812 cites W2156737235 @default.
- W3107370812 cites W2157803532 @default.
- W3107370812 cites W2158782408 @default.
- W3107370812 cites W2163605009 @default.
- W3107370812 cites W2165698076 @default.
- W3107370812 cites W2169209873 @default.
- W3107370812 cites W2173248099 @default.
- W3107370812 cites W2176950688 @default.
- W3107370812 cites W2194775991 @default.
- W3107370812 cites W2202140284 @default.
- W3107370812 cites W2275430110 @default.
- W3107370812 cites W2296073425 @default.
- W3107370812 cites W2341171179 @default.
- W3107370812 cites W2534314849 @default.
- W3107370812 cites W2580909119 @default.
- W3107370812 cites W2596142952 @default.
- W3107370812 cites W2606047872 @default.
- W3107370812 cites W2613603362 @default.
- W3107370812 cites W2736601468 @default.
- W3107370812 cites W2738675347 @default.
- W3107370812 cites W2747402019 @default.
- W3107370812 cites W2749807327 @default.
- W3107370812 cites W2752099845 @default.
- W3107370812 cites W2754517384 @default.
- W3107370812 cites W2761873684 @default.
- W3107370812 cites W2766447205 @default.
- W3107370812 cites W2774354230 @default.
- W3107370812 cites W2786019934 @default.
- W3107370812 cites W2795756076 @default.
- W3107370812 cites W2803928381 @default.
- W3107370812 cites W2807340089 @default.
- W3107370812 cites W2809824610 @default.
- W3107370812 cites W2890208753 @default.
- W3107370812 cites W2900152462 @default.
- W3107370812 cites W2902907165 @default.
- W3107370812 cites W2903552445 @default.
- W3107370812 cites W2914898814 @default.
- W3107370812 cites W2919614896 @default.
- W3107370812 cites W2920362155 @default.
- W3107370812 cites W2931903779 @default.
- W3107370812 cites W2932752459 @default.
- W3107370812 cites W2944895446 @default.
- W3107370812 cites W2952095743 @default.
- W3107370812 cites W2953708620 @default.
- W3107370812 cites W2954989419 @default.
- W3107370812 cites W2962712652 @default.
- W3107370812 cites W2962841471 @default.
- W3107370812 cites W2962872206 @default.
- W3107370812 cites W2962991582 @default.
- W3107370812 cites W2963072115 @default.
- W3107370812 cites W2963238274 @default.
- W3107370812 cites W2963262099 @default.