Matches in SemOpenAlex for { <https://semopenalex.org/work/W2417786368> ?p ?o ?g. }
- W2417786368 abstract "Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards." @default.
- W2417786368 created "2016-06-24" @default.
- W2417786368 creator A5003630193 @default.
- W2417786368 creator A5008171524 @default.
- W2417786368 creator A5009037058 @default.
- W2417786368 creator A5027941146 @default.
- W2417786368 creator A5049349154 @default.
- W2417786368 creator A5051784384 @default.
- W2417786368 date "2016-05-31" @default.
- W2417786368 modified "2023-09-27" @default.
- W2417786368 title "VIME: Variational Information Maximizing Exploration" @default.
- W2417786368 cites W1505937442 @default.
- W2417786368 cites W1515851193 @default.
- W2417786368 cites W1542595278 @default.
- W2417786368 cites W1550989509 @default.
- W2417786368 cites W1747856733 @default.
- W2417786368 cites W1850488217 @default.
- W2417786368 cites W1863227302 @default.
- W2417786368 cites W2000514530 @default.
- W2417786368 cites W2034806191 @default.
- W2417786368 cites W2047229728 @default.
- W2417786368 cites W2078150668 @default.
- W2417786368 cites W2108114251 @default.
- W2417786368 cites W2108677974 @default.
- W2417786368 cites W2109169869 @default.
- W2417786368 cites W2114275913 @default.
- W2417786368 cites W2118688707 @default.
- W2417786368 cites W2119717200 @default.
- W2417786368 cites W2120889539 @default.
- W2417786368 cites W2124035349 @default.
- W2417786368 cites W2124352385 @default.
- W2417786368 cites W2145339207 @default.
- W2417786368 cites W2160589914 @default.
- W2417786368 cites W2167117957 @default.
- W2417786368 cites W2181068523 @default.
- W2417786368 cites W21891419 @default.
- W2417786368 cites W2280163991 @default.
- W2417786368 cites W2342662072 @default.
- W2417786368 cites W2412288866 @default.
- W2417786368 cites W2415726935 @default.
- W2417786368 cites W2489939061 @default.
- W2417786368 cites W2949608212 @default.
- W2417786368 cites W2950096292 @default.
- W2417786368 cites W2951266961 @default.
- W2417786368 cites W2951595529 @default.
- W2417786368 cites W2964121744 @default.
- W2417786368 cites W3100183562 @default.
- W2417786368 cites W3123298421 @default.
- W2417786368 cites W326419249 @default.
- W2417786368 cites W779494576 @default.
- W2417786368 hasPublicationYear "2016" @default.
- W2417786368 type Work @default.
- W2417786368 sameAs 2417786368 @default.
- W2417786368 citedByCount "155" @default.
- W2417786368 countsByYear W24177863682016 @default.
- W2417786368 countsByYear W24177863682017 @default.
- W2417786368 countsByYear W24177863682018 @default.
- W2417786368 countsByYear W24177863682019 @default.
- W2417786368 countsByYear W24177863682020 @default.
- W2417786368 countsByYear W24177863682021 @default.
- W2417786368 crossrefType "posted-content" @default.
- W2417786368 hasAuthorship W2417786368A5003630193 @default.
- W2417786368 hasAuthorship W2417786368A5008171524 @default.
- W2417786368 hasAuthorship W2417786368A5009037058 @default.
- W2417786368 hasAuthorship W2417786368A5027941146 @default.
- W2417786368 hasAuthorship W2417786368A5049349154 @default.
- W2417786368 hasAuthorship W2417786368A5051784384 @default.
- W2417786368 hasConcept C111919701 @default.
- W2417786368 hasConcept C11413529 @default.
- W2417786368 hasConcept C119857082 @default.
- W2417786368 hasConcept C126255220 @default.
- W2417786368 hasConcept C127705205 @default.
- W2417786368 hasConcept C13280743 @default.
- W2417786368 hasConcept C136197465 @default.
- W2417786368 hasConcept C154945302 @default.
- W2417786368 hasConcept C173801870 @default.
- W2417786368 hasConcept C185798385 @default.
- W2417786368 hasConcept C205649164 @default.
- W2417786368 hasConcept C2776330181 @default.
- W2417786368 hasConcept C33923547 @default.
- W2417786368 hasConcept C41008148 @default.
- W2417786368 hasConcept C48044578 @default.
- W2417786368 hasConcept C51823790 @default.
- W2417786368 hasConcept C77088390 @default.
- W2417786368 hasConcept C97541855 @default.
- W2417786368 hasConceptScore W2417786368C111919701 @default.
- W2417786368 hasConceptScore W2417786368C11413529 @default.
- W2417786368 hasConceptScore W2417786368C119857082 @default.
- W2417786368 hasConceptScore W2417786368C126255220 @default.
- W2417786368 hasConceptScore W2417786368C127705205 @default.
- W2417786368 hasConceptScore W2417786368C13280743 @default.
- W2417786368 hasConceptScore W2417786368C136197465 @default.
- W2417786368 hasConceptScore W2417786368C154945302 @default.
- W2417786368 hasConceptScore W2417786368C173801870 @default.
- W2417786368 hasConceptScore W2417786368C185798385 @default.
- W2417786368 hasConceptScore W2417786368C205649164 @default.
- W2417786368 hasConceptScore W2417786368C2776330181 @default.
- W2417786368 hasConceptScore W2417786368C33923547 @default.
- W2417786368 hasConceptScore W2417786368C41008148 @default.
- W2417786368 hasConceptScore W2417786368C48044578 @default.