Matches in SemOpenAlex for { <https://semopenalex.org/work/W3035155163> ?p ?o ?g. }
Showing items 1 to 100 of 100 with 100 items per page.
- W3035155163 abstract "We propose a method for optimizing the random action selection of the ε-greedy policy to facilitate more efficient exploration of an environment by a reinforcement learning agent. Our directed ε-greedy policy selects actions with a biased probability where some actions are more likely to be selected than others. The probability distribution used for selecting random actions is the one that tends to lead to actions which increase the agent's uncertainty about its environment. The agent's uncertainty is measured by the error in self-supervised prediction of future environment states at the pixel level, given the previous states and the probabilities of next actions. By propagating the reverse gradient from the future state predictor model to a model generating probability distributions from random noise we create an end-to-end trainable model which learns to generate such action probability distributions for ε-greedy, so as to facilitate directed exploration of the environment. We evaluate our method in two environments: Minecraft and Super Mario Bros. The directed ε-greedy policy achieves an efficient curiosity-driven exploration without the use of any intrinsic reward function, outperforming vanilla ε-greedy exploration, softmax exploration and exploration using intrinsic rewards." @default.
- W3035155163 created "2020-06-19" @default.
- W3035155163 creator A5047221264 @default.
- W3035155163 creator A5072314554 @default.
- W3035155163 creator A5078107720 @default.
- W3035155163 date "2020-07-01" @default.
- W3035155163 modified "2023-09-26" @default.
- W3035155163 title "Directed Exploration Via Learnable Probability Distribution For Random Action Selection" @default.
- W3035155163 cites W135283623 @default.
- W3035155163 cites W1863227302 @default.
- W3035155163 cites W2061868368 @default.
- W3035155163 cites W2099044475 @default.
- W3035155163 cites W2099471712 @default.
- W3035155163 cites W2116459397 @default.
- W3035155163 cites W2139612737 @default.
- W3035155163 cites W2145339207 @default.
- W3035155163 cites W2155968351 @default.
- W3035155163 cites W2160589914 @default.
- W3035155163 cites W2167489871 @default.
- W3035155163 cites W2173520492 @default.
- W3035155163 cites W2293667215 @default.
- W3035155163 cites W2417786368 @default.
- W3035155163 cites W2480004914 @default.
- W3035155163 cites W2614839826 @default.
- W3035155163 cites W2620974420 @default.
- W3035155163 cites W2811046100 @default.
- W3035155163 cites W2917052767 @default.
- W3035155163 cites W2962730405 @default.
- W3035155163 cites W2962938178 @default.
- W3035155163 cites W2963160877 @default.
- W3035155163 cites W2963276097 @default.
- W3035155163 cites W2963790038 @default.
- W3035155163 cites W2963826681 @default.
- W3035155163 cites W779494576 @default.
- W3035155163 doi "https://doi.org/10.1109/icme46284.2020.9102959" @default.
- W3035155163 hasPublicationYear "2020" @default.
- W3035155163 type Work @default.
- W3035155163 sameAs 3035155163 @default.
- W3035155163 citedByCount "0" @default.
- W3035155163 crossrefType "proceedings-article" @default.
- W3035155163 hasAuthorship W3035155163A5047221264 @default.
- W3035155163 hasAuthorship W3035155163A5072314554 @default.
- W3035155163 hasAuthorship W3035155163A5078107720 @default.
- W3035155163 hasConcept C105795698 @default.
- W3035155163 hasConcept C108583219 @default.
- W3035155163 hasConcept C11413529 @default.
- W3035155163 hasConcept C119857082 @default.
- W3035155163 hasConcept C121332964 @default.
- W3035155163 hasConcept C126255220 @default.
- W3035155163 hasConcept C149441793 @default.
- W3035155163 hasConcept C154945302 @default.
- W3035155163 hasConcept C166109690 @default.
- W3035155163 hasConcept C169760540 @default.
- W3035155163 hasConcept C188441871 @default.
- W3035155163 hasConcept C26760741 @default.
- W3035155163 hasConcept C2780791683 @default.
- W3035155163 hasConcept C33923547 @default.
- W3035155163 hasConcept C41008148 @default.
- W3035155163 hasConcept C51823790 @default.
- W3035155163 hasConcept C62520636 @default.
- W3035155163 hasConcept C81917197 @default.
- W3035155163 hasConcept C86803240 @default.
- W3035155163 hasConcept C97541855 @default.
- W3035155163 hasConceptScore W3035155163C105795698 @default.
- W3035155163 hasConceptScore W3035155163C108583219 @default.
- W3035155163 hasConceptScore W3035155163C11413529 @default.
- W3035155163 hasConceptScore W3035155163C119857082 @default.
- W3035155163 hasConceptScore W3035155163C121332964 @default.
- W3035155163 hasConceptScore W3035155163C126255220 @default.
- W3035155163 hasConceptScore W3035155163C149441793 @default.
- W3035155163 hasConceptScore W3035155163C154945302 @default.
- W3035155163 hasConceptScore W3035155163C166109690 @default.
- W3035155163 hasConceptScore W3035155163C169760540 @default.
- W3035155163 hasConceptScore W3035155163C188441871 @default.
- W3035155163 hasConceptScore W3035155163C26760741 @default.
- W3035155163 hasConceptScore W3035155163C2780791683 @default.
- W3035155163 hasConceptScore W3035155163C33923547 @default.
- W3035155163 hasConceptScore W3035155163C41008148 @default.
- W3035155163 hasConceptScore W3035155163C51823790 @default.
- W3035155163 hasConceptScore W3035155163C62520636 @default.
- W3035155163 hasConceptScore W3035155163C81917197 @default.
- W3035155163 hasConceptScore W3035155163C86803240 @default.
- W3035155163 hasConceptScore W3035155163C97541855 @default.
- W3035155163 hasLocation W30351551631 @default.
- W3035155163 hasOpenAccess W3035155163 @default.
- W3035155163 hasPrimaryLocation W30351551631 @default.
- W3035155163 hasRelatedWork W1587318060 @default.
- W3035155163 hasRelatedWork W2131054638 @default.
- W3035155163 hasRelatedWork W3022038857 @default.
- W3035155163 hasRelatedWork W3035155163 @default.
- W3035155163 hasRelatedWork W3126156081 @default.
- W3035155163 hasRelatedWork W4221031036 @default.
- W3035155163 hasRelatedWork W4298167479 @default.
- W3035155163 hasRelatedWork W4313549111 @default.
- W3035155163 hasRelatedWork W4319083788 @default.
- W3035155163 hasRelatedWork W66717747 @default.
- W3035155163 isParatext "false" @default.
- W3035155163 isRetracted "false" @default.
- W3035155163 magId "3035155163" @default.
- W3035155163 workType "article" @default.
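
The abstract above describes replacing the uniform random draw of ε-greedy with a draw from a biased action distribution. The following is only a minimal sketch of that selection step, not the authors' implementation: the function name `directed_epsilon_greedy` and its parameters are hypothetical, and the learned model that generates the distribution from random noise is not shown, so `action_probs` is simply assumed to be given.

```python
import random


def directed_epsilon_greedy(q_values, action_probs, epsilon, rng):
    """Return an action index (hypothetical helper, for illustration only).

    With probability 1 - epsilon the greedy action (highest Q-value) is taken.
    Otherwise the action is sampled from `action_probs` rather than uniformly,
    which is the "directed" part described in the abstract. How `action_probs`
    is produced is outside this sketch and assumed to be supplied by the caller.
    """
    if len(q_values) != len(action_probs):
        raise ValueError("q_values and action_probs must have the same length")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")

    n_actions = len(q_values)
    if rng.random() < epsilon:
        # Exploration step: biased sampling instead of a uniform draw.
        # random.choices accepts unnormalised non-negative weights.
        return rng.choices(range(n_actions), weights=action_probs)[0]

    # Exploitation step: ordinary greedy choice.
    return max(range(n_actions), key=lambda a: q_values[a])


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.1, 0.9, 0.3]
    probs = [0.7, 0.1, 0.2]  # assumed output of some distribution generator
    picks = [directed_epsilon_greedy(q, probs, 0.5, rng) for _ in range(10)]
    print(picks)
```

Setting `action_probs` to equal weights recovers vanilla ε-greedy, which makes the two policies easy to compare under the same interface.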