Matches in SemOpenAlex for { <https://semopenalex.org/work/W3006773485> ?p ?o ?g. }
- W3006773485 abstract "Exploration in sparse reward environments remains one of the key challenges of model-free reinforcement learning. Instead of solely relying on extrinsic rewards provided by the environment, many state-of-the-art methods use intrinsic rewards to encourage exploration. However, we show that existing methods fall short in procedurally-generated environments where an agent is unlikely to visit a state more than once. We propose a novel type of intrinsic reward which encourages the agent to take actions that lead to significant changes in its learned state representation. We evaluate our method on multiple challenging procedurally-generated tasks in MiniGrid, as well as on tasks with high-dimensional observations used in prior work. Our experiments demonstrate that this approach is more sample efficient than existing exploration methods, particularly for procedurally-generated MiniGrid environments. Furthermore, we analyze the learned behavior as well as the intrinsic reward received by our agent. In contrast to previous approaches, our intrinsic reward does not diminish during the course of training and it rewards the agent substantially more for interacting with objects that it can control." @default.
- W3006773485 created "2020-03-06" @default.
- W3006773485 creator A5018702533 @default.
- W3006773485 creator A5079315903 @default.
- W3006773485 date "2020-02-27" @default.
- W3006773485 modified "2023-10-01" @default.
- W3006773485 title "RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments" @default.
- W3006773485 cites W1480053368 @default.
- W3006773485 cites W1554318634 @default.
- W3006773485 cites W172298727 @default.
- W3006773485 cites W1863227302 @default.
- W3006773485 cites W1988526405 @default.
- W3006773485 cites W2000514530 @default.
- W3006773485 cites W2020920737 @default.
- W3006773485 cites W2034806191 @default.
- W3006773485 cites W2061868368 @default.
- W3006773485 cites W2078150668 @default.
- W3006773485 cites W2101524054 @default.
- W3006773485 cites W2247242831 @default.
- W3006773485 cites W2257979135 @default.
- W3006773485 cites W2480004914 @default.
- W3006773485 cites W2551887912 @default.
- W3006773485 cites W2561776174 @default.
- W3006773485 cites W2593766708 @default.
- W3006773485 cites W2614839826 @default.
- W3006773485 cites W2766447205 @default.
- W3006773485 cites W2768908787 @default.
- W3006773485 cites W2787666871 @default.
- W3006773485 cites W2796979132 @default.
- W3006773485 cites W2797527950 @default.
- W3006773485 cites W2807192342 @default.
- W3006773485 cites W2809668646 @default.
- W3006773485 cites W2891790128 @default.
- W3006773485 cites W2898436992 @default.
- W3006773485 cites W2905034002 @default.
- W3006773485 cites W2914261249 @default.
- W3006773485 cites W2914431475 @default.
- W3006773485 cites W2922388521 @default.
- W3006773485 cites W2944472880 @default.
- W3006773485 cites W2962715211 @default.
- W3006773485 cites W2962749646 @default.
- W3006773485 cites W2963090522 @default.
- W3006773485 cites W2963126744 @default.
- W3006773485 cites W2963160877 @default.
- W3006773485 cites W2963177395 @default.
- W3006773485 cites W2963248502 @default.
- W3006773485 cites W2963276097 @default.
- W3006773485 cites W2963285578 @default.
- W3006773485 cites W2963359646 @default.
- W3006773485 cites W2963403143 @default.
- W3006773485 cites W2963438456 @default.
- W3006773485 cites W2963639957 @default.
- W3006773485 cites W2963680188 @default.
- W3006773485 cites W2963751259 @default.
- W3006773485 cites W2963761387 @default.
- W3006773485 cites W2963871073 @default.
- W3006773485 cites W2963938771 @default.
- W3006773485 cites W2963985863 @default.
- W3006773485 cites W2964009285 @default.
- W3006773485 cites W2964043796 @default.
- W3006773485 cites W2964067469 @default.
- W3006773485 cites W2964083594 @default.
- W3006773485 cites W2964174623 @default.
- W3006773485 cites W2966556569 @default.
- W3006773485 cites W2980077985 @default.
- W3006773485 cites W2997289589 @default.
- W3006773485 cites W779494576 @default.
- W3006773485 hasPublicationYear "2020" @default.
- W3006773485 type Work @default.
- W3006773485 sameAs 3006773485 @default.
- W3006773485 citedByCount "4" @default.
- W3006773485 countsByYear W30067734852020 @default.
- W3006773485 countsByYear W30067734852021 @default.
- W3006773485 crossrefType "posted-content" @default.
- W3006773485 hasAuthorship W3006773485A5018702533 @default.
- W3006773485 hasAuthorship W3006773485A5079315903 @default.
- W3006773485 hasConcept C107457646 @default.
- W3006773485 hasConcept C154945302 @default.
- W3006773485 hasConcept C17744445 @default.
- W3006773485 hasConcept C199539241 @default.
- W3006773485 hasConcept C26517878 @default.
- W3006773485 hasConcept C2775924081 @default.
- W3006773485 hasConcept C2776359362 @default.
- W3006773485 hasConcept C38652104 @default.
- W3006773485 hasConcept C41008148 @default.
- W3006773485 hasConcept C94625758 @default.
- W3006773485 hasConcept C97541855 @default.
- W3006773485 hasConceptScore W3006773485C107457646 @default.
- W3006773485 hasConceptScore W3006773485C154945302 @default.
- W3006773485 hasConceptScore W3006773485C17744445 @default.
- W3006773485 hasConceptScore W3006773485C199539241 @default.
- W3006773485 hasConceptScore W3006773485C26517878 @default.
- W3006773485 hasConceptScore W3006773485C2775924081 @default.
- W3006773485 hasConceptScore W3006773485C2776359362 @default.
- W3006773485 hasConceptScore W3006773485C38652104 @default.
- W3006773485 hasConceptScore W3006773485C41008148 @default.
- W3006773485 hasConceptScore W3006773485C94625758 @default.
- W3006773485 hasConceptScore W3006773485C97541855 @default.
- W3006773485 hasLocation W30067734851 @default.
- W3006773485 hasOpenAccess W3006773485 @default.
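
The abstract recorded above describes an intrinsic reward that favors actions leading to significant changes in the agent's learned state representation. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes the change is measured as the Euclidean distance between the embeddings of consecutive states, a measure the abstract does not specify. The function name `impact_reward` and the plain-list embeddings are hypothetical.

```python
import math
from typing import Sequence


def impact_reward(phi_t: Sequence[float], phi_next: Sequence[float]) -> float:
    """Hypothetical helper: reward equal to the size of the embedding change.

    phi_t and phi_next stand for the learned representations of the current
    and next state. Using the L2 norm of their difference is an assumption
    made for illustration; the abstract only says the reward encourages
    "significant changes" in the learned state representation.
    """
    if len(phi_t) != len(phi_next):
        raise ValueError("embeddings must have the same dimension")
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(phi_t, phi_next)))


if __name__ == "__main__":
    # An action that leaves the representation unchanged earns no reward.
    print(impact_reward([1.0, 2.0], [1.0, 2.0]))
    # An action that moves the representation earns a positive reward.
    print(impact_reward([0.0, 0.0], [3.0, 4.0]))
```

Under this assumption the reward depends on how far a single transition moves the representation rather than on how often a state has been visited, which fits the abstract's point that a state in a procedurally-generated environment is unlikely to be visited more than once.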