Matches in SemOpenAlex for { <https://semopenalex.org/work/W4289743246> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
- W4289743246 abstract "In this paper we introduce a simple approach for exploration in reinforcement learning (RL) that allows us to develop theoretically justified algorithms in the tabular case but that is also extendable to settings where function approximation is required. Our approach is based on the successor representation (SR), which was originally introduced as a representation defining state generalization by the similarity of successor states. Here we show that the norm of the SR, while it is being learned, can be used as a reward bonus to incentivize exploration. In order to better understand this transient behavior of the norm of the SR we introduce the substochastic successor representation (SSR) and we show that it implicitly counts the number of times each state (or feature) has been observed. We use this result to introduce an algorithm that performs as well as some theoretically sample-efficient approaches. Finally, we extend these ideas to a deep RL algorithm and show that it achieves state-of-the-art performance in Atari 2600 games when in a low sample-complexity regime." @default.
- W4289743246 created "2022-08-04" @default.
- W4289743246 creator A5001087292 @default.
- W4289743246 creator A5081163135 @default.
- W4289743246 creator A5085413987 @default.
- W4289743246 date "2018-07-30" @default.
- W4289743246 modified "2023-09-23" @default.
- W4289743246 title "Count-Based Exploration with the Successor Representation" @default.
- W4289743246 doi "https://doi.org/10.48550/arxiv.1807.11622" @default.
- W4289743246 hasPublicationYear "2018" @default.
- W4289743246 type Work @default.
- W4289743246 citedByCount "0" @default.
- W4289743246 crossrefType "posted-content" @default.
- W4289743246 hasAuthorship W4289743246A5001087292 @default.
- W4289743246 hasAuthorship W4289743246A5081163135 @default.
- W4289743246 hasAuthorship W4289743246A5085413987 @default.
- W4289743246 hasBestOaLocation W42897432461 @default.
- W4289743246 hasConcept C103278499 @default.
- W4289743246 hasConcept C11413529 @default.
- W4289743246 hasConcept C115961682 @default.
- W4289743246 hasConcept C134306372 @default.
- W4289743246 hasConcept C154945302 @default.
- W4289743246 hasConcept C177148314 @default.
- W4289743246 hasConcept C17744445 @default.
- W4289743246 hasConcept C191795146 @default.
- W4289743246 hasConcept C199539241 @default.
- W4289743246 hasConcept C2776359362 @default.
- W4289743246 hasConcept C2778445095 @default.
- W4289743246 hasConcept C33923547 @default.
- W4289743246 hasConcept C41008148 @default.
- W4289743246 hasConcept C48103436 @default.
- W4289743246 hasConcept C75306776 @default.
- W4289743246 hasConcept C80444323 @default.
- W4289743246 hasConcept C94625758 @default.
- W4289743246 hasConcept C97541855 @default.
- W4289743246 hasConceptScore W4289743246C103278499 @default.
- W4289743246 hasConceptScore W4289743246C11413529 @default.
- W4289743246 hasConceptScore W4289743246C115961682 @default.
- W4289743246 hasConceptScore W4289743246C134306372 @default.
- W4289743246 hasConceptScore W4289743246C154945302 @default.
- W4289743246 hasConceptScore W4289743246C177148314 @default.
- W4289743246 hasConceptScore W4289743246C17744445 @default.
- W4289743246 hasConceptScore W4289743246C191795146 @default.
- W4289743246 hasConceptScore W4289743246C199539241 @default.
- W4289743246 hasConceptScore W4289743246C2776359362 @default.
- W4289743246 hasConceptScore W4289743246C2778445095 @default.
- W4289743246 hasConceptScore W4289743246C33923547 @default.
- W4289743246 hasConceptScore W4289743246C41008148 @default.
- W4289743246 hasConceptScore W4289743246C48103436 @default.
- W4289743246 hasConceptScore W4289743246C75306776 @default.
- W4289743246 hasConceptScore W4289743246C80444323 @default.
- W4289743246 hasConceptScore W4289743246C94625758 @default.
- W4289743246 hasConceptScore W4289743246C97541855 @default.
- W4289743246 hasLocation W42897432461 @default.
- W4289743246 hasLocation W42897432462 @default.
- W4289743246 hasOpenAccess W4289743246 @default.
- W4289743246 hasPrimaryLocation W42897432461 @default.
- W4289743246 hasRelatedWork W2101355568 @default.
- W4289743246 hasRelatedWork W2440926996 @default.
- W4289743246 hasRelatedWork W2950141738 @default.
- W4289743246 hasRelatedWork W2962717849 @default.
- W4289743246 hasRelatedWork W2979869797 @default.
- W4289743246 hasRelatedWork W3049166411 @default.
- W4289743246 hasRelatedWork W3166948032 @default.
- W4289743246 hasRelatedWork W3169292790 @default.
- W4289743246 hasRelatedWork W4287688416 @default.
- W4289743246 hasRelatedWork W4296512964 @default.
- W4289743246 isParatext "false" @default.
- W4289743246 isRetracted "false" @default.
- W4289743246 workType "article" @default.