Matches in SemOpenAlex for { <https://semopenalex.org/work/W3156679053> ?p ?o ?g. }
- W3156679053 abstract "Training with Reinforcement Learning requires a reward function that is used to guide the agent towards achieving its objective. However, designing smooth and well-behaved rewards is in general not trivial and requires significant human engineering effort. Generating rewards in a self-supervised way, by inspiring the agent with an intrinsic desire to learn and explore the environment, might induce more general behaviours. In this work, we propose a curiosity-based bonus as intrinsic reward for Reinforcement Learning, computed as the Bayesian surprise with respect to a latent state variable, learnt by reconstructing fixed random features. We extensively evaluate our model by measuring the agent's performance in terms of environment exploration, for continuous tasks, and looking at the game scores achieved, for video games. Our model is computationally cheap and empirically shows state-of-the-art performance on several problems. Furthermore, experimenting on an environment with stochastic actions, our approach proved to be the most resilient to simple stochasticity. Further visualization is available on the project webpage." @default.
- W3156679053 created "2021-04-26" @default.
- W3156679053 creator A5008371078 @default.
- W3156679053 creator A5024997865 @default.
- W3156679053 creator A5032558375 @default.
- W3156679053 creator A5071221406 @default.
- W3156679053 date "2021-04-15" @default.
- W3156679053 modified "2023-09-27" @default.
- W3156679053 title "Self-Supervised Exploration via Latent Bayesian Surprise" @default.
- W3156679053 cites W1771410628 @default.
- W3156679053 cites W1863227302 @default.
- W3156679053 cites W1959608418 @default.
- W3156679053 cites W2101524054 @default.
- W3156679053 cites W2120889539 @default.
- W3156679053 cites W2132221774 @default.
- W3156679053 cites W2145339207 @default.
- W3156679053 cites W2158782408 @default.
- W3156679053 cites W2561776174 @default.
- W3156679053 cites W2593766708 @default.
- W3156679053 cites W2614839826 @default.
- W3156679053 cites W2736601468 @default.
- W3156679053 cites W2753738274 @default.
- W3156679053 cites W2883182720 @default.
- W3156679053 cites W2890208753 @default.
- W3156679053 cites W2900152462 @default.
- W3156679053 cites W2949608212 @default.
- W3156679053 cites W2953772919 @default.
- W3156679053 cites W2962723954 @default.
- W3156679053 cites W2962902376 @default.
- W3156679053 cites W2963160877 @default.
- W3156679053 cites W2963276097 @default.
- W3156679053 cites W2963430173 @default.
- W3156679053 cites W2963639957 @default.
- W3156679053 cites W2963820385 @default.
- W3156679053 cites W2963864421 @default.
- W3156679053 cites W2964067469 @default.
- W3156679053 cites W2964174623 @default.
- W3156679053 cites W2964291307 @default.
- W3156679053 cites W2965435131 @default.
- W3156679053 cites W2981030070 @default.
- W3156679053 cites W2995298643 @default.
- W3156679053 cites W2996695841 @default.
- W3156679053 cites W3092804041 @default.
- W3156679053 cites W3113994363 @default.
- W3156679053 cites W3115502595 @default.
- W3156679053 cites W41238051 @default.
- W3156679053 cites W3034849606 @default.
- W3156679053 hasPublicationYear "2021" @default.
- W3156679053 type Work @default.
- W3156679053 sameAs 3156679053 @default.
- W3156679053 citedByCount "1" @default.
- W3156679053 countsByYear W31566790532021 @default.
- W3156679053 crossrefType "posted-content" @default.
- W3156679053 hasAuthorship W3156679053A5008371078 @default.
- W3156679053 hasAuthorship W3156679053A5024997865 @default.
- W3156679053 hasAuthorship W3156679053A5032558375 @default.
- W3156679053 hasAuthorship W3156679053A5071221406 @default.
- W3156679053 hasConcept C107673813 @default.
- W3156679053 hasConcept C111472728 @default.
- W3156679053 hasConcept C119857082 @default.
- W3156679053 hasConcept C134306372 @default.
- W3156679053 hasConcept C138885662 @default.
- W3156679053 hasConcept C154945302 @default.
- W3156679053 hasConcept C15744967 @default.
- W3156679053 hasConcept C182365436 @default.
- W3156679053 hasConcept C2780343955 @default.
- W3156679053 hasConcept C2780586882 @default.
- W3156679053 hasConcept C33435437 @default.
- W3156679053 hasConcept C33923547 @default.
- W3156679053 hasConcept C36464697 @default.
- W3156679053 hasConcept C41008148 @default.
- W3156679053 hasConcept C51167844 @default.
- W3156679053 hasConcept C77805123 @default.
- W3156679053 hasConcept C97541855 @default.
- W3156679053 hasConceptScore W3156679053C107673813 @default.
- W3156679053 hasConceptScore W3156679053C111472728 @default.
- W3156679053 hasConceptScore W3156679053C119857082 @default.
- W3156679053 hasConceptScore W3156679053C134306372 @default.
- W3156679053 hasConceptScore W3156679053C138885662 @default.
- W3156679053 hasConceptScore W3156679053C154945302 @default.
- W3156679053 hasConceptScore W3156679053C15744967 @default.
- W3156679053 hasConceptScore W3156679053C182365436 @default.
- W3156679053 hasConceptScore W3156679053C2780343955 @default.
- W3156679053 hasConceptScore W3156679053C2780586882 @default.
- W3156679053 hasConceptScore W3156679053C33435437 @default.
- W3156679053 hasConceptScore W3156679053C33923547 @default.
- W3156679053 hasConceptScore W3156679053C36464697 @default.
- W3156679053 hasConceptScore W3156679053C41008148 @default.
- W3156679053 hasConceptScore W3156679053C51167844 @default.
- W3156679053 hasConceptScore W3156679053C77805123 @default.
- W3156679053 hasConceptScore W3156679053C97541855 @default.
- W3156679053 hasLocation W31566790531 @default.
- W3156679053 hasOpenAccess W3156679053 @default.
- W3156679053 hasPrimaryLocation W31566790531 @default.
- W3156679053 hasRelatedWork W2183087363 @default.
- W3156679053 hasRelatedWork W2551887912 @default.
- W3156679053 hasRelatedWork W2593766708 @default.
- W3156679053 hasRelatedWork W2751973545 @default.
- W3156679053 hasRelatedWork W2953772919 @default.
- W3156679053 hasRelatedWork W2963391602 @default.