Matches in SemOpenAlex for { <https://semopenalex.org/work/W2998497924> ?p ?o ?g. }
- W2998497924 abstract "We analyze the Gambler's problem, a simple reinforcement learning problem in which the gambler can double or lose each bet until the target is reached. This is an early example in the reinforcement learning textbook by Sutton and Barto (2018), who note an interesting pattern in the optimal value function, with high-frequency components and repeating non-smooth points, but do not investigate it further. We provide the exact formula for the optimal value function in both the discrete and the continuous cases. Simple as the problem may seem, the value function is pathological: it is fractal and self-similar, its derivative is either zero or infinity, and it cannot be written in terms of elementary functions. It is in fact one of the generalized Cantor functions, with a complexity that has so far been uncharted. Our analyses could provide insights into improving value function approximation, gradient-based algorithms, and Q-learning in real applications and implementations." @default.
- W2998497924 created "2020-01-10" @default.
- W2998497924 creator A5053471653 @default.
- W2998497924 creator A5055941214 @default.
- W2998497924 creator A5064686947 @default.
- W2998497924 creator A5065912071 @default.
- W2998497924 date "2019-12-31" @default.
- W2998497924 modified "2023-09-27" @default.
- W2998497924 title "The Gambler's Problem and Beyond." @default.
- W2998497924 cites W1201872488 @default.
- W2998497924 cites W1541012876 @default.
- W2998497924 cites W1547253971 @default.
- W2998497924 cites W1583314537 @default.
- W2998497924 cites W1646707810 @default.
- W2998497924 cites W1665890689 @default.
- W2998497924 cites W1679945064 @default.
- W2998497924 cites W1753768835 @default.
- W2998497924 cites W1971934487 @default.
- W2998497924 cites W2022593932 @default.
- W2998497924 cites W2032100464 @default.
- W2998497924 cites W2071554809 @default.
- W2998497924 cites W2073384958 @default.
- W2998497924 cites W2079599249 @default.
- W2998497924 cites W2107257053 @default.
- W2998497924 cites W2121863487 @default.
- W2998497924 cites W2145339207 @default.
- W2998497924 cites W2152706713 @default.
- W2998497924 cites W2168024904 @default.
- W2998497924 cites W2173248099 @default.
- W2998497924 cites W2329326227 @default.
- W2998497924 cites W2620671107 @default.
- W2998497924 cites W2896412913 @default.
- W2998497924 cites W2952896555 @default.
- W2998497924 cites W2964006217 @default.
- W2998497924 hasPublicationYear "2019" @default.
- W2998497924 type Work @default.
- W2998497924 sameAs 2998497924 @default.
- W2998497924 citedByCount "0" @default.
- W2998497924 crossrefType "posted-content" @default.
- W2998497924 hasAuthorship W2998497924A5053471653 @default.
- W2998497924 hasAuthorship W2998497924A5055941214 @default.
- W2998497924 hasAuthorship W2998497924A5064686947 @default.
- W2998497924 hasAuthorship W2998497924A5065912071 @default.
- W2998497924 hasConcept C105795698 @default.
- W2998497924 hasConcept C111472728 @default.
- W2998497924 hasConcept C134306372 @default.
- W2998497924 hasConcept C138885662 @default.
- W2998497924 hasConcept C14036430 @default.
- W2998497924 hasConcept C144237770 @default.
- W2998497924 hasConcept C14646407 @default.
- W2998497924 hasConcept C154945302 @default.
- W2998497924 hasConcept C2776291640 @default.
- W2998497924 hasConcept C2780586882 @default.
- W2998497924 hasConcept C2780813799 @default.
- W2998497924 hasConcept C33923547 @default.
- W2998497924 hasConcept C41008148 @default.
- W2998497924 hasConcept C41895202 @default.
- W2998497924 hasConcept C7321624 @default.
- W2998497924 hasConcept C78458016 @default.
- W2998497924 hasConcept C86803240 @default.
- W2998497924 hasConcept C97541855 @default.
- W2998497924 hasConceptScore W2998497924C105795698 @default.
- W2998497924 hasConceptScore W2998497924C111472728 @default.
- W2998497924 hasConceptScore W2998497924C134306372 @default.
- W2998497924 hasConceptScore W2998497924C138885662 @default.
- W2998497924 hasConceptScore W2998497924C14036430 @default.
- W2998497924 hasConceptScore W2998497924C144237770 @default.
- W2998497924 hasConceptScore W2998497924C14646407 @default.
- W2998497924 hasConceptScore W2998497924C154945302 @default.
- W2998497924 hasConceptScore W2998497924C2776291640 @default.
- W2998497924 hasConceptScore W2998497924C2780586882 @default.
- W2998497924 hasConceptScore W2998497924C2780813799 @default.
- W2998497924 hasConceptScore W2998497924C33923547 @default.
- W2998497924 hasConceptScore W2998497924C41008148 @default.
- W2998497924 hasConceptScore W2998497924C41895202 @default.
- W2998497924 hasConceptScore W2998497924C7321624 @default.
- W2998497924 hasConceptScore W2998497924C78458016 @default.
- W2998497924 hasConceptScore W2998497924C86803240 @default.
- W2998497924 hasConceptScore W2998497924C97541855 @default.
- W2998497924 hasLocation W29984979241 @default.
- W2998497924 hasOpenAccess W2998497924 @default.
- W2998497924 hasPrimaryLocation W29984979241 @default.
- W2998497924 hasRelatedWork W110565014 @default.
- W2998497924 hasRelatedWork W1595716122 @default.
- W2998497924 hasRelatedWork W1609488243 @default.
- W2998497924 hasRelatedWork W1986389067 @default.
- W2998497924 hasRelatedWork W2028744244 @default.
- W2998497924 hasRelatedWork W2035350435 @default.
- W2998497924 hasRelatedWork W2042811347 @default.
- W2998497924 hasRelatedWork W2250047028 @default.
- W2998497924 hasRelatedWork W2293493106 @default.
- W2998497924 hasRelatedWork W2314523682 @default.
- W2998497924 hasRelatedWork W2322871508 @default.
- W2998497924 hasRelatedWork W2333733655 @default.
- W2998497924 hasRelatedWork W2599085501 @default.
- W2998497924 hasRelatedWork W2889297910 @default.
- W2998497924 hasRelatedWork W2949863386 @default.
- W2998497924 hasRelatedWork W2964200481 @default.
- W2998497924 hasRelatedWork W2995347325 @default.
- W2998497924 hasRelatedWork W3139405606 @default.
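
The discrete Gambler's problem described in the abstract can be explored numerically. The following is a minimal sketch using plain value iteration, not the exact formula the paper derives. It assumes the usual textbook setup: integer capital, a goal of 100, a win probability of 0.4, no discounting, and a reward of 1 only on reaching the goal. The function name `gambler_value_iteration` and its parameters are illustrative and do not come from this record.

```python
def gambler_value_iteration(goal=100, p_win=0.4, tol=1e-12, max_sweeps=10_000):
    """Approximate the optimal value function of the discrete Gambler's problem.

    values[s] estimates the probability of reaching `goal` from capital `s`
    under optimal play. All parameter defaults are assumptions for illustration.
    """
    values = [0.0] * (goal + 1)
    values[goal] = 1.0  # reaching the goal counts as a win

    for _ in range(max_sweeps):
        delta = 0.0
        for capital in range(1, goal):
            # A stake may not exceed the capital or overshoot the goal.
            max_stake = min(capital, goal - capital)
            best = max(
                p_win * values[capital + stake]
                + (1.0 - p_win) * values[capital - stake]
                for stake in range(1, max_stake + 1)
            )
            delta = max(delta, abs(best - values[capital]))
            values[capital] = best  # in-place (Gauss-Seidel style) update
        if delta < tol:
            break
    return values


values = gambler_value_iteration()
# Print a few states to see the non-smooth shape of the value function.
for capital in (12, 25, 37, 50, 62, 75, 87):
    print(capital, round(values[capital], 6))
```

Plotting `values` against capital shows the repeating kinks that the abstract attributes to the fractal, self-similar structure of the optimal value function.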