Matches in SemOpenAlex for { <https://semopenalex.org/work/W2952023104> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W2952023104 abstract "Since their introduction a year ago, distributional approaches to reinforcement learning (distributional RL) have produced strong results relative to the standard approach which models expected values (expected RL). However, aside from convergence guarantees, there have been few theoretical results investigating the reasons behind the improvements distributional RL provides. In this paper we begin the investigation into this fundamental question by analyzing the differences in the tabular, linear approximation, and non-linear approximation settings. We prove that in many realizations of the tabular and linear approximation settings, distributional RL behaves exactly the same as expected RL. In cases where the two methods behave differently, distributional RL can in fact hurt performance when it does not induce identical behaviour. We then continue with an empirical analysis comparing distributional and expected RL methods in control settings with non-linear approximators to tease apart where the improvements from distributional RL methods are coming from." @default.
- W2952023104 created "2019-06-27" @default.
- W2952023104 creator A5001087292 @default.
- W2952023104 creator A5068291173 @default.
- W2952023104 creator A5089486474 @default.
- W2952023104 date "2019-01-30" @default.
- W2952023104 modified "2023-09-27" @default.
- W2952023104 title "A Comparative Analysis of Expected and Distributional Reinforcement Learning" @default.
- W2952023104 cites W2032916024 @default.
- W2952023104 cites W2106261932 @default.
- W2952023104 cites W2619903301 @default.
- W2952023104 cites W2739748921 @default.
- W2952023104 cites W2788086877 @default.
- W2952023104 cites W2803308811 @default.
- W2952023104 cites W2808399504 @default.
- W2952023104 cites W2953083372 @default.
- W2952023104 cites W2963423916 @default.
- W2952023104 cites W2963757175 @default.
- W2952023104 cites W2964291307 @default.
- W2952023104 cites W2964331425 @default.
- W2952023104 cites W3139377883 @default.
- W2952023104 hasPublicationYear "2019" @default.
- W2952023104 type Work @default.
- W2952023104 sameAs 2952023104 @default.
- W2952023104 citedByCount "19" @default.
- W2952023104 countsByYear W29520231042018 @default.
- W2952023104 countsByYear W29520231042019 @default.
- W2952023104 countsByYear W29520231042020 @default.
- W2952023104 countsByYear W29520231042021 @default.
- W2952023104 crossrefType "posted-content" @default.
- W2952023104 hasAuthorship W2952023104A5001087292 @default.
- W2952023104 hasAuthorship W2952023104A5068291173 @default.
- W2952023104 hasAuthorship W2952023104A5089486474 @default.
- W2952023104 hasConcept C126255220 @default.
- W2952023104 hasConcept C149782125 @default.
- W2952023104 hasConcept C154945302 @default.
- W2952023104 hasConcept C162324750 @default.
- W2952023104 hasConcept C2777303404 @default.
- W2952023104 hasConcept C33923547 @default.
- W2952023104 hasConcept C41008148 @default.
- W2952023104 hasConcept C50522688 @default.
- W2952023104 hasConcept C97541855 @default.
- W2952023104 hasConceptScore W2952023104C126255220 @default.
- W2952023104 hasConceptScore W2952023104C149782125 @default.
- W2952023104 hasConceptScore W2952023104C154945302 @default.
- W2952023104 hasConceptScore W2952023104C162324750 @default.
- W2952023104 hasConceptScore W2952023104C2777303404 @default.
- W2952023104 hasConceptScore W2952023104C33923547 @default.
- W2952023104 hasConceptScore W2952023104C41008148 @default.
- W2952023104 hasConceptScore W2952023104C50522688 @default.
- W2952023104 hasConceptScore W2952023104C97541855 @default.
- W2952023104 hasLocation W29520231041 @default.
- W2952023104 hasOpenAccess W2952023104 @default.
- W2952023104 hasPrimaryLocation W29520231041 @default.
- W2952023104 hasRelatedWork W1641379095 @default.
- W2952023104 hasRelatedWork W1757796397 @default.
- W2952023104 hasRelatedWork W2119567691 @default.
- W2952023104 hasRelatedWork W2121863487 @default.
- W2952023104 hasRelatedWork W2145339207 @default.
- W2952023104 hasRelatedWork W2155968351 @default.
- W2952023104 hasRelatedWork W2158782408 @default.
- W2952023104 hasRelatedWork W2173248099 @default.
- W2952023104 hasRelatedWork W2173564293 @default.
- W2952023104 hasRelatedWork W2257979135 @default.
- W2952023104 hasRelatedWork W2736601468 @default.
- W2952023104 hasRelatedWork W2781726626 @default.
- W2952023104 hasRelatedWork W2786036274 @default.
- W2952023104 hasRelatedWork W2905224739 @default.
- W2952023104 hasRelatedWork W2905342215 @default.
- W2952023104 hasRelatedWork W2962878825 @default.
- W2952023104 hasRelatedWork W2963423916 @default.
- W2952023104 hasRelatedWork W2963757175 @default.
- W2952023104 hasRelatedWork W2964043796 @default.
- W2952023104 hasRelatedWork W3103780890 @default.
- W2952023104 isParatext "false" @default.
- W2952023104 isRetracted "false" @default.
- W2952023104 magId "2952023104" @default.
- W2952023104 workType "article" @default.
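
Every row above has the same shape, `- subject predicate object @graph.`, mirroring the `?p ?o ?g` pattern in the query at the top. The following is a minimal sketch, assuming that row layout holds for the whole listing, of how the rows could be parsed and grouped by predicate. The names `ROW`, `parse_rows`, and `group_by_predicate` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Assumed row layout: "- <subject> <predicate> <object> @<graph>."
# The object is either a double-quoted literal or a bare identifier.
ROW = re.compile(r'^- (\S+) (\S+) (?:"(.*)"|(\S+)) @(\S+)\.$')


def parse_rows(lines):
    """Yield (subject, predicate, object, graph) for each line that matches ROW."""
    for line in lines:
        m = ROW.match(line.strip())
        if m is None:
            continue  # skip non-row lines such as the "Matches in ..." header
        subject, predicate, literal, bare, graph = m.groups()
        obj = literal if literal is not None else bare
        yield subject, predicate, obj, graph


def group_by_predicate(rows):
    """Collect object values under their predicate, preserving row order."""
    grouped = defaultdict(list)
    for _subject, predicate, obj, _graph in rows:
        grouped[predicate].append(obj)
    return dict(grouped)


# A few rows copied from the listing above.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W2952023104> ?p ?o ?g. }',
    '- W2952023104 title "A Comparative Analysis of Expected and Distributional Reinforcement Learning" @default.',
    '- W2952023104 cites W2032916024 @default.',
    '- W2952023104 cites W2106261932 @default.',
    '- W2952023104 citedByCount "19" @default.',
]

by_predicate = group_by_predicate(parse_rows(sample))
print(by_predicate["cites"])
print(by_predicate["title"][0])
```

Grouping by predicate keeps repeated properties such as `cites` and `hasRelatedWork` together as lists, while single-valued properties such as `title` end up as one-element lists.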