Matches in SemOpenAlex for { <https://semopenalex.org/work/W2912496897> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W2912496897 abstract "Multi-step methods such as Retrace($\lambda$) and $n$-step $Q$-learning have become a crucial component of modern deep reinforcement learning agents. These methods are often evaluated as a part of bigger architectures and their evaluations rarely include enough samples to draw statistically significant conclusions about their performance. This type of methodology makes it difficult to understand how particular algorithmic details of multi-step methods influence learning. In this paper we combine the $n$-step action-value algorithms Retrace, $Q$-learning, Tree Backup, Sarsa, and $Q(\sigma)$ with an architecture analogous to DQN. We test the performance of all these algorithms in the mountain car environment; this choice of environment allows for faster training times and larger sample sizes. We present statistical analyses on the effects of the off-policy correction, the backup length parameter $n$, and the update frequency of the target network on the performance of these algorithms. Our results show that (1) using off-policy correction can have an adverse effect on the performance of Sarsa and $Q(\sigma)$; (2) increasing the backup length $n$ consistently improved performance across all the different algorithms; and (3) the performance of Sarsa and $Q$-learning was more robust to the effect of the target network update frequency than the performance of Tree Backup, $Q(\sigma)$, and Retrace in this particular task." @default.
- W2912496897 created "2019-02-21" @default.
- W2912496897 creator A5004923102 @default.
- W2912496897 creator A5076176197 @default.
- W2912496897 date "2019-01-22" @default.
- W2912496897 modified "2023-09-27" @default.
- W2912496897 title "Understanding Multi-Step Deep Reinforcement Learning: A Systematic Study of the DQN Target." @default.
- W2912496897 cites W1583330603 @default.
- W2912496897 cites W2100677568 @default.
- W2912496897 cites W2145339207 @default.
- W2912496897 cites W2761873684 @default.
- W2912496897 hasPublicationYear "2019" @default.
- W2912496897 type Work @default.
- W2912496897 sameAs 2912496897 @default.
- W2912496897 citedByCount "6" @default.
- W2912496897 countsByYear W29124968972019 @default.
- W2912496897 countsByYear W29124968972020 @default.
- W2912496897 crossrefType "posted-content" @default.
- W2912496897 hasAuthorship W2912496897A5004923102 @default.
- W2912496897 hasAuthorship W2912496897A5076176197 @default.
- W2912496897 hasConcept C113174947 @default.
- W2912496897 hasConcept C11413529 @default.
- W2912496897 hasConcept C119857082 @default.
- W2912496897 hasConcept C121332964 @default.
- W2912496897 hasConcept C127413603 @default.
- W2912496897 hasConcept C134306372 @default.
- W2912496897 hasConcept C154945302 @default.
- W2912496897 hasConcept C188116033 @default.
- W2912496897 hasConcept C201995342 @default.
- W2912496897 hasConcept C2778049214 @default.
- W2912496897 hasConcept C2780451532 @default.
- W2912496897 hasConcept C2780945871 @default.
- W2912496897 hasConcept C33923547 @default.
- W2912496897 hasConcept C41008148 @default.
- W2912496897 hasConcept C62520636 @default.
- W2912496897 hasConcept C77088390 @default.
- W2912496897 hasConcept C97541855 @default.
- W2912496897 hasConceptScore W2912496897C113174947 @default.
- W2912496897 hasConceptScore W2912496897C11413529 @default.
- W2912496897 hasConceptScore W2912496897C119857082 @default.
- W2912496897 hasConceptScore W2912496897C121332964 @default.
- W2912496897 hasConceptScore W2912496897C127413603 @default.
- W2912496897 hasConceptScore W2912496897C134306372 @default.
- W2912496897 hasConceptScore W2912496897C154945302 @default.
- W2912496897 hasConceptScore W2912496897C188116033 @default.
- W2912496897 hasConceptScore W2912496897C201995342 @default.
- W2912496897 hasConceptScore W2912496897C2778049214 @default.
- W2912496897 hasConceptScore W2912496897C2780451532 @default.
- W2912496897 hasConceptScore W2912496897C2780945871 @default.
- W2912496897 hasConceptScore W2912496897C33923547 @default.
- W2912496897 hasConceptScore W2912496897C41008148 @default.
- W2912496897 hasConceptScore W2912496897C62520636 @default.
- W2912496897 hasConceptScore W2912496897C77088390 @default.
- W2912496897 hasConceptScore W2912496897C97541855 @default.
- W2912496897 hasLocation W29124968971 @default.
- W2912496897 hasOpenAccess W2912496897 @default.
- W2912496897 hasPrimaryLocation W29124968971 @default.
- W2912496897 hasRelatedWork W1514587017 @default.
- W2912496897 hasRelatedWork W2121863487 @default.
- W2912496897 hasRelatedWork W2145339207 @default.
- W2912496897 hasRelatedWork W2155968351 @default.
- W2912496897 hasRelatedWork W2173248099 @default.
- W2912496897 hasRelatedWork W2375993676 @default.
- W2912496897 hasRelatedWork W2549225575 @default.
- W2912496897 hasRelatedWork W2736601468 @default.
- W2912496897 hasRelatedWork W2890169813 @default.
- W2912496897 hasRelatedWork W2896489347 @default.
- W2912496897 hasRelatedWork W2955790965 @default.
- W2912496897 hasRelatedWork W2963828709 @default.
- W2912496897 hasRelatedWork W2964043796 @default.
- W2912496897 hasRelatedWork W2978041660 @default.
- W2912496897 hasRelatedWork W3042365678 @default.
- W2912496897 hasRelatedWork W3116030074 @default.
- W2912496897 hasRelatedWork W3126245445 @default.
- W2912496897 hasRelatedWork W3132647335 @default.
- W2912496897 hasRelatedWork W3160355689 @default.
- W2912496897 hasRelatedWork W51508254 @default.
- W2912496897 isParatext "false" @default.
- W2912496897 isRetracted "false" @default.
- W2912496897 magId "2912496897" @default.
- W2912496897 workType "article" @default.