Matches in SemOpenAlex for { <https://semopenalex.org/work/W2010162645> ?p ?o ?g. }
- W2010162645 abstract "Traditional reinforcement learning algorithms, such as Q-learning, Q(λ), Sarsa, and Sarsa(λ), update the action value function using temporal difference (TD) error, which is computed by the last action value function. From the perspective of the TD error, and with respect to the problems of low efficiency and slow convergence of the traditional Sarsa(λ) algorithm, this paper defines the nth order TD Error, applies it in the traditional Sarsa(λ) algorithm, and develops a fast Sarsa(λ) algorithm based on the 2nd order TD Error. The algorithm adjusts the Q value with the second-order TD Error and broadcasts the TD Error into the whole state-action space, which speeds up the convergence of the algorithm. This paper also analyzes the convergence rate, and under the condition of one-step update, the results show that the number of iterations depends primarily on γ, e. Finally, using the proposed algorithm on the traditional reinforcement learning problems, the results show that the algorithm has both a faster convergence rate and better convergence performance." @default.
- W2010162645 created "2016-06-24" @default.
- W2010162645 creator A5008519017 @default.
- W2010162645 creator A5033044925 @default.
- W2010162645 creator A5077588162 @default.
- W2010162645 creator A5087429872 @default.
- W2010162645 date "2013-04-01" @default.
- W2010162645 modified "2023-09-25" @default.
- W2010162645 title "The second order temporal difference error for Sarsa(λ)" @default.
- W2010162645 cites W2100677568 @default.
- W2010162645 cites W2100752967 @default.
- W2010162645 cites W2115211925 @default.
- W2010162645 cites W2118556122 @default.
- W2010162645 cites W2130005627 @default.
- W2010162645 cites W2140924946 @default.
- W2010162645 cites W2150339816 @default.
- W2010162645 cites W2156974606 @default.
- W2010162645 cites W2161009228 @default.
- W2010162645 cites W2354435290 @default.
- W2010162645 cites W2610686804 @default.
- W2010162645 cites W2911283634 @default.
- W2010162645 doi "https://doi.org/10.1109/adprl.2013.6614990" @default.
- W2010162645 hasPublicationYear "2013" @default.
- W2010162645 type Work @default.
- W2010162645 sameAs 2010162645 @default.
- W2010162645 citedByCount "0" @default.
- W2010162645 crossrefType "proceedings-article" @default.
- W2010162645 hasAuthorship W2010162645A5008519017 @default.
- W2010162645 hasAuthorship W2010162645A5033044925 @default.
- W2010162645 hasAuthorship W2010162645A5077588162 @default.
- W2010162645 hasAuthorship W2010162645A5087429872 @default.
- W2010162645 hasConcept C11413529 @default.
- W2010162645 hasConcept C119857082 @default.
- W2010162645 hasConcept C122383733 @default.
- W2010162645 hasConcept C126255220 @default.
- W2010162645 hasConcept C14036430 @default.
- W2010162645 hasConcept C14646407 @default.
- W2010162645 hasConcept C154945302 @default.
- W2010162645 hasConcept C162324750 @default.
- W2010162645 hasConcept C196340769 @default.
- W2010162645 hasConcept C26517878 @default.
- W2010162645 hasConcept C2776291640 @default.
- W2010162645 hasConcept C2777303404 @default.
- W2010162645 hasConcept C33923547 @default.
- W2010162645 hasConcept C38652104 @default.
- W2010162645 hasConcept C41008148 @default.
- W2010162645 hasConcept C50522688 @default.
- W2010162645 hasConcept C50644808 @default.
- W2010162645 hasConcept C57869625 @default.
- W2010162645 hasConcept C78458016 @default.
- W2010162645 hasConcept C86803240 @default.
- W2010162645 hasConcept C91873725 @default.
- W2010162645 hasConcept C97541855 @default.
- W2010162645 hasConceptScore W2010162645C11413529 @default.
- W2010162645 hasConceptScore W2010162645C119857082 @default.
- W2010162645 hasConceptScore W2010162645C122383733 @default.
- W2010162645 hasConceptScore W2010162645C126255220 @default.
- W2010162645 hasConceptScore W2010162645C14036430 @default.
- W2010162645 hasConceptScore W2010162645C14646407 @default.
- W2010162645 hasConceptScore W2010162645C154945302 @default.
- W2010162645 hasConceptScore W2010162645C162324750 @default.
- W2010162645 hasConceptScore W2010162645C196340769 @default.
- W2010162645 hasConceptScore W2010162645C26517878 @default.
- W2010162645 hasConceptScore W2010162645C2776291640 @default.
- W2010162645 hasConceptScore W2010162645C2777303404 @default.
- W2010162645 hasConceptScore W2010162645C33923547 @default.
- W2010162645 hasConceptScore W2010162645C38652104 @default.
- W2010162645 hasConceptScore W2010162645C41008148 @default.
- W2010162645 hasConceptScore W2010162645C50522688 @default.
- W2010162645 hasConceptScore W2010162645C50644808 @default.
- W2010162645 hasConceptScore W2010162645C57869625 @default.
- W2010162645 hasConceptScore W2010162645C78458016 @default.
- W2010162645 hasConceptScore W2010162645C86803240 @default.
- W2010162645 hasConceptScore W2010162645C91873725 @default.
- W2010162645 hasConceptScore W2010162645C97541855 @default.
- W2010162645 hasLocation W20101626451 @default.
- W2010162645 hasOpenAccess W2010162645 @default.
- W2010162645 hasPrimaryLocation W20101626451 @default.
- W2010162645 hasRelatedWork W1521228173 @default.
- W2010162645 hasRelatedWork W1561685851 @default.
- W2010162645 hasRelatedWork W1603605173 @default.
- W2010162645 hasRelatedWork W1832110895 @default.
- W2010162645 hasRelatedWork W1984615387 @default.
- W2010162645 hasRelatedWork W2041176007 @default.
- W2010162645 hasRelatedWork W2100752967 @default.
- W2010162645 hasRelatedWork W2152445738 @default.
- W2010162645 hasRelatedWork W225045806 @default.
- W2010162645 hasRelatedWork W2282062099 @default.
- W2010162645 hasRelatedWork W2354212205 @default.
- W2010162645 hasRelatedWork W2362522782 @default.
- W2010162645 hasRelatedWork W2375993676 @default.
- W2010162645 hasRelatedWork W2382462702 @default.
- W2010162645 hasRelatedWork W2955790965 @default.
- W2010162645 hasRelatedWork W2978071539 @default.
- W2010162645 hasRelatedWork W3126245445 @default.
- W2010162645 hasRelatedWork W315747311 @default.
- W2010162645 hasRelatedWork W3171931554 @default.
- W2010162645 hasRelatedWork W3203001245 @default.
- W2010162645 isParatext "false" @default.
- W2010162645 isRetracted "false" @default.
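
The abstract of W2010162645 refers to the TD error used by traditional Sarsa(λ) and to broadcasting that error over the whole state-action space. As a rough illustration only, the sketch below shows one update step of the standard tabular Sarsa(λ) with accumulating eligibility traces. It is not the paper's second-order TD error method, whose definition is not given in this record. The function name `sarsa_lambda_step`, the nested-list tables `Q` and `E`, and the default values of `alpha`, `gamma`, and `lam` are assumptions made for the example.

```python
def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next,
                      alpha=0.1, gamma=0.9, lam=0.8, done=False):
    """One step of standard tabular Sarsa(lambda) (hypothetical helper).

    Q and E are nested lists indexed as [state][action]; both are
    modified in place. Returns the first-order TD error.
    """
    # TD target uses the current estimate of the next state-action value.
    target = r if done else r + gamma * Q[s_next][a_next]
    # First-order TD error for the visited pair (s, a).
    delta = target - Q[s][a]
    # Accumulating trace: mark (s, a) as just visited.
    E[s][a] += 1.0
    # Broadcast the TD error to every state-action pair, weighted by its
    # trace, then decay all traces by gamma * lambda.
    for i in range(len(Q)):
        for j in range(len(Q[i])):
            Q[i][j] += alpha * delta * E[i][j]
            E[i][j] *= gamma * lam
    return delta


# Usage: two states, two actions, all values initially zero.
Q = [[0.0, 0.0], [0.0, 0.0]]
E = [[0.0, 0.0], [0.0, 0.0]]
d1 = sarsa_lambda_step(Q, E, s=0, a=0, r=1.0, s_next=1, a_next=1)
d2 = sarsa_lambda_step(Q, E, s=1, a=1, r=0.0, s_next=0, a_next=0)
```

In the second call the pair (0, 0) is not the one being visited, yet its Q value still changes because its trace is non-zero. That is the sense in which the error is spread across the state-action space in the traditional algorithm.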