Matches in SemOpenAlex for { <https://semopenalex.org/work/W2152445738> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W2152445738 abstract "This paper describes two novel on-policy reinforcement learning algorithms, named QV(λ)-learning and the actor critic learning automaton (ACLA). Both algorithms learn a state value-function using TD(λ)-methods. The difference between the algorithms is that QV-learning uses the learned value function and a form of Q-learning to learn Q-values, whereas ACLA uses the value function and a learning automaton-like update rule to update the actor. We describe several possible advantages of these methods compared to other value-function-based reinforcement learning algorithms such as Q-learning, Sarsa, and conventional actor-critic methods. Experiments are performed on (1) small, (2) large, (3) partially observable, and (4) dynamic maze problems with tabular and neural network value-function representations, and on the mountain car problem. The overall results show that the two novel algorithms can outperform previously known reinforcement learning algorithms" @default.
- W2152445738 created "2016-06-24" @default.
- W2152445738 creator A5033135596 @default.
- W2152445738 creator A5060596453 @default.
- W2152445738 date "2007-04-01" @default.
- W2152445738 modified "2023-09-24" @default.
- W2152445738 title "Two Novel On-policy Reinforcement Learning Algorithms based on TD(λ)-methods" @default.
- W2152445738 cites W2064018461 @default.
- W2152445738 cites W2082261506 @default.
- W2152445738 cites W2091565802 @default.
- W2152445738 cites W2107726111 @default.
- W2152445738 cites W2113913482 @default.
- W2152445738 cites W2150339816 @default.
- W2152445738 cites W3041202696 @default.
- W2152445738 cites W32403112 @default.
- W2152445738 doi "https://doi.org/10.1109/adprl.2007.368200" @default.
- W2152445738 hasPublicationYear "2007" @default.
- W2152445738 type Work @default.
- W2152445738 sameAs 2152445738 @default.
- W2152445738 citedByCount "18" @default.
- W2152445738 countsByYear W21524457382012 @default.
- W2152445738 countsByYear W21524457382013 @default.
- W2152445738 countsByYear W21524457382014 @default.
- W2152445738 countsByYear W21524457382017 @default.
- W2152445738 countsByYear W21524457382018 @default.
- W2152445738 countsByYear W21524457382019 @default.
- W2152445738 countsByYear W21524457382020 @default.
- W2152445738 countsByYear W21524457382022 @default.
- W2152445738 crossrefType "proceedings-article" @default.
- W2152445738 hasAuthorship W2152445738A5033135596 @default.
- W2152445738 hasAuthorship W2152445738A5060596453 @default.
- W2152445738 hasConcept C112505250 @default.
- W2152445738 hasConcept C11413529 @default.
- W2152445738 hasConcept C119857082 @default.
- W2152445738 hasConcept C126255220 @default.
- W2152445738 hasConcept C14036430 @default.
- W2152445738 hasConcept C14646407 @default.
- W2152445738 hasConcept C154945302 @default.
- W2152445738 hasConcept C188116033 @default.
- W2152445738 hasConcept C196340769 @default.
- W2152445738 hasConcept C199190896 @default.
- W2152445738 hasConcept C2776291640 @default.
- W2152445738 hasConcept C2776807809 @default.
- W2152445738 hasConcept C33923547 @default.
- W2152445738 hasConcept C41008148 @default.
- W2152445738 hasConcept C50644808 @default.
- W2152445738 hasConcept C78458016 @default.
- W2152445738 hasConcept C86803240 @default.
- W2152445738 hasConcept C91873725 @default.
- W2152445738 hasConcept C97541855 @default.
- W2152445738 hasConceptScore W2152445738C112505250 @default.
- W2152445738 hasConceptScore W2152445738C11413529 @default.
- W2152445738 hasConceptScore W2152445738C119857082 @default.
- W2152445738 hasConceptScore W2152445738C126255220 @default.
- W2152445738 hasConceptScore W2152445738C14036430 @default.
- W2152445738 hasConceptScore W2152445738C14646407 @default.
- W2152445738 hasConceptScore W2152445738C154945302 @default.
- W2152445738 hasConceptScore W2152445738C188116033 @default.
- W2152445738 hasConceptScore W2152445738C196340769 @default.
- W2152445738 hasConceptScore W2152445738C199190896 @default.
- W2152445738 hasConceptScore W2152445738C2776291640 @default.
- W2152445738 hasConceptScore W2152445738C2776807809 @default.
- W2152445738 hasConceptScore W2152445738C33923547 @default.
- W2152445738 hasConceptScore W2152445738C41008148 @default.
- W2152445738 hasConceptScore W2152445738C50644808 @default.
- W2152445738 hasConceptScore W2152445738C78458016 @default.
- W2152445738 hasConceptScore W2152445738C86803240 @default.
- W2152445738 hasConceptScore W2152445738C91873725 @default.
- W2152445738 hasConceptScore W2152445738C97541855 @default.
- W2152445738 hasLocation W21524457381 @default.
- W2152445738 hasOpenAccess W2152445738 @default.
- W2152445738 hasPrimaryLocation W21524457381 @default.
- W2152445738 hasRelatedWork W1537974983 @default.
- W2152445738 hasRelatedWork W2025663273 @default.
- W2152445738 hasRelatedWork W2089415692 @default.
- W2152445738 hasRelatedWork W2119031567 @default.
- W2152445738 hasRelatedWork W2129532790 @default.
- W2152445738 hasRelatedWork W2152445738 @default.
- W2152445738 hasRelatedWork W2208012000 @default.
- W2152445738 hasRelatedWork W2984109677 @default.
- W2152445738 hasRelatedWork W4285484150 @default.
- W2152445738 hasRelatedWork W4308702637 @default.
- W2152445738 isParatext "false" @default.
- W2152445738 isRetracted "false" @default.
- W2152445738 magId "2152445738" @default.
- W2152445738 workType "article" @default.
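
The abstract in this record describes QV-learning as learning a state value function with TD(λ)-methods and then using that value function, together with a form of Q-learning, to learn Q-values. The following is a minimal tabular sketch of that idea, assuming λ = 0 (no eligibility traces). The function name `qv_learning_step` and the parameters `alpha`, `beta`, and `gamma` are hypothetical and do not appear in the record; this is an illustration, not the authors' implementation.

```python
from collections import defaultdict


def qv_learning_step(V, Q, s, a, r, s_next, done,
                     alpha=0.1, beta=0.1, gamma=0.95):
    """Apply one tabular QV-learning update (assumed lambda = 0).

    V maps state -> value, Q maps (state, action) -> value.
    alpha and beta are assumed learning rates for Q and V respectively.
    """
    # Both updates share one target that bootstraps from the STATE value
    # of the successor, rather than from max_a Q(s', a) as in Q-learning.
    target = r + (0.0 if done else gamma * V[s_next])

    # Q-values move toward the V-based target.
    Q[(s, a)] += alpha * (target - Q[(s, a)])

    # The state value function gets an ordinary TD(0) update.
    V[s] += beta * (target - V[s])
    return target


# Usage: two transitions on a toy problem with made-up states 0 and 1.
V = defaultdict(float)
Q = defaultdict(float)
first_target = qv_learning_step(V, Q, s=0, a=1, r=1.0, s_next=1, done=True)
V[1] = 1.0  # pretend state 1 has already been evaluated
second_target = qv_learning_step(V, Q, s=0, a=1, r=0.0, s_next=1, done=False)
```

Computing the target once, before either table is modified, keeps the two updates consistent even when a transition loops back to the same state.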