Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226246290> ?p ?o ?g. }
Showing items 1 to 97 of 97 with 100 items per page.
- W4226246290 abstract "Markov Decision Processes are classically solved using Value Iteration and Policy Iteration algorithms. Recent interest in Reinforcement Learning has motivated the study of methods inspired by optimization, such as gradient ascent. Among these, a popular algorithm is the Natural Policy Gradient, which is a mirror descent variant for MDPs. This algorithm forms the basis of several popular RL algorithms such as Natural actor-critic, TRPO, PPO, etc, and so is being studied with growing interest. It has been shown that Natural Policy Gradient with constant step size converges with a sublinear rate of $\mathcal{O}(1/k)$ to the global optimal. In this paper, we present improved finite time convergence bounds, and show that this algorithm has geometric (also known as linear) asymptotic convergence rate. We further improve this convergence result by introducing a variant of Natural Policy Gradient with adaptive step sizes. Finally, we compare different variants of policy gradient methods experimentally." @default.
- W4226246290 created "2022-05-05" @default.
- W4226246290 creator A5018172757 @default.
- W4226246290 creator A5021806638 @default.
- W4226246290 creator A5045270609 @default.
- W4226246290 creator A5051894559 @default.
- W4226246290 date "2021-12-14" @default.
- W4226246290 modified "2023-10-01" @default.
- W4226246290 title "On the Linear Convergence of Natural Policy Gradient Algorithm" @default.
- W4226246290 cites W1971945429 @default.
- W4226246290 cites W1980516134 @default.
- W4226246290 cites W2074680702 @default.
- W4226246290 cites W2094387729 @default.
- W4226246290 cites W2115809996 @default.
- W4226246290 cites W2172968643 @default.
- W4226246290 cites W2763081248 @default.
- W4226246290 cites W2964123095 @default.
- W4226246290 cites W2998050631 @default.
- W4226246290 cites W3011080757 @default.
- W4226246290 cites W4226246290 @default.
- W4226246290 doi "https://doi.org/10.1109/cdc45484.2021.9682908" @default.
- W4226246290 hasPublicationYear "2021" @default.
- W4226246290 type Work @default.
- W4226246290 citedByCount "4" @default.
- W4226246290 countsByYear W42262462902021 @default.
- W4226246290 countsByYear W42262462902023 @default.
- W4226246290 crossrefType "proceedings-article" @default.
- W4226246290 hasAuthorship W4226246290A5018172757 @default.
- W4226246290 hasAuthorship W4226246290A5021806638 @default.
- W4226246290 hasAuthorship W4226246290A5045270609 @default.
- W4226246290 hasAuthorship W4226246290A5051894559 @default.
- W4226246290 hasBestOaLocation W42262462902 @default.
- W4226246290 hasConcept C105795698 @default.
- W4226246290 hasConcept C106189395 @default.
- W4226246290 hasConcept C11413529 @default.
- W4226246290 hasConcept C115680565 @default.
- W4226246290 hasConcept C117160843 @default.
- W4226246290 hasConcept C126255220 @default.
- W4226246290 hasConcept C127162648 @default.
- W4226246290 hasConcept C134306372 @default.
- W4226246290 hasConcept C153258448 @default.
- W4226246290 hasConcept C154945302 @default.
- W4226246290 hasConcept C159886148 @default.
- W4226246290 hasConcept C162324750 @default.
- W4226246290 hasConcept C199360897 @default.
- W4226246290 hasConcept C206688291 @default.
- W4226246290 hasConcept C2777027219 @default.
- W4226246290 hasConcept C2777303404 @default.
- W4226246290 hasConcept C28826006 @default.
- W4226246290 hasConcept C31258907 @default.
- W4226246290 hasConcept C33923547 @default.
- W4226246290 hasConcept C41008148 @default.
- W4226246290 hasConcept C50522688 @default.
- W4226246290 hasConcept C50644808 @default.
- W4226246290 hasConcept C57869625 @default.
- W4226246290 hasConcept C97541855 @default.
- W4226246290 hasConceptScore W4226246290C105795698 @default.
- W4226246290 hasConceptScore W4226246290C106189395 @default.
- W4226246290 hasConceptScore W4226246290C11413529 @default.
- W4226246290 hasConceptScore W4226246290C115680565 @default.
- W4226246290 hasConceptScore W4226246290C117160843 @default.
- W4226246290 hasConceptScore W4226246290C126255220 @default.
- W4226246290 hasConceptScore W4226246290C127162648 @default.
- W4226246290 hasConceptScore W4226246290C134306372 @default.
- W4226246290 hasConceptScore W4226246290C153258448 @default.
- W4226246290 hasConceptScore W4226246290C154945302 @default.
- W4226246290 hasConceptScore W4226246290C159886148 @default.
- W4226246290 hasConceptScore W4226246290C162324750 @default.
- W4226246290 hasConceptScore W4226246290C199360897 @default.
- W4226246290 hasConceptScore W4226246290C206688291 @default.
- W4226246290 hasConceptScore W4226246290C2777027219 @default.
- W4226246290 hasConceptScore W4226246290C2777303404 @default.
- W4226246290 hasConceptScore W4226246290C28826006 @default.
- W4226246290 hasConceptScore W4226246290C31258907 @default.
- W4226246290 hasConceptScore W4226246290C33923547 @default.
- W4226246290 hasConceptScore W4226246290C41008148 @default.
- W4226246290 hasConceptScore W4226246290C50522688 @default.
- W4226246290 hasConceptScore W4226246290C50644808 @default.
- W4226246290 hasConceptScore W4226246290C57869625 @default.
- W4226246290 hasConceptScore W4226246290C97541855 @default.
- W4226246290 hasLocation W42262462901 @default.
- W4226246290 hasLocation W42262462902 @default.
- W4226246290 hasOpenAccess W4226246290 @default.
- W4226246290 hasPrimaryLocation W42262462901 @default.
- W4226246290 hasRelatedWork W1626977535 @default.
- W4226246290 hasRelatedWork W2145363145 @default.
- W4226246290 hasRelatedWork W2349504429 @default.
- W4226246290 hasRelatedWork W2393217814 @default.
- W4226246290 hasRelatedWork W2766590049 @default.
- W4226246290 hasRelatedWork W3126996176 @default.
- W4226246290 hasRelatedWork W3159422316 @default.
- W4226246290 hasRelatedWork W4221157223 @default.
- W4226246290 hasRelatedWork W4226246290 @default.
- W4226246290 hasRelatedWork W4287185865 @default.
- W4226246290 isParatext "false" @default.
- W4226246290 isRetracted "false" @default.
- W4226246290 workType "article" @default.