Matches in SemOpenAlex for { <https://semopenalex.org/work/W4221157223> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4221157223 abstract "We consider infinite-horizon discounted Markov decision problems with finite state and action spaces and study the convergence rates of the projected policy gradient method and a general class of policy mirror descent methods, all with direct parametrization in the policy space. First, we develop a theory of weak gradient-mapping dominance and use it to prove sharper sublinear convergence rate of the projected policy gradient method. Then we show that with geometrically increasing step sizes, a general class of policy mirror descent methods, including the natural policy gradient method and a projected Q-descent method, all enjoy a linear rate of convergence without relying on entropy or other strongly convex regularization. Finally, we also analyze the convergence rate of an inexact policy mirror descent method and estimate its sample complexity under a simple generative model." @default.
- W4221157223 created "2022-04-03" @default.
- W4221157223 creator A5083014172 @default.
- W4221157223 date "2022-01-19" @default.
- W4221157223 modified "2023-09-29" @default.
- W4221157223 title "On the Convergence Rates of Policy Gradient Methods" @default.
- W4221157223 doi "https://doi.org/10.48550/arxiv.2201.07443" @default.
- W4221157223 hasPublicationYear "2022" @default.
- W4221157223 type Work @default.
- W4221157223 citedByCount "0" @default.
- W4221157223 crossrefType "posted-content" @default.
- W4221157223 hasAuthorship W4221157223A5083014172 @default.
- W4221157223 hasBestOaLocation W42211572231 @default.
- W4221157223 hasConcept C115680565 @default.
- W4221157223 hasConcept C117160843 @default.
- W4221157223 hasConcept C119857082 @default.
- W4221157223 hasConcept C126255220 @default.
- W4221157223 hasConcept C127162648 @default.
- W4221157223 hasConcept C134306372 @default.
- W4221157223 hasConcept C153258448 @default.
- W4221157223 hasConcept C154945302 @default.
- W4221157223 hasConcept C162324750 @default.
- W4221157223 hasConcept C206688291 @default.
- W4221157223 hasConcept C2776135515 @default.
- W4221157223 hasConcept C2777303404 @default.
- W4221157223 hasConcept C28826006 @default.
- W4221157223 hasConcept C31258907 @default.
- W4221157223 hasConcept C33923547 @default.
- W4221157223 hasConcept C41008148 @default.
- W4221157223 hasConcept C50522688 @default.
- W4221157223 hasConcept C50644808 @default.
- W4221157223 hasConcept C57869625 @default.
- W4221157223 hasConceptScore W4221157223C115680565 @default.
- W4221157223 hasConceptScore W4221157223C117160843 @default.
- W4221157223 hasConceptScore W4221157223C119857082 @default.
- W4221157223 hasConceptScore W4221157223C126255220 @default.
- W4221157223 hasConceptScore W4221157223C127162648 @default.
- W4221157223 hasConceptScore W4221157223C134306372 @default.
- W4221157223 hasConceptScore W4221157223C153258448 @default.
- W4221157223 hasConceptScore W4221157223C154945302 @default.
- W4221157223 hasConceptScore W4221157223C162324750 @default.
- W4221157223 hasConceptScore W4221157223C206688291 @default.
- W4221157223 hasConceptScore W4221157223C2776135515 @default.
- W4221157223 hasConceptScore W4221157223C2777303404 @default.
- W4221157223 hasConceptScore W4221157223C28826006 @default.
- W4221157223 hasConceptScore W4221157223C31258907 @default.
- W4221157223 hasConceptScore W4221157223C33923547 @default.
- W4221157223 hasConceptScore W4221157223C41008148 @default.
- W4221157223 hasConceptScore W4221157223C50522688 @default.
- W4221157223 hasConceptScore W4221157223C50644808 @default.
- W4221157223 hasConceptScore W4221157223C57869625 @default.
- W4221157223 hasLocation W42211572231 @default.
- W4221157223 hasOpenAccess W4221157223 @default.
- W4221157223 hasPrimaryLocation W42211572231 @default.
- W4221157223 hasRelatedWork W2115266937 @default.
- W4221157223 hasRelatedWork W2393217814 @default.
- W4221157223 hasRelatedWork W2946319938 @default.
- W4221157223 hasRelatedWork W2970671394 @default.
- W4221157223 hasRelatedWork W2971336218 @default.
- W4221157223 hasRelatedWork W3011663284 @default.
- W4221157223 hasRelatedWork W3183827394 @default.
- W4221157223 hasRelatedWork W3212731412 @default.
- W4221157223 hasRelatedWork W4221157223 @default.
- W4221157223 hasRelatedWork W4226246290 @default.
- W4221157223 isParatext "false" @default.
- W4221157223 isRetracted "false" @default.
- W4221157223 workType "article" @default.