Matches in SemOpenAlex for { <https://semopenalex.org/work/W4302306147> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4302306147 abstract "We consider infinite-horizon discounted Markov decision processes and study the convergence rates of the natural policy gradient (NPG) and the Q-NPG methods with the log-linear policy class. Using the compatible function approximation framework, both methods with log-linear policies can be written as inexact versions of the policy mirror descent (PMD) method. We show that both methods attain linear convergence rates and $\tilde{\mathcal{O}}(1/\epsilon^2)$ sample complexities using a simple, non-adaptive geometrically increasing step size, without resorting to entropy or other strongly convex regularization. Lastly, as a byproduct, we obtain sublinear convergence rates for both methods with arbitrary constant step size." @default.
- W4302306147 created "2022-10-06" @default.
- W4302306147 creator A5008334471 @default.
- W4302306147 creator A5014791481 @default.
- W4302306147 creator A5033061754 @default.
- W4302306147 creator A5034558645 @default.
- W4302306147 creator A5083014172 @default.
- W4302306147 date "2022-10-04" @default.
- W4302306147 modified "2023-09-30" @default.
- W4302306147 title "Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies" @default.
- W4302306147 doi "https://doi.org/10.48550/arxiv.2210.01400" @default.
- W4302306147 hasPublicationYear "2022" @default.
- W4302306147 type Work @default.
- W4302306147 citedByCount "0" @default.
- W4302306147 crossrefType "posted-content" @default.
- W4302306147 hasAuthorship W4302306147A5008334471 @default.
- W4302306147 hasAuthorship W4302306147A5014791481 @default.
- W4302306147 hasAuthorship W4302306147A5033061754 @default.
- W4302306147 hasAuthorship W4302306147A5034558645 @default.
- W4302306147 hasAuthorship W4302306147A5083014172 @default.
- W4302306147 hasBestOaLocation W43023061471 @default.
- W4302306147 hasConcept C112680207 @default.
- W4302306147 hasConcept C114614502 @default.
- W4302306147 hasConcept C117160843 @default.
- W4302306147 hasConcept C126255220 @default.
- W4302306147 hasConcept C127162648 @default.
- W4302306147 hasConcept C145446738 @default.
- W4302306147 hasConcept C154945302 @default.
- W4302306147 hasConcept C162324750 @default.
- W4302306147 hasConcept C2524010 @default.
- W4302306147 hasConcept C2776135515 @default.
- W4302306147 hasConcept C2777303404 @default.
- W4302306147 hasConcept C28826006 @default.
- W4302306147 hasConcept C31258907 @default.
- W4302306147 hasConcept C33923547 @default.
- W4302306147 hasConcept C41008148 @default.
- W4302306147 hasConcept C50522688 @default.
- W4302306147 hasConcept C57869625 @default.
- W4302306147 hasConceptScore W4302306147C112680207 @default.
- W4302306147 hasConceptScore W4302306147C114614502 @default.
- W4302306147 hasConceptScore W4302306147C117160843 @default.
- W4302306147 hasConceptScore W4302306147C126255220 @default.
- W4302306147 hasConceptScore W4302306147C127162648 @default.
- W4302306147 hasConceptScore W4302306147C145446738 @default.
- W4302306147 hasConceptScore W4302306147C154945302 @default.
- W4302306147 hasConceptScore W4302306147C162324750 @default.
- W4302306147 hasConceptScore W4302306147C2524010 @default.
- W4302306147 hasConceptScore W4302306147C2776135515 @default.
- W4302306147 hasConceptScore W4302306147C2777303404 @default.
- W4302306147 hasConceptScore W4302306147C28826006 @default.
- W4302306147 hasConceptScore W4302306147C31258907 @default.
- W4302306147 hasConceptScore W4302306147C33923547 @default.
- W4302306147 hasConceptScore W4302306147C41008148 @default.
- W4302306147 hasConceptScore W4302306147C50522688 @default.
- W4302306147 hasConceptScore W4302306147C57869625 @default.
- W4302306147 hasLocation W43023061471 @default.
- W4302306147 hasOpenAccess W4302306147 @default.
- W4302306147 hasPrimaryLocation W43023061471 @default.
- W4302306147 hasRelatedWork W184886946 @default.
- W4302306147 hasRelatedWork W2089251650 @default.
- W4302306147 hasRelatedWork W2115266937 @default.
- W4302306147 hasRelatedWork W2238261533 @default.
- W4302306147 hasRelatedWork W2541235961 @default.
- W4302306147 hasRelatedWork W2807215287 @default.
- W4302306147 hasRelatedWork W2942896838 @default.
- W4302306147 hasRelatedWork W2970671394 @default.
- W4302306147 hasRelatedWork W4287897530 @default.
- W4302306147 hasRelatedWork W4291238459 @default.
- W4302306147 isParatext "false" @default.
- W4302306147 isRetracted "false" @default.
- W4302306147 workType "article" @default.