Matches in SemOpenAlex for { <https://semopenalex.org/work/W3172461854> ?p ?o ?g. }
Showing items 1 to 94 of 94 with 100 items per page.
- W3172461854 abstract "Natural policy gradient (NPG) methods with entropy regularization achieve impressive empirical success in reinforcement learning problems with large state-action spaces. However, their convergence properties and the impact of entropy regularization remain elusive in the function approximation regime. In this paper, we establish finite-time convergence analyses of entropy-regularized NPG with linear function approximation under softmax parameterization. In particular, we prove that entropy-regularized NPG with averaging satisfies the \emph{persistence of excitation} condition, and achieves a fast convergence rate of $\tilde{O}(1/T)$ up to a function approximation error in regularized Markov decision processes. This convergence result does not require any a priori assumptions on the policies. Furthermore, under mild regularity conditions on the concentrability coefficient and basis vectors, we prove that entropy-regularized NPG exhibits \emph{linear convergence} up to a function approximation error." @default.
- W3172461854 created "2021-06-22" @default.
- W3172461854 creator A5054923152 @default.
- W3172461854 creator A5071683073 @default.
- W3172461854 creator A5078518595 @default.
- W3172461854 date "2021-06-08" @default.
- W3172461854 modified "2023-09-23" @default.
- W3172461854 title "Linear Convergence of Entropy-Regularized Natural Policy Gradient with Linear Function Approximation" @default.
- W3172461854 cites W1575592356 @default.
- W3172461854 cites W1576452626 @default.
- W3172461854 cites W1771410628 @default.
- W3172461854 cites W1889629917 @default.
- W3172461854 cites W2073384958 @default.
- W3172461854 cites W2074680702 @default.
- W3172461854 cites W2104753538 @default.
- W3172461854 cites W2112269233 @default.
- W3172461854 cites W2119717200 @default.
- W3172461854 cites W2121863487 @default.
- W3172461854 cites W2122689259 @default.
- W3172461854 cites W2130801532 @default.
- W3172461854 cites W2136602922 @default.
- W3172461854 cites W2144902422 @default.
- W3172461854 cites W2155027007 @default.
- W3172461854 cites W2156737235 @default.
- W3172461854 cites W2172968643 @default.
- W3172461854 cites W2257979135 @default.
- W3172461854 cites W2397607997 @default.
- W3172461854 cites W2727576081 @default.
- W3172461854 cites W2948432982 @default.
- W3172461854 cites W2956068307 @default.
- W3172461854 cites W2962821147 @default.
- W3172461854 cites W2962902376 @default.
- W3172461854 cites W2963641140 @default.
- W3172461854 cites W2964043796 @default.
- W3172461854 cites W2964155733 @default.
- W3172461854 cites W2971587637 @default.
- W3172461854 cites W2998050631 @default.
- W3172461854 cites W3034426742 @default.
- W3172461854 cites W3041970508 @default.
- W3172461854 cites W3046626913 @default.
- W3172461854 cites W3127686539 @default.
- W3172461854 cites W3159422316 @default.
- W3172461854 cites W607505555 @default.
- W3172461854 doi "https://doi.org/10.48550/arxiv.2106.04096" @default.
- W3172461854 hasPublicationYear "2021" @default.
- W3172461854 type Work @default.
- W3172461854 sameAs 3172461854 @default.
- W3172461854 citedByCount "0" @default.
- W3172461854 crossrefType "posted-content" @default.
- W3172461854 hasAuthorship W3172461854A5054923152 @default.
- W3172461854 hasAuthorship W3172461854A5071683073 @default.
- W3172461854 hasAuthorship W3172461854A5078518595 @default.
- W3172461854 hasBestOaLocation W31724618541 @default.
- W3172461854 hasConcept C106301342 @default.
- W3172461854 hasConcept C121332964 @default.
- W3172461854 hasConcept C126255220 @default.
- W3172461854 hasConcept C127162648 @default.
- W3172461854 hasConcept C154945302 @default.
- W3172461854 hasConcept C2776135515 @default.
- W3172461854 hasConcept C28826006 @default.
- W3172461854 hasConcept C31258907 @default.
- W3172461854 hasConcept C33923547 @default.
- W3172461854 hasConcept C41008148 @default.
- W3172461854 hasConcept C57869625 @default.
- W3172461854 hasConcept C62520636 @default.
- W3172461854 hasConceptScore W3172461854C106301342 @default.
- W3172461854 hasConceptScore W3172461854C121332964 @default.
- W3172461854 hasConceptScore W3172461854C126255220 @default.
- W3172461854 hasConceptScore W3172461854C127162648 @default.
- W3172461854 hasConceptScore W3172461854C154945302 @default.
- W3172461854 hasConceptScore W3172461854C2776135515 @default.
- W3172461854 hasConceptScore W3172461854C28826006 @default.
- W3172461854 hasConceptScore W3172461854C31258907 @default.
- W3172461854 hasConceptScore W3172461854C33923547 @default.
- W3172461854 hasConceptScore W3172461854C41008148 @default.
- W3172461854 hasConceptScore W3172461854C57869625 @default.
- W3172461854 hasConceptScore W3172461854C62520636 @default.
- W3172461854 hasLocation W31724618541 @default.
- W3172461854 hasOpenAccess W3172461854 @default.
- W3172461854 hasPrimaryLocation W31724618541 @default.
- W3172461854 hasRelatedWork W13541812 @default.
- W3172461854 hasRelatedWork W1964781732 @default.
- W3172461854 hasRelatedWork W1974309151 @default.
- W3172461854 hasRelatedWork W2030166053 @default.
- W3172461854 hasRelatedWork W203722542 @default.
- W3172461854 hasRelatedWork W2058202007 @default.
- W3172461854 hasRelatedWork W2059247951 @default.
- W3172461854 hasRelatedWork W2953159110 @default.
- W3172461854 hasRelatedWork W3000352751 @default.
- W3172461854 hasRelatedWork W3126131685 @default.
- W3172461854 isParatext "false" @default.
- W3172461854 isRetracted "false" @default.
- W3172461854 magId "3172461854" @default.
- W3172461854 workType "article" @default.