Matches in SemOpenAlex for { <https://semopenalex.org/work/W3208296466> ?p ?o ?g. }
- W3208296466 endingPage "9073" @default.
- W3208296466 startingPage "9066" @default.
- W3208296466 abstract "Policy gradient methods have been frequently applied to problems in control and reinforcement learning with great success, yet existing convergence analysis still relies on non-intuitive, impractical and often opaque conditions. In particular, existing rates are achieved in limited settings, under strict regularity conditions. In this work, we establish explicit convergence rates of policy gradient methods, extending the convergence regime to weakly smooth policy classes with L2 integrable gradient. We provide intuitive examples to illustrate the insight behind these new conditions. Notably, our analysis also shows that convergence rates are achievable for both the standard policy gradient and the natural policy gradient algorithms under these assumptions. Lastly we provide performance guarantees for the converged policies." @default.
- W3208296466 created "2021-11-08" @default.
- W3208296466 creator A5022491830 @default.
- W3208296466 creator A5061193324 @default.
- W3208296466 creator A5061538996 @default.
- W3208296466 date "2022-06-28" @default.
- W3208296466 modified "2023-09-26" @default.
- W3208296466 title "Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings" @default.
- W3208296466 cites W107583932 @default.
- W3208296466 cites W1499021337 @default.
- W3208296466 cites W1970789124 @default.
- W3208296466 cites W1977655452 @default.
- W3208296466 cites W1980516134 @default.
- W3208296466 cites W1985240368 @default.
- W3208296466 cites W2070355878 @default.
- W3208296466 cites W2086161653 @default.
- W3208296466 cites W2109910161 @default.
- W3208296466 cites W2113501460 @default.
- W3208296466 cites W2119717200 @default.
- W3208296466 cites W2124768887 @default.
- W3208296466 cites W2130801532 @default.
- W3208296466 cites W2151299863 @default.
- W3208296466 cites W2155027007 @default.
- W3208296466 cites W2156737235 @default.
- W3208296466 cites W2210625349 @default.
- W3208296466 cites W2272094913 @default.
- W3208296466 cites W2344786740 @default.
- W3208296466 cites W2545578326 @default.
- W3208296466 cites W2695227890 @default.
- W3208296466 cites W2740828027 @default.
- W3208296466 cites W2783637925 @default.
- W3208296466 cites W2796196168 @default.
- W3208296466 cites W2798888671 @default.
- W3208296466 cites W2805861379 @default.
- W3208296466 cites W2944187456 @default.
- W3208296466 cites W2948677277 @default.
- W3208296466 cites W2962802215 @default.
- W3208296466 cites W2963867712 @default.
- W3208296466 cites W2964084913 @default.
- W3208296466 cites W2964282829 @default.
- W3208296466 cites W2969281871 @default.
- W3208296466 cites W2971587637 @default.
- W3208296466 cites W2977813751 @default.
- W3208296466 cites W2979895842 @default.
- W3208296466 cites W2981237928 @default.
- W3208296466 cites W2990830025 @default.
- W3208296466 cites W3018736630 @default.
- W3208296466 cites W3034426742 @default.
- W3208296466 cites W3034871777 @default.
- W3208296466 cites W3038006656 @default.
- W3208296466 cites W3038915804 @default.
- W3208296466 cites W3043114440 @default.
- W3208296466 cites W3046626913 @default.
- W3208296466 cites W3103047293 @default.
- W3208296466 cites W3117137507 @default.
- W3208296466 cites W3132054471 @default.
- W3208296466 cites W3160101512 @default.
- W3208296466 doi "https://doi.org/10.1609/aaai.v36i8.20891" @default.
- W3208296466 hasPublicationYear "2022" @default.
- W3208296466 type Work @default.
- W3208296466 sameAs 3208296466 @default.
- W3208296466 citedByCount "1" @default.
- W3208296466 countsByYear W32082964662023 @default.
- W3208296466 crossrefType "journal-article" @default.
- W3208296466 hasAuthorship W3208296466A5022491830 @default.
- W3208296466 hasAuthorship W3208296466A5061193324 @default.
- W3208296466 hasAuthorship W3208296466A5061538996 @default.
- W3208296466 hasBestOaLocation W32082964661 @default.
- W3208296466 hasConcept C115680565 @default.
- W3208296466 hasConcept C121332964 @default.
- W3208296466 hasConcept C126255220 @default.
- W3208296466 hasConcept C162324750 @default.
- W3208296466 hasConcept C18762648 @default.
- W3208296466 hasConcept C26517878 @default.
- W3208296466 hasConcept C2777303404 @default.
- W3208296466 hasConcept C28826006 @default.
- W3208296466 hasConcept C33923547 @default.
- W3208296466 hasConcept C38652104 @default.
- W3208296466 hasConcept C41008148 @default.
- W3208296466 hasConcept C50522688 @default.
- W3208296466 hasConcept C57869625 @default.
- W3208296466 hasConcept C97355855 @default.
- W3208296466 hasConceptScore W3208296466C115680565 @default.
- W3208296466 hasConceptScore W3208296466C121332964 @default.
- W3208296466 hasConceptScore W3208296466C126255220 @default.
- W3208296466 hasConceptScore W3208296466C162324750 @default.
- W3208296466 hasConceptScore W3208296466C18762648 @default.
- W3208296466 hasConceptScore W3208296466C26517878 @default.
- W3208296466 hasConceptScore W3208296466C2777303404 @default.
- W3208296466 hasConceptScore W3208296466C28826006 @default.
- W3208296466 hasConceptScore W3208296466C33923547 @default.
- W3208296466 hasConceptScore W3208296466C38652104 @default.
- W3208296466 hasConceptScore W3208296466C41008148 @default.
- W3208296466 hasConceptScore W3208296466C50522688 @default.
- W3208296466 hasConceptScore W3208296466C57869625 @default.
- W3208296466 hasConceptScore W3208296466C97355855 @default.
- W3208296466 hasIssue "8" @default.
- W3208296466 hasLocation W32082964661 @default.