Matches in SemOpenAlex for { <https://semopenalex.org/work/W2958746411> ?p ?o ?g. }
- W2958746411 abstract "Despite the empirical success of the actor-critic algorithm, its theoretical understanding lags behind. In a broader context, actor-critic can be viewed as an online alternating update algorithm for bilevel optimization, whose convergence is known to be fragile. To understand the instability of actor-critic, we focus on its application to linear quadratic regulators, a simple yet fundamental setting of reinforcement learning. We establish a nonasymptotic convergence analysis of actor-critic in this setting. In particular, we prove that actor-critic finds a globally optimal pair of actor (policy) and critic (action-value function) at a linear rate of convergence. Our analysis may serve as a preliminary step towards a complete theoretical understanding of bilevel optimization with nonconvex subproblems, which is NP-hard in the worst case and is often solved using heuristics." @default.
- W2958746411 created "2019-07-23" @default.
- W2958746411 creator A5016900183 @default.
- W2958746411 creator A5048272675 @default.
- W2958746411 creator A5066940107 @default.
- W2958746411 creator A5078210646 @default.
- W2958746411 date "2019-07-14" @default.
- W2958746411 modified "2023-09-27" @default.
- W2958746411 title "On the Global Convergence of Actor-Critic: A Case for Linear Quadratic Regulator with Ergodic Cost" @default.
- W2958746411 cites W1499021337 @default.
- W2958746411 cites W1522531528 @default.
- W2958746411 cites W1574514837 @default.
- W2958746411 cites W1578630563 @default.
- W2958746411 cites W1588628884 @default.
- W2958746411 cites W1616818660 @default.
- W2958746411 cites W1676470423 @default.
- W2958746411 cites W1971800148 @default.
- W2958746411 cites W1992208280 @default.
- W2958746411 cites W2012110988 @default.
- W2958746411 cites W2032291279 @default.
- W2958746411 cites W2044287460 @default.
- W2958746411 cites W2046376809 @default.
- W2958746411 cites W2047364871 @default.
- W2958746411 cites W2054640142 @default.
- W2958746411 cites W2075268401 @default.
- W2958746411 cites W2094364653 @default.
- W2958746411 cites W2094387729 @default.
- W2958746411 cites W2098432798 @default.
- W2958746411 cites W2099471712 @default.
- W2958746411 cites W2100677568 @default.
- W2958746411 cites W2115558605 @default.
- W2958746411 cites W2119717200 @default.
- W2958746411 cites W2121703796 @default.
- W2958746411 cites W2121832485 @default.
- W2958746411 cites W2134042548 @default.
- W2958746411 cites W2155027007 @default.
- W2958746411 cites W2156737235 @default.
- W2958746411 cites W2165150801 @default.
- W2958746411 cites W2165905123 @default.
- W2958746411 cites W217831350 @default.
- W2958746411 cites W2315798686 @default.
- W2958746411 cites W2395162158 @default.
- W2958746411 cites W2527819024 @default.
- W2958746411 cites W2610857016 @default.
- W2958746411 cites W2614367549 @default.
- W2958746411 cites W2747402019 @default.
- W2958746411 cites W2761923184 @default.
- W2958746411 cites W2767075075 @default.
- W2958746411 cites W2785359650 @default.
- W2958746411 cites W2785728832 @default.
- W2958746411 cites W2788862106 @default.
- W2958746411 cites W2806985155 @default.
- W2958746411 cites W2886474253 @default.
- W2958746411 cites W2895628298 @default.
- W2958746411 cites W2895969815 @default.
- W2958746411 cites W2905418717 @default.
- W2958746411 cites W2906183303 @default.
- W2958746411 cites W2910718468 @default.
- W2958746411 cites W2920961312 @default.
- W2958746411 cites W2951222758 @default.
- W2958746411 cites W2951990408 @default.
- W2958746411 cites W2962737466 @default.
- W2958746411 cites W2963218690 @default.
- W2958746411 cites W2963277051 @default.
- W2958746411 cites W2963681938 @default.
- W2958746411 cites W2964043796 @default.
- W2958746411 cites W2964084913 @default.
- W2958746411 cites W2964203948 @default.
- W2958746411 cites W2964297418 @default.
- W2958746411 cites W2964990165 @default.
- W2958746411 cites W2973076431 @default.
- W2958746411 cites W3046916097 @default.
- W2958746411 cites W3195133498 @default.
- W2958746411 hasPublicationYear "2019" @default.
- W2958746411 type Work @default.
- W2958746411 sameAs 2958746411 @default.
- W2958746411 citedByCount "17" @default.
- W2958746411 countsByYear W29587464112019 @default.
- W2958746411 countsByYear W29587464112020 @default.
- W2958746411 countsByYear W29587464112021 @default.
- W2958746411 crossrefType "posted-content" @default.
- W2958746411 hasAuthorship W2958746411A5016900183 @default.
- W2958746411 hasAuthorship W2958746411A5048272675 @default.
- W2958746411 hasAuthorship W2958746411A5066940107 @default.
- W2958746411 hasAuthorship W2958746411A5078210646 @default.
- W2958746411 hasConcept C111472728 @default.
- W2958746411 hasConcept C122044880 @default.
- W2958746411 hasConcept C126255220 @default.
- W2958746411 hasConcept C127705205 @default.
- W2958746411 hasConcept C129844170 @default.
- W2958746411 hasConcept C134306372 @default.
- W2958746411 hasConcept C137836250 @default.
- W2958746411 hasConcept C138885662 @default.
- W2958746411 hasConcept C14036430 @default.
- W2958746411 hasConcept C14646407 @default.
- W2958746411 hasConcept C151730666 @default.
- W2958746411 hasConcept C154945302 @default.
- W2958746411 hasConcept C162324750 @default.
- W2958746411 hasConcept C2524010 @default.
- W2958746411 hasConcept C26517878 @default.