Matches in SemOpenAlex for { <https://semopenalex.org/work/W3109124629> ?p ?o ?g. }
- W3109124629 abstract "Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning. In this paper, we investigate the global convergence of gradient-based policy optimization methods for quadratic optimal control of discrete-time Markovian jump linear systems (MJLS). First, we study the optimization landscape of direct policy optimization for MJLS, with static state feedback controllers and quadratic performance costs. Despite the non-convexity of the resultant problem, we are still able to identify several useful properties such as coercivity, gradient dominance, and almost smoothness. Based on these properties, we show global convergence of three types of policy optimization methods: the gradient descent method; the Gauss-Newton method; and the natural policy gradient method. We prove that all three methods converge to the optimal state feedback controller for MJLS at a linear rate if initialized at a controller which is mean-square stabilizing. Some numerical examples are presented to support the theory. This work brings new insights for understanding the performance of policy gradient methods on the Markovian jump linear quadratic control problem." @default.
- W3109124629 created "2020-12-07" @default.
- W3109124629 creator A5016980595 @default.
- W3109124629 creator A5061952552 @default.
- W3109124629 creator A5075872711 @default.
- W3109124629 date "2020-11-24" @default.
- W3109124629 modified "2023-09-27" @default.
- W3109124629 title "Policy Optimization for Markovian Jump Linear Quadratic Control: Gradient-Based Methods and Global Convergence." @default.
- W3109124629 cites W1490978409 @default.
- W3109124629 cites W1531616435 @default.
- W3109124629 cites W1646998587 @default.
- W3109124629 cites W1771410628 @default.
- W3109124629 cites W1798181000 @default.
- W3109124629 cites W181733065 @default.
- W3109124629 cites W1976058996 @default.
- W3109124629 cites W2056241376 @default.
- W3109124629 cites W205960364 @default.
- W3109124629 cites W2103979068 @default.
- W3109124629 cites W2109957730 @default.
- W3109124629 cites W2110935301 @default.
- W3109124629 cites W2121863487 @default.
- W3109124629 cites W2122068733 @default.
- W3109124629 cites W2130801532 @default.
- W3109124629 cites W2131387339 @default.
- W3109124629 cites W2149479912 @default.
- W3109124629 cites W2155027007 @default.
- W3109124629 cites W2172968643 @default.
- W3109124629 cites W2179469667 @default.
- W3109124629 cites W2569546761 @default.
- W3109124629 cites W2736601468 @default.
- W3109124629 cites W2886474253 @default.
- W3109124629 cites W2946924892 @default.
- W3109124629 cites W2949608212 @default.
- W3109124629 cites W2951990408 @default.
- W3109124629 cites W2958746411 @default.
- W3109124629 cites W2962749646 @default.
- W3109124629 cites W2963120839 @default.
- W3109124629 cites W2963602876 @default.
- W3109124629 cites W2963641140 @default.
- W3109124629 cites W2963774238 @default.
- W3109124629 cites W2964161785 @default.
- W3109124629 cites W2964990165 @default.
- W3109124629 cites W2970351915 @default.
- W3109124629 cites W2970400802 @default.
- W3109124629 cites W2982428006 @default.
- W3109124629 cites W2994779591 @default.
- W3109124629 cites W2998481680 @default.
- W3109124629 cites W3011908139 @default.
- W3109124629 cites W3012118400 @default.
- W3109124629 cites W3020047188 @default.
- W3109124629 cites W3035048301 @default.
- W3109124629 cites W3046017196 @default.
- W3109124629 cites W3100975812 @default.
- W3109124629 cites W3108902193 @default.
- W3109124629 cites W3110042586 @default.
- W3109124629 cites W3209135762 @default.
- W3109124629 cites W360528349 @default.
- W3109124629 cites W97620058 @default.
- W3109124629 hasPublicationYear "2020" @default.
- W3109124629 type Work @default.
- W3109124629 sameAs 3109124629 @default.
- W3109124629 citedByCount "2" @default.
- W3109124629 countsByYear W31091246292021 @default.
- W3109124629 crossrefType "posted-content" @default.
- W3109124629 hasAuthorship W3109124629A5016980595 @default.
- W3109124629 hasAuthorship W3109124629A5061952552 @default.
- W3109124629 hasAuthorship W3109124629A5075872711 @default.
- W3109124629 hasConcept C102634674 @default.
- W3109124629 hasConcept C106159729 @default.
- W3109124629 hasConcept C115680565 @default.
- W3109124629 hasConcept C119857082 @default.
- W3109124629 hasConcept C126255220 @default.
- W3109124629 hasConcept C129844170 @default.
- W3109124629 hasConcept C134306372 @default.
- W3109124629 hasConcept C137836250 @default.
- W3109124629 hasConcept C153258448 @default.
- W3109124629 hasConcept C154945302 @default.
- W3109124629 hasConcept C162324750 @default.
- W3109124629 hasConcept C203479927 @default.
- W3109124629 hasConcept C2524010 @default.
- W3109124629 hasConcept C2775924081 @default.
- W3109124629 hasConcept C2777303404 @default.
- W3109124629 hasConcept C33923547 @default.
- W3109124629 hasConcept C41008148 @default.
- W3109124629 hasConcept C47446073 @default.
- W3109124629 hasConcept C50522688 @default.
- W3109124629 hasConcept C50644808 @default.
- W3109124629 hasConcept C6557445 @default.
- W3109124629 hasConcept C72134830 @default.
- W3109124629 hasConcept C86803240 @default.
- W3109124629 hasConcept C91575142 @default.
- W3109124629 hasConceptScore W3109124629C102634674 @default.
- W3109124629 hasConceptScore W3109124629C106159729 @default.
- W3109124629 hasConceptScore W3109124629C115680565 @default.
- W3109124629 hasConceptScore W3109124629C119857082 @default.
- W3109124629 hasConceptScore W3109124629C126255220 @default.
- W3109124629 hasConceptScore W3109124629C129844170 @default.
- W3109124629 hasConceptScore W3109124629C134306372 @default.
- W3109124629 hasConceptScore W3109124629C137836250 @default.
- W3109124629 hasConceptScore W3109124629C153258448 @default.