Matches in SemOpenAlex for { <https://semopenalex.org/work/W2946924892> ?p ?o ?g. }
- W2946924892 abstract "The linear quadratic regulator (LQR) problem has reemerged as an important theoretical benchmark for reinforcement learning-based control of complex dynamical systems with continuous state and action spaces. In contrast with nearly all recent work in this area, we consider multiplicative noise models, which are increasingly relevant because they explicitly incorporate inherent uncertainty and variation in the system dynamics and thereby improve robustness properties of the controller. Robustness is a critical and poorly understood issue in reinforcement learning; existing methods which do not account for uncertainty can converge to fragile policies or fail to converge at all. Additionally, intentional injection of multiplicative noise into learning algorithms can enhance robustness of policies, as observed in ad hoc work on domain randomization. Although policy gradient algorithms require optimization of a non-convex cost function, we show that the multiplicative noise LQR cost has a special property called gradient domination, which is exploited to prove global convergence of policy gradient algorithms to the globally optimum control policy with polynomial dependence on problem parameters. Results are provided both in the model-known and model-unknown settings where samples of system trajectories are used to estimate policy gradients." @default.
- W2946924892 created "2019-06-07" @default.
- W2946924892 date "2019-05-28" @default.
- W2946924892 modified "2023-09-27" @default.
- W2946924892 title "Learning robust control for LQR systems with multiplicative noise via policy gradient." @default.
- W2946924892 cites W1582653889 @default.
- W2946924892 cites W1616818660 @default.
- W2946924892 cites W1975498076 @default.
- W2946924892 cites W1981042292 @default.
- W2946924892 cites W1981066143 @default.
- W2946924892 cites W1990139897 @default.
- W2946924892 cites W2002858896 @default.
- W2946924892 cites W2004001705 @default.
- W2946924892 cites W2011866373 @default.
- W2946924892 cites W2020677283 @default.
- W2946924892 cites W2022611390 @default.
- W2946924892 cites W2052334067 @default.
- W2946924892 cites W2055140741 @default.
- W2946924892 cites W2082210980 @default.
- W2946924892 cites W2098432798 @default.
- W2946924892 cites W2120871244 @default.
- W2946924892 cites W2123486671 @default.
- W2946924892 cites W2124783336 @default.
- W2946924892 cites W2127107099 @default.
- W2946924892 cites W2129488707 @default.
- W2946924892 cites W2130801532 @default.
- W2946924892 cites W2145339207 @default.
- W2946924892 cites W2152161277 @default.
- W2946924892 cites W2153292828 @default.
- W2946924892 cites W2153461933 @default.
- W2946924892 cites W2163083887 @default.
- W2946924892 cites W2168090960 @default.
- W2946924892 cites W2257979135 @default.
- W2946924892 cites W2275844880 @default.
- W2946924892 cites W2395955964 @default.
- W2946924892 cites W2590144118 @default.
- W2946924892 cites W2605102758 @default.
- W2946924892 cites W2761923184 @default.
- W2946924892 cites W2776811469 @default.
- W2946924892 cites W2804569092 @default.
- W2946924892 cites W2822752092 @default.
- W2946924892 cites W2854640194 @default.
- W2946924892 cites W2886474253 @default.
- W2946924892 cites W2888916820 @default.
- W2946924892 cites W2896048729 @default.
- W2946924892 cites W2902907165 @default.
- W2946924892 cites W2916346876 @default.
- W2946924892 cites W2943040161 @default.
- W2946924892 cites W2963248893 @default.
- W2946924892 cites W2963400148 @default.
- W2946924892 cites W2963774238 @default.
- W2946924892 cites W2970142535 @default.
- W2946924892 cites W3012118400 @default.
- W2946924892 cites W3012546009 @default.
- W2946924892 cites W3045891384 @default.
- W2946924892 cites W3046017196 @default.
- W2946924892 cites W3156973010 @default.
- W2946924892 cites W3210839039 @default.
- W2946924892 cites W53582479 @default.
- W2946924892 cites W578177376 @default.
- W2946924892 cites W651734400 @default.
- W2946924892 hasPublicationYear "2019" @default.
- W2946924892 type Work @default.
- W2946924892 sameAs 2946924892 @default.
- W2946924892 citedByCount "8" @default.
- W2946924892 countsByYear W29469248922019 @default.
- W2946924892 countsByYear W29469248922020 @default.
- W2946924892 countsByYear W29469248922021 @default.
- W2946924892 crossrefType "posted-content" @default.
- W2946924892 hasConcept C104317684 @default.
- W2946924892 hasConcept C121332964 @default.
- W2946924892 hasConcept C126255220 @default.
- W2946924892 hasConcept C131021393 @default.
- W2946924892 hasConcept C13412647 @default.
- W2946924892 hasConcept C134306372 @default.
- W2946924892 hasConcept C154945302 @default.
- W2946924892 hasConcept C18015164 @default.
- W2946924892 hasConcept C185592680 @default.
- W2946924892 hasConcept C2775924081 @default.
- W2946924892 hasConcept C33923547 @default.
- W2946924892 hasConcept C41008148 @default.
- W2946924892 hasConcept C42747912 @default.
- W2946924892 hasConcept C47446073 @default.
- W2946924892 hasConcept C55493867 @default.
- W2946924892 hasConcept C62520636 @default.
- W2946924892 hasConcept C63479239 @default.
- W2946924892 hasConcept C79379906 @default.
- W2946924892 hasConcept C84462506 @default.
- W2946924892 hasConcept C91575142 @default.
- W2946924892 hasConcept C9390403 @default.
- W2946924892 hasConcept C97541855 @default.
- W2946924892 hasConcept C98779006 @default.
- W2946924892 hasConceptScore W2946924892C104317684 @default.
- W2946924892 hasConceptScore W2946924892C121332964 @default.
- W2946924892 hasConceptScore W2946924892C126255220 @default.
- W2946924892 hasConceptScore W2946924892C131021393 @default.
- W2946924892 hasConceptScore W2946924892C13412647 @default.
- W2946924892 hasConceptScore W2946924892C134306372 @default.
- W2946924892 hasConceptScore W2946924892C154945302 @default.
- W2946924892 hasConceptScore W2946924892C18015164 @default.