Matches in SemOpenAlex for { <https://semopenalex.org/work/W3107198524> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W3107198524 abstract "We study model-based and model-free policy optimization in a class of nonzero-sum stochastic dynamic games called linear quadratic (LQ) deep structured games. In such games, players interact with each other through a set of weighted averages (linear regressions) of the states and actions. In this paper, we focus our attention to homogeneous weights; however, for the special case of infinite population, the obtained results extend to asymptotically vanishing weights wherein the players learn the sequential weighted mean-field equilibrium. Despite the non-convexity of the optimization in policy space and the fact that policy optimization does not generally converge in game setting, we prove that the proposed model-based and model-free policy gradient descent and natural policy gradient descent algorithms globally converge to the sub-game perfect Nash equilibrium. To the best of our knowledge, this is the first result that provides a global convergence proof of policy optimization in a nonzero-sum LQ game. One of the salient features of the proposed algorithms is that their parameter space is independent of the number of players, and when the dimension of state space is significantly larger than that of the action space, they provide a more efficient way of computation compared to those algorithms that plan and learn in the action space. Finally, some simulations are provided to numerically verify the obtained theoretical results." @default.
- W3107198524 created "2020-12-07" @default.
- W3107198524 creator A5023864111 @default.
- W3107198524 creator A5043000045 @default.
- W3107198524 creator A5062571668 @default.
- W3107198524 date "2020-11-29" @default.
- W3107198524 modified "2023-09-27" @default.
- W3107198524 title "Reinforcement Learning in Nonzero-sum Linear Quadratic Deep Structured Games: Global Convergence of Policy Optimization" @default.
- W3107198524 cites W1977377218 @default.
- W3107198524 cites W1987217557 @default.
- W3107198524 cites W2038686546 @default.
- W3107198524 cites W2292219073 @default.
- W3107198524 cites W2765128655 @default.
- W3107198524 cites W2791784110 @default.
- W3107198524 cites W2886474253 @default.
- W3107198524 cites W2886579254 @default.
- W3107198524 cites W2953732167 @default.
- W3107198524 cites W2955520961 @default.
- W3107198524 cites W2970537473 @default.
- W3107198524 cites W3002705417 @default.
- W3107198524 cites W3008744877 @default.
- W3107198524 cites W3043568155 @default.
- W3107198524 cites W3100023842 @default.
- W3107198524 cites W3100174854 @default.
- W3107198524 cites W3120633829 @default.
- W3107198524 cites W3162021874 @default.
- W3107198524 cites W586490843 @default.
- W3107198524 cites W2913603117 @default.
- W3107198524 hasPublicationYear "2020" @default.
- W3107198524 type Work @default.
- W3107198524 sameAs 3107198524 @default.
- W3107198524 citedByCount "0" @default.
- W3107198524 crossrefType "posted-content" @default.
- W3107198524 hasAuthorship W3107198524A5023864111 @default.
- W3107198524 hasAuthorship W3107198524A5043000045 @default.
- W3107198524 hasAuthorship W3107198524A5062571668 @default.
- W3107198524 hasConcept C126255220 @default.
- W3107198524 hasConcept C144237770 @default.
- W3107198524 hasConcept C154945302 @default.
- W3107198524 hasConcept C162324750 @default.
- W3107198524 hasConcept C202444582 @default.
- W3107198524 hasConcept C2777303404 @default.
- W3107198524 hasConcept C33676613 @default.
- W3107198524 hasConcept C33923547 @default.
- W3107198524 hasConcept C41008148 @default.
- W3107198524 hasConcept C46814582 @default.
- W3107198524 hasConcept C50522688 @default.
- W3107198524 hasConcept C97541855 @default.
- W3107198524 hasConceptScore W3107198524C126255220 @default.
- W3107198524 hasConceptScore W3107198524C144237770 @default.
- W3107198524 hasConceptScore W3107198524C154945302 @default.
- W3107198524 hasConceptScore W3107198524C162324750 @default.
- W3107198524 hasConceptScore W3107198524C202444582 @default.
- W3107198524 hasConceptScore W3107198524C2777303404 @default.
- W3107198524 hasConceptScore W3107198524C33676613 @default.
- W3107198524 hasConceptScore W3107198524C33923547 @default.
- W3107198524 hasConceptScore W3107198524C41008148 @default.
- W3107198524 hasConceptScore W3107198524C46814582 @default.
- W3107198524 hasConceptScore W3107198524C50522688 @default.
- W3107198524 hasConceptScore W3107198524C97541855 @default.
- W3107198524 hasLocation W31071985241 @default.
- W3107198524 hasOpenAccess W3107198524 @default.
- W3107198524 hasPrimaryLocation W31071985241 @default.
- W3107198524 hasRelatedWork W1489458588 @default.
- W3107198524 hasRelatedWork W1533809857 @default.
- W3107198524 hasRelatedWork W1791210442 @default.
- W3107198524 hasRelatedWork W1852036295 @default.
- W3107198524 hasRelatedWork W2338213173 @default.
- W3107198524 hasRelatedWork W2441054692 @default.
- W3107198524 hasRelatedWork W2477313175 @default.
- W3107198524 hasRelatedWork W2570547546 @default.
- W3107198524 hasRelatedWork W2624756160 @default.
- W3107198524 hasRelatedWork W2667760 @default.
- W3107198524 hasRelatedWork W2951924730 @default.
- W3107198524 hasRelatedWork W3037978882 @default.
- W3107198524 hasRelatedWork W3046553904 @default.
- W3107198524 hasRelatedWork W3092283217 @default.
- W3107198524 hasRelatedWork W3094246525 @default.
- W3107198524 hasRelatedWork W3105688066 @default.
- W3107198524 hasRelatedWork W3119173266 @default.
- W3107198524 hasRelatedWork W3121202611 @default.
- W3107198524 hasRelatedWork W3121228496 @default.
- W3107198524 hasRelatedWork W3214674783 @default.
- W3107198524 isParatext "false" @default.
- W3107198524 isRetracted "false" @default.
- W3107198524 magId "3107198524" @default.
- W3107198524 workType "article" @default.