Matches in SemOpenAlex for { <https://semopenalex.org/work/W3037391804> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W3037391804 endingPage "868" @default.
- W3037391804 startingPage "860" @default.
- W3037391804 abstract "We show by counterexample that policy-gradient algorithms have no guarantees of even local convergence to Nash equilibria in continuous action and state space multi-agent settings. To do so, we analyze gradient-play in N-player general-sum linear quadratic games, a classic game setting which is recently emerging as a benchmark in the field of multi-agent learning. In such games the state and action spaces are continuous and global Nash equilibria can be found by solving coupled Riccati equations. Further, gradient-play in LQ games is equivalent to multi-agent policy-gradient. We first show that these games are surprisingly not convex games. Despite this, we are still able to show that the only critical points of the gradient dynamics are global Nash equilibria. We then give sufficient conditions under which policy-gradient will avoid the Nash equilibria, and generate a large number of general-sum linear quadratic games that satisfy these conditions. The existence of such games indicates that one of the most popular approaches to solving reinforcement learning problems in the classic reinforcement learning setting has no local guarantee of convergence in multi-agent settings. Further, the ease with which we can generate these counterexamples suggests that such situations are not mere edge cases and are in fact quite common." @default.
- W3037391804 created "2020-07-02" @default.
- W3037391804 creator A5008161296 @default.
- W3037391804 creator A5049812527 @default.
- W3037391804 creator A5062722286 @default.
- W3037391804 creator A5089549365 @default.
- W3037391804 date "2020-05-05" @default.
- W3037391804 modified "2023-09-24" @default.
- W3037391804 title "Policy-Gradient Algorithms Have No Guarantees of Convergence in Linear Quadratic Games" @default.
- W3037391804 hasPublicationYear "2020" @default.
- W3037391804 type Work @default.
- W3037391804 sameAs 3037391804 @default.
- W3037391804 citedByCount "4" @default.
- W3037391804 countsByYear W30373918042020 @default.
- W3037391804 countsByYear W30373918042021 @default.
- W3037391804 crossrefType "proceedings-article" @default.
- W3037391804 hasAuthorship W3037391804A5008161296 @default.
- W3037391804 hasAuthorship W3037391804A5049812527 @default.
- W3037391804 hasAuthorship W3037391804A5062722286 @default.
- W3037391804 hasAuthorship W3037391804A5089549365 @default.
- W3037391804 hasConcept C105795698 @default.
- W3037391804 hasConcept C118615104 @default.
- W3037391804 hasConcept C126255220 @default.
- W3037391804 hasConcept C13280743 @default.
- W3037391804 hasConcept C144237770 @default.
- W3037391804 hasConcept C154945302 @default.
- W3037391804 hasConcept C162324750 @default.
- W3037391804 hasConcept C162838799 @default.
- W3037391804 hasConcept C177142836 @default.
- W3037391804 hasConcept C185798385 @default.
- W3037391804 hasConcept C205649164 @default.
- W3037391804 hasConcept C2777303404 @default.
- W3037391804 hasConcept C28826006 @default.
- W3037391804 hasConcept C32407928 @default.
- W3037391804 hasConcept C33923547 @default.
- W3037391804 hasConcept C41008148 @default.
- W3037391804 hasConcept C46814582 @default.
- W3037391804 hasConcept C50522688 @default.
- W3037391804 hasConcept C72434380 @default.
- W3037391804 hasConcept C97541855 @default.
- W3037391804 hasConceptScore W3037391804C105795698 @default.
- W3037391804 hasConceptScore W3037391804C118615104 @default.
- W3037391804 hasConceptScore W3037391804C126255220 @default.
- W3037391804 hasConceptScore W3037391804C13280743 @default.
- W3037391804 hasConceptScore W3037391804C144237770 @default.
- W3037391804 hasConceptScore W3037391804C154945302 @default.
- W3037391804 hasConceptScore W3037391804C162324750 @default.
- W3037391804 hasConceptScore W3037391804C162838799 @default.
- W3037391804 hasConceptScore W3037391804C177142836 @default.
- W3037391804 hasConceptScore W3037391804C185798385 @default.
- W3037391804 hasConceptScore W3037391804C205649164 @default.
- W3037391804 hasConceptScore W3037391804C2777303404 @default.
- W3037391804 hasConceptScore W3037391804C28826006 @default.
- W3037391804 hasConceptScore W3037391804C32407928 @default.
- W3037391804 hasConceptScore W3037391804C33923547 @default.
- W3037391804 hasConceptScore W3037391804C41008148 @default.
- W3037391804 hasConceptScore W3037391804C46814582 @default.
- W3037391804 hasConceptScore W3037391804C50522688 @default.
- W3037391804 hasConceptScore W3037391804C72434380 @default.
- W3037391804 hasConceptScore W3037391804C97541855 @default.
- W3037391804 hasLocation W30373918041 @default.
- W3037391804 hasOpenAccess W3037391804 @default.
- W3037391804 hasPrimaryLocation W30373918041 @default.
- W3037391804 hasRelatedWork W1884814890 @default.
- W3037391804 hasRelatedWork W1996920228 @default.
- W3037391804 hasRelatedWork W2045047916 @default.
- W3037391804 hasRelatedWork W2113702258 @default.
- W3037391804 hasRelatedWork W2162221666 @default.
- W3037391804 hasRelatedWork W2569595205 @default.
- W3037391804 hasRelatedWork W2752268688 @default.
- W3037391804 hasRelatedWork W2765459668 @default.
- W3037391804 hasRelatedWork W2769347852 @default.
- W3037391804 hasRelatedWork W2947577977 @default.
- W3037391804 hasRelatedWork W2948350736 @default.
- W3037391804 hasRelatedWork W2953732167 @default.
- W3037391804 hasRelatedWork W2981664478 @default.
- W3037391804 hasRelatedWork W3001078900 @default.
- W3037391804 hasRelatedWork W3012442074 @default.
- W3037391804 hasRelatedWork W3084085014 @default.
- W3037391804 hasRelatedWork W3116605176 @default.
- W3037391804 hasRelatedWork W3120217439 @default.
- W3037391804 hasRelatedWork W3123153215 @default.
- W3037391804 hasRelatedWork W3189491017 @default.
- W3037391804 isParatext "false" @default.
- W3037391804 isRetracted "false" @default.
- W3037391804 magId "3037391804" @default.
- W3037391804 workType "article" @default.