Matches in SemOpenAlex for { <https://semopenalex.org/work/W4360830157> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4360830157 abstract "We consider the problem of decentralized multi-agent reinforcement learning in Markov games. A fundamental question is whether there exist algorithms that, when adopted by all agents and run independently in a decentralized fashion, lead to no-regret for each player, analogous to celebrated convergence results in normal-form games. While recent work has shown that such algorithms exist for restricted settings (notably, when regret is defined with respect to deviations to Markovian policies), the question of whether independent no-regret learning can be achieved in the standard Markov game framework was open. We provide a decisive negative resolution to this problem, both from a computational and statistical perspective. We show that: - Under the widely-believed assumption that PPAD-hard problems cannot be solved in polynomial time, there is no polynomial-time algorithm that attains no-regret in general-sum Markov games when executed independently by all players, even when the game is known to the algorithm designer and the number of players is a small constant. - When the game is unknown, no algorithm, regardless of computational efficiency, can achieve no-regret without observing a number of episodes that is exponential in the number of players. Perhaps surprisingly, our lower bounds hold even for the seemingly easier setting in which all agents are controlled by a centralized algorithm. They are proven via lower bounds for a simpler problem we refer to as SparseCCE, in which the goal is to compute a coarse correlated equilibrium that is sparse in the sense that it can be represented as a mixture of a small number of product policies. The crux of our approach is a novel application of aggregation techniques from online learning, whereby we show that any algorithm for the SparseCCE problem can be used to compute approximate Nash equilibria for non-zero sum normal-form games." @default.
- W4360830157 created "2023-03-25" @default.
- W4360830157 creator A5018792915 @default.
- W4360830157 creator A5075069827 @default.
- W4360830157 creator A5089252105 @default.
- W4360830157 date "2023-03-21" @default.
- W4360830157 modified "2023-10-16" @default.
- W4360830157 title "Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games" @default.
- W4360830157 doi "https://doi.org/10.48550/arxiv.2303.12287" @default.
- W4360830157 hasPublicationYear "2023" @default.
- W4360830157 type Work @default.
- W4360830157 citedByCount "0" @default.
- W4360830157 crossrefType "posted-content" @default.
- W4360830157 hasAuthorship W4360830157A5018792915 @default.
- W4360830157 hasAuthorship W4360830157A5075069827 @default.
- W4360830157 hasAuthorship W4360830157A5089252105 @default.
- W4360830157 hasBestOaLocation W43608301571 @default.
- W4360830157 hasConcept C105795698 @default.
- W4360830157 hasConcept C106189395 @default.
- W4360830157 hasConcept C11413529 @default.
- W4360830157 hasConcept C119857082 @default.
- W4360830157 hasConcept C126255220 @default.
- W4360830157 hasConcept C134306372 @default.
- W4360830157 hasConcept C144237770 @default.
- W4360830157 hasConcept C159886148 @default.
- W4360830157 hasConcept C162324750 @default.
- W4360830157 hasConcept C2777303404 @default.
- W4360830157 hasConcept C311688 @default.
- W4360830157 hasConcept C33923547 @default.
- W4360830157 hasConcept C34388435 @default.
- W4360830157 hasConcept C41008148 @default.
- W4360830157 hasConcept C46814582 @default.
- W4360830157 hasConcept C50522688 @default.
- W4360830157 hasConcept C50817715 @default.
- W4360830157 hasConcept C80444323 @default.
- W4360830157 hasConcept C98763669 @default.
- W4360830157 hasConceptScore W4360830157C105795698 @default.
- W4360830157 hasConceptScore W4360830157C106189395 @default.
- W4360830157 hasConceptScore W4360830157C11413529 @default.
- W4360830157 hasConceptScore W4360830157C119857082 @default.
- W4360830157 hasConceptScore W4360830157C126255220 @default.
- W4360830157 hasConceptScore W4360830157C134306372 @default.
- W4360830157 hasConceptScore W4360830157C144237770 @default.
- W4360830157 hasConceptScore W4360830157C159886148 @default.
- W4360830157 hasConceptScore W4360830157C162324750 @default.
- W4360830157 hasConceptScore W4360830157C2777303404 @default.
- W4360830157 hasConceptScore W4360830157C311688 @default.
- W4360830157 hasConceptScore W4360830157C33923547 @default.
- W4360830157 hasConceptScore W4360830157C34388435 @default.
- W4360830157 hasConceptScore W4360830157C41008148 @default.
- W4360830157 hasConceptScore W4360830157C46814582 @default.
- W4360830157 hasConceptScore W4360830157C50522688 @default.
- W4360830157 hasConceptScore W4360830157C50817715 @default.
- W4360830157 hasConceptScore W4360830157C80444323 @default.
- W4360830157 hasConceptScore W4360830157C98763669 @default.
- W4360830157 hasLocation W43608301571 @default.
- W4360830157 hasOpenAccess W4360830157 @default.
- W4360830157 hasPrimaryLocation W43608301571 @default.
- W4360830157 hasRelatedWork W2128702080 @default.
- W4360830157 hasRelatedWork W2137847062 @default.
- W4360830157 hasRelatedWork W2149943599 @default.
- W4360830157 hasRelatedWork W2952099252 @default.
- W4360830157 hasRelatedWork W2963650147 @default.
- W4360830157 hasRelatedWork W3140738360 @default.
- W4360830157 hasRelatedWork W3167472281 @default.
- W4360830157 hasRelatedWork W3198596521 @default.
- W4360830157 hasRelatedWork W4287640753 @default.
- W4360830157 hasRelatedWork W4302063494 @default.
- W4360830157 isParatext "false" @default.
- W4360830157 isRetracted "false" @default.
- W4360830157 workType "article" @default.