Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226138304> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4226138304 abstract "We examine global non-asymptotic convergence properties of policy gradient methods for multi-agent reinforcement learning (RL) problems in Markov potential games (MPG). To learn a Nash equilibrium of an MPG in which the size of state space and/or the number of players can be very large, we propose new independent policy gradient algorithms that are run by all players in tandem. When there is no uncertainty in the gradient evaluation, we show that our algorithm finds an $\epsilon$-Nash equilibrium with $O(1/\epsilon^2)$ iteration complexity which does not explicitly depend on the state space size. When the exact gradient is not available, we establish $O(1/\epsilon^5)$ sample complexity bound in a potentially infinitely large state space for a sample-based algorithm that utilizes function approximation. Moreover, we identify a class of independent policy gradient algorithms that enjoys convergence for both zero-sum Markov games and Markov cooperative games with the players that are oblivious to the types of games being played. Finally, we provide computational experiments to corroborate the merits and the effectiveness of our theoretical developments." @default.
- W4226138304 created "2022-05-05" @default.
- W4226138304 creator A5008672638 @default.
- W4226138304 creator A5035849375 @default.
- W4226138304 creator A5047410441 @default.
- W4226138304 creator A5087790067 @default.
- W4226138304 date "2022-02-08" @default.
- W4226138304 modified "2023-10-16" @default.
- W4226138304 title "Independent Policy Gradient for Large-Scale Markov Potential Games: Sharper Rates, Function Approximation, and Game-Agnostic Convergence" @default.
- W4226138304 doi "https://doi.org/10.48550/arxiv.2202.04129" @default.
- W4226138304 hasPublicationYear "2022" @default.
- W4226138304 type Work @default.
- W4226138304 citedByCount "0" @default.
- W4226138304 crossrefType "posted-content" @default.
- W4226138304 hasAuthorship W4226138304A5008672638 @default.
- W4226138304 hasAuthorship W4226138304A5035849375 @default.
- W4226138304 hasAuthorship W4226138304A5047410441 @default.
- W4226138304 hasAuthorship W4226138304A5087790067 @default.
- W4226138304 hasBestOaLocation W42261383041 @default.
- W4226138304 hasConcept C105795698 @default.
- W4226138304 hasConcept C106189395 @default.
- W4226138304 hasConcept C126255220 @default.
- W4226138304 hasConcept C14036430 @default.
- W4226138304 hasConcept C154945302 @default.
- W4226138304 hasConcept C159886148 @default.
- W4226138304 hasConcept C162324750 @default.
- W4226138304 hasConcept C2777303404 @default.
- W4226138304 hasConcept C28826006 @default.
- W4226138304 hasConcept C33923547 @default.
- W4226138304 hasConcept C41008148 @default.
- W4226138304 hasConcept C46814582 @default.
- W4226138304 hasConcept C50522688 @default.
- W4226138304 hasConcept C72434380 @default.
- W4226138304 hasConcept C78458016 @default.
- W4226138304 hasConcept C86803240 @default.
- W4226138304 hasConcept C97541855 @default.
- W4226138304 hasConcept C98763669 @default.
- W4226138304 hasConceptScore W4226138304C105795698 @default.
- W4226138304 hasConceptScore W4226138304C106189395 @default.
- W4226138304 hasConceptScore W4226138304C126255220 @default.
- W4226138304 hasConceptScore W4226138304C14036430 @default.
- W4226138304 hasConceptScore W4226138304C154945302 @default.
- W4226138304 hasConceptScore W4226138304C159886148 @default.
- W4226138304 hasConceptScore W4226138304C162324750 @default.
- W4226138304 hasConceptScore W4226138304C2777303404 @default.
- W4226138304 hasConceptScore W4226138304C28826006 @default.
- W4226138304 hasConceptScore W4226138304C33923547 @default.
- W4226138304 hasConceptScore W4226138304C41008148 @default.
- W4226138304 hasConceptScore W4226138304C46814582 @default.
- W4226138304 hasConceptScore W4226138304C50522688 @default.
- W4226138304 hasConceptScore W4226138304C72434380 @default.
- W4226138304 hasConceptScore W4226138304C78458016 @default.
- W4226138304 hasConceptScore W4226138304C86803240 @default.
- W4226138304 hasConceptScore W4226138304C97541855 @default.
- W4226138304 hasConceptScore W4226138304C98763669 @default.
- W4226138304 hasLocation W42261383041 @default.
- W4226138304 hasOpenAccess W4226138304 @default.
- W4226138304 hasPrimaryLocation W42261383041 @default.
- W4226138304 hasRelatedWork W1626977535 @default.
- W4226138304 hasRelatedWork W1996326480 @default.
- W4226138304 hasRelatedWork W2128702080 @default.
- W4226138304 hasRelatedWork W2252587815 @default.
- W4226138304 hasRelatedWork W2937181779 @default.
- W4226138304 hasRelatedWork W2947128950 @default.
- W4226138304 hasRelatedWork W2970347269 @default.
- W4226138304 hasRelatedWork W3089496523 @default.
- W4226138304 hasRelatedWork W3167472281 @default.
- W4226138304 hasRelatedWork W3201878770 @default.
- W4226138304 isParatext "false" @default.
- W4226138304 isRetracted "false" @default.
- W4226138304 workType "article" @default.
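
The statements above follow a regular one-per-line layout: a leading dash, then subject, predicate, object, and a graph name prefixed with `@` and terminated by a period. As a minimal sketch, assuming every line keeps exactly that shape, the listing could be split into its four parts as below. The helper name `parse_listing_line`, the regular expression, and the returned dictionary keys are hypothetical illustrations and not part of SemOpenAlex.

```python
import re

# Assumed line shape: "- SUBJECT PREDICATE OBJECT @GRAPH."
# The object may contain spaces (e.g. a quoted title), so it is matched lazily
# and anchored by the trailing "@GRAPH." at the end of the line.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.$')


def parse_listing_line(line):
    """Return a dict describing one statement, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Objects wrapped in double quotes are treated as literals; others as identifiers.
    is_literal = len(obj) >= 2 and obj.startswith('"') and obj.endswith('"')
    if is_literal:
        obj = obj[1:-1]
    return {
        "subject": subject,
        "predicate": predicate,
        "object": obj,
        "graph": graph,
        "literal": is_literal,
    }


if __name__ == "__main__":
    sample = '- W4226138304 hasPublicationYear "2022" @default.'
    print(parse_listing_line(sample))
```

Lines that are not statements, such as the pagination header, return `None` and can simply be skipped by the caller.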