Matches in SemOpenAlex for { <https://semopenalex.org/work/W4317951342> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W4317951342 abstract "Achieving convergence of multiple learning agents in general $N$-player games is imperative for the development of safe and reliable machine learning (ML) algorithms and their application to autonomous systems. Yet it is known that, outside the bounds of simple two-player games, convergence cannot be taken for granted. To make progress in resolving this problem, we study the dynamics of smooth Q-Learning, a popular reinforcement learning algorithm which quantifies the tendency for learning agents to explore their state space or exploit their payoffs. We show a sufficient condition on the rate of exploration such that the Q-Learning dynamics is guaranteed to converge to a unique equilibrium in any game. We connect this result to games for which Q-Learning is known to converge with arbitrary exploration rates, including weighted Potential games and weighted zero sum polymatrix games. Finally, we examine the performance of the Q-Learning dynamic as measured by the Time Averaged Social Welfare, and compare this with the Social Welfare achieved by the equilibrium. We provide a sufficient condition whereby the Q-Learning dynamic will outperform the equilibrium even if the dynamics do not converge." @default.
- W4317951342 created "2023-01-25" @default.
- W4317951342 creator A5050036888 @default.
- W4317951342 creator A5055883955 @default.
- W4317951342 creator A5079969658 @default.
- W4317951342 date "2023-01-23" @default.
- W4317951342 modified "2023-09-28" @default.
- W4317951342 title "Asymptotic Convergence and Performance of Multi-Agent Q-Learning Dynamics" @default.
- W4317951342 doi "https://doi.org/10.48550/arxiv.2301.09619" @default.
- W4317951342 hasPublicationYear "2023" @default.
- W4317951342 type Work @default.
- W4317951342 citedByCount "0" @default.
- W4317951342 crossrefType "posted-content" @default.
- W4317951342 hasAuthorship W4317951342A5050036888 @default.
- W4317951342 hasAuthorship W4317951342A5055883955 @default.
- W4317951342 hasAuthorship W4317951342A5079969658 @default.
- W4317951342 hasBestOaLocation W43179513421 @default.
- W4317951342 hasConcept C105795698 @default.
- W4317951342 hasConcept C111472728 @default.
- W4317951342 hasConcept C121332964 @default.
- W4317951342 hasConcept C126255220 @default.
- W4317951342 hasConcept C138885662 @default.
- W4317951342 hasConcept C144237770 @default.
- W4317951342 hasConcept C145912823 @default.
- W4317951342 hasConcept C154945302 @default.
- W4317951342 hasConcept C162324750 @default.
- W4317951342 hasConcept C165696696 @default.
- W4317951342 hasConcept C188116033 @default.
- W4317951342 hasConcept C24890656 @default.
- W4317951342 hasConcept C2777303404 @default.
- W4317951342 hasConcept C2780586882 @default.
- W4317951342 hasConcept C33923547 @default.
- W4317951342 hasConcept C38652104 @default.
- W4317951342 hasConcept C41008148 @default.
- W4317951342 hasConcept C50522688 @default.
- W4317951342 hasConcept C56739046 @default.
- W4317951342 hasConcept C72434380 @default.
- W4317951342 hasConcept C79416737 @default.
- W4317951342 hasConcept C97541855 @default.
- W4317951342 hasConceptScore W4317951342C105795698 @default.
- W4317951342 hasConceptScore W4317951342C111472728 @default.
- W4317951342 hasConceptScore W4317951342C121332964 @default.
- W4317951342 hasConceptScore W4317951342C126255220 @default.
- W4317951342 hasConceptScore W4317951342C138885662 @default.
- W4317951342 hasConceptScore W4317951342C144237770 @default.
- W4317951342 hasConceptScore W4317951342C145912823 @default.
- W4317951342 hasConceptScore W4317951342C154945302 @default.
- W4317951342 hasConceptScore W4317951342C162324750 @default.
- W4317951342 hasConceptScore W4317951342C165696696 @default.
- W4317951342 hasConceptScore W4317951342C188116033 @default.
- W4317951342 hasConceptScore W4317951342C24890656 @default.
- W4317951342 hasConceptScore W4317951342C2777303404 @default.
- W4317951342 hasConceptScore W4317951342C2780586882 @default.
- W4317951342 hasConceptScore W4317951342C33923547 @default.
- W4317951342 hasConceptScore W4317951342C38652104 @default.
- W4317951342 hasConceptScore W4317951342C41008148 @default.
- W4317951342 hasConceptScore W4317951342C50522688 @default.
- W4317951342 hasConceptScore W4317951342C56739046 @default.
- W4317951342 hasConceptScore W4317951342C72434380 @default.
- W4317951342 hasConceptScore W4317951342C79416737 @default.
- W4317951342 hasConceptScore W4317951342C97541855 @default.
- W4317951342 hasLocation W43179513421 @default.
- W4317951342 hasOpenAccess W4317951342 @default.
- W4317951342 hasPrimaryLocation W43179513421 @default.
- W4317951342 hasRelatedWork W2031695474 @default.
- W4317951342 hasRelatedWork W2034558910 @default.
- W4317951342 hasRelatedWork W2101748387 @default.
- W4317951342 hasRelatedWork W2123899227 @default.
- W4317951342 hasRelatedWork W2145363145 @default.
- W4317951342 hasRelatedWork W2361647908 @default.
- W4317951342 hasRelatedWork W2361799059 @default.
- W4317951342 hasRelatedWork W2886498421 @default.
- W4317951342 hasRelatedWork W2923653485 @default.
- W4317951342 hasRelatedWork W2937181779 @default.
- W4317951342 isParatext "false" @default.
- W4317951342 isRetracted "false" @default.
- W4317951342 workType "article" @default.