Matches in SemOpenAlex for { <https://semopenalex.org/work/W4214885058> ?p ?o ?g. }
Showing items 1 to 66 of 66 with 100 items per page.
- W4214885058 abstract "We focus on multi-agent reinforcement learning in tabular average-cost settings: a team of agents sequentially interacts with the environment and observes localized incentives. The setting we focus on is one in which the global reward is a sum of all local rewards, the joint policy factorizes into agents' marginals, and the state is fully observable. To date, exceptionally few global optimality guarantees exist for this simple setting, as most results, asymptotic or non-asymptotic, yield convergence to stationarity under parameterized settings for possibly large/continuous spaces. To strengthen performance guarantees in MARL, we focus on linear programming (LP) reformulations of RL for which a stochastic primal-dual method has recently been shown to achieve optimal sample complexity in the centralized tabular case. We develop multi-agent LP extensions, whereby agents solve their local saddle point problems and then compose their variable estimates with weighted averaging steps to diffuse information between agents across time. We establish that the number of samples required to attain near-globally optimal solutions matches tight dependencies on the cardinality of the state and action spaces, and exhibits classical scalings with the size of the team in accordance with multi-agent optimization. Experiments then demonstrate the merits of this approach for cooperative navigation problems." @default.
- W4214885058 created "2022-03-05" @default.
- W4214885058 creator A5025896653 @default.
- W4214885058 creator A5037627748 @default.
- W4214885058 creator A5039563144 @default.
- W4214885058 creator A5064822688 @default.
- W4214885058 date "2021-10-31" @default.
- W4214885058 modified "2023-09-30" @default.
- W4214885058 title "Randomized Linear Programming for Tabular Average-Cost Multi-agent Reinforcement Learning" @default.
- W4214885058 doi "https://doi.org/10.1109/ieeeconf53345.2021.9723192" @default.
- W4214885058 hasPublicationYear "2021" @default.
- W4214885058 type Work @default.
- W4214885058 citedByCount "0" @default.
- W4214885058 crossrefType "proceedings-article" @default.
- W4214885058 hasAuthorship W4214885058A5025896653 @default.
- W4214885058 hasAuthorship W4214885058A5037627748 @default.
- W4214885058 hasAuthorship W4214885058A5039563144 @default.
- W4214885058 hasAuthorship W4214885058A5064822688 @default.
- W4214885058 hasConcept C120665830 @default.
- W4214885058 hasConcept C121332964 @default.
- W4214885058 hasConcept C124101348 @default.
- W4214885058 hasConcept C126255220 @default.
- W4214885058 hasConcept C154945302 @default.
- W4214885058 hasConcept C162324750 @default.
- W4214885058 hasConcept C192209626 @default.
- W4214885058 hasConcept C2777303404 @default.
- W4214885058 hasConcept C28826006 @default.
- W4214885058 hasConcept C33923547 @default.
- W4214885058 hasConcept C36299963 @default.
- W4214885058 hasConcept C41008148 @default.
- W4214885058 hasConcept C41045048 @default.
- W4214885058 hasConcept C50522688 @default.
- W4214885058 hasConcept C87117476 @default.
- W4214885058 hasConcept C97541855 @default.
- W4214885058 hasConceptScore W4214885058C120665830 @default.
- W4214885058 hasConceptScore W4214885058C121332964 @default.
- W4214885058 hasConceptScore W4214885058C124101348 @default.
- W4214885058 hasConceptScore W4214885058C126255220 @default.
- W4214885058 hasConceptScore W4214885058C154945302 @default.
- W4214885058 hasConceptScore W4214885058C162324750 @default.
- W4214885058 hasConceptScore W4214885058C192209626 @default.
- W4214885058 hasConceptScore W4214885058C2777303404 @default.
- W4214885058 hasConceptScore W4214885058C28826006 @default.
- W4214885058 hasConceptScore W4214885058C33923547 @default.
- W4214885058 hasConceptScore W4214885058C36299963 @default.
- W4214885058 hasConceptScore W4214885058C41008148 @default.
- W4214885058 hasConceptScore W4214885058C41045048 @default.
- W4214885058 hasConceptScore W4214885058C50522688 @default.
- W4214885058 hasConceptScore W4214885058C87117476 @default.
- W4214885058 hasConceptScore W4214885058C97541855 @default.
- W4214885058 hasLocation W42148850581 @default.
- W4214885058 hasOpenAccess W4214885058 @default.
- W4214885058 hasPrimaryLocation W42148850581 @default.
- W4214885058 hasRelatedWork W1562959674 @default.
- W4214885058 hasRelatedWork W2072323933 @default.
- W4214885058 hasRelatedWork W2074914621 @default.
- W4214885058 hasRelatedWork W2986252512 @default.
- W4214885058 hasRelatedWork W3141808077 @default.
- W4214885058 hasRelatedWork W3156449588 @default.
- W4214885058 hasRelatedWork W4214885058 @default.
- W4214885058 hasRelatedWork W4283455536 @default.
- W4214885058 hasRelatedWork W4311696670 @default.
- W4214885058 hasRelatedWork W4313162897 @default.
- W4214885058 isParatext "false" @default.
- W4214885058 isRetracted "false" @default.
- W4214885058 workType "article" @default.