Matches in SemOpenAlex for { <https://semopenalex.org/work/W3037361954> ?p ?o ?g. }
Showing items 1 to 89 of 89, with 100 items per page.
- W3037361954 endingPage "3002" @default.
- W3037361954 startingPage "2992" @default.
- W3037361954 abstract "In this paper, we settle the sampling complexity of solving discounted two-player turn-based zero-sum stochastic games up to polylogarithmic factors. Given a stochastic game with discount factor $\gamma\in(0,1)$ we provide an algorithm that computes an $\epsilon$-optimal strategy with high-probability given $\tilde{O}((1 - \gamma)^{-3} \epsilon^{-2})$ samples from the transition function for each state-action-pair. Our algorithm runs in time nearly linear in the number of samples and uses space nearly linear in the number of state-action pairs. As stochastic games generalize Markov decision processes (MDPs) our runtime and sample complexities are optimal due to Azar et al (2013). We achieve our results by showing how to generalize a near-optimal Q-learning based algorithms for MDP, in particular Sidford et al (2018), to two-player strategy computation algorithms. This overcomes limitations of standard Q-learning and strategy iteration or alternating minimization based approaches and we hope will pave the way for future reinforcement learning results by facilitating the extension of MDP results to multi-agent settings with little loss." @default.
- W3037361954 created "2020-07-02" @default.
- W3037361954 creator A5041526408 @default.
- W3037361954 creator A5052683643 @default.
- W3037361954 creator A5072096775 @default.
- W3037361954 creator A5086787265 @default.
- W3037361954 date "2019-08-29" @default.
- W3037361954 modified "2023-09-26" @default.
- W3037361954 title "Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity" @default.
- W3037361954 hasPublicationYear "2019" @default.
- W3037361954 type Work @default.
- W3037361954 sameAs 3037361954 @default.
- W3037361954 citedByCount "11" @default.
- W3037361954 countsByYear W30373619542020 @default.
- W3037361954 countsByYear W30373619542021 @default.
- W3037361954 crossrefType "proceedings-article" @default.
- W3037361954 hasAuthorship W3037361954A5041526408 @default.
- W3037361954 hasAuthorship W3037361954A5052683643 @default.
- W3037361954 hasAuthorship W3037361954A5072096775 @default.
- W3037361954 hasAuthorship W3037361954A5086787265 @default.
- W3037361954 hasConcept C105795698 @default.
- W3037361954 hasConcept C106189395 @default.
- W3037361954 hasConcept C11413529 @default.
- W3037361954 hasConcept C118615104 @default.
- W3037361954 hasConcept C119857082 @default.
- W3037361954 hasConcept C121332964 @default.
- W3037361954 hasConcept C126255220 @default.
- W3037361954 hasConcept C154945302 @default.
- W3037361954 hasConcept C159886148 @default.
- W3037361954 hasConcept C188116033 @default.
- W3037361954 hasConcept C2778445095 @default.
- W3037361954 hasConcept C2780791683 @default.
- W3037361954 hasConcept C33923547 @default.
- W3037361954 hasConcept C36686422 @default.
- W3037361954 hasConcept C41008148 @default.
- W3037361954 hasConcept C48103436 @default.
- W3037361954 hasConcept C62520636 @default.
- W3037361954 hasConcept C72434380 @default.
- W3037361954 hasConcept C97541855 @default.
- W3037361954 hasConcept C98763669 @default.
- W3037361954 hasConceptScore W3037361954C105795698 @default.
- W3037361954 hasConceptScore W3037361954C106189395 @default.
- W3037361954 hasConceptScore W3037361954C11413529 @default.
- W3037361954 hasConceptScore W3037361954C118615104 @default.
- W3037361954 hasConceptScore W3037361954C119857082 @default.
- W3037361954 hasConceptScore W3037361954C121332964 @default.
- W3037361954 hasConceptScore W3037361954C126255220 @default.
- W3037361954 hasConceptScore W3037361954C154945302 @default.
- W3037361954 hasConceptScore W3037361954C159886148 @default.
- W3037361954 hasConceptScore W3037361954C188116033 @default.
- W3037361954 hasConceptScore W3037361954C2778445095 @default.
- W3037361954 hasConceptScore W3037361954C2780791683 @default.
- W3037361954 hasConceptScore W3037361954C33923547 @default.
- W3037361954 hasConceptScore W3037361954C36686422 @default.
- W3037361954 hasConceptScore W3037361954C41008148 @default.
- W3037361954 hasConceptScore W3037361954C48103436 @default.
- W3037361954 hasConceptScore W3037361954C62520636 @default.
- W3037361954 hasConceptScore W3037361954C72434380 @default.
- W3037361954 hasConceptScore W3037361954C97541855 @default.
- W3037361954 hasConceptScore W3037361954C98763669 @default.
- W3037361954 hasLocation W30373619541 @default.
- W3037361954 hasOpenAccess W3037361954 @default.
- W3037361954 hasPrimaryLocation W30373619541 @default.
- W3037361954 hasRelatedWork W1519783625 @default.
- W3037361954 hasRelatedWork W1542941925 @default.
- W3037361954 hasRelatedWork W1570963478 @default.
- W3037361954 hasRelatedWork W1850488217 @default.
- W3037361954 hasRelatedWork W2120846115 @default.
- W3037361954 hasRelatedWork W2141076336 @default.
- W3037361954 hasRelatedWork W2257979135 @default.
- W3037361954 hasRelatedWork W2530849036 @default.
- W3037361954 hasRelatedWork W2575731723 @default.
- W3037361954 hasRelatedWork W2766447205 @default.
- W3037361954 hasRelatedWork W2946912408 @default.
- W3037361954 hasRelatedWork W2963049774 @default.
- W3037361954 hasRelatedWork W2963111827 @default.
- W3037361954 hasRelatedWork W2964054583 @default.
- W3037361954 hasRelatedWork W2970884920 @default.
- W3037361954 hasRelatedWork W2982316857 @default.
- W3037361954 hasRelatedWork W2991046523 @default.
- W3037361954 hasRelatedWork W3046553904 @default.
- W3037361954 hasRelatedWork W3100785954 @default.
- W3037361954 hasRelatedWork W3167472281 @default.
- W3037361954 isParatext "false" @default.
- W3037361954 isRetracted "false" @default.
- W3037361954 magId "3037361954" @default.
- W3037361954 workType "article" @default.