Matches in SemOpenAlex for { <https://semopenalex.org/work/W2970884920> ?p ?o ?g. }
- W2970884920 abstract "In this paper, we settle the sampling complexity of solving discounted two-player turn-based zero-sum stochastic games up to polylogarithmic factors. Given a stochastic game with discount factor $\gamma\in(0,1)$ we provide an algorithm that computes an $\epsilon$-optimal strategy with high probability given $\tilde{O}((1 - \gamma)^{-3} \epsilon^{-2})$ samples from the transition function for each state-action pair. Our algorithm runs in time nearly linear in the number of samples and uses space nearly linear in the number of state-action pairs. As stochastic games generalize Markov decision processes (MDPs), our runtime and sample complexities are optimal due to Azar et al. (2013). We achieve our results by showing how to generalize near-optimal Q-learning based algorithms for MDPs, in particular Sidford et al. (2018), to two-player strategy computation algorithms. This overcomes limitations of standard Q-learning and strategy iteration or alternating minimization based approaches, and we hope it will pave the way for future reinforcement learning results by facilitating the extension of MDP results to multi-agent settings with little loss." @default.
- W2970884920 created "2019-09-05" @default.
- W2970884920 creator A5041526408 @default.
- W2970884920 creator A5052683643 @default.
- W2970884920 creator A5072096775 @default.
- W2970884920 creator A5086787265 @default.
- W2970884920 date "2019-08-29" @default.
- W2970884920 modified "2023-09-27" @default.
- W2970884920 title "Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity." @default.
- W2970884920 cites W107583932 @default.
- W2970884920 cites W137538757 @default.
- W2970884920 cites W1481965470 @default.
- W2970884920 cites W1494380152 @default.
- W2970884920 cites W1496590343 @default.
- W2970884920 cites W1513468570 @default.
- W2970884920 cites W1515891729 @default.
- W2970884920 cites W1519783625 @default.
- W2970884920 cites W1542941925 @default.
- W2970884920 cites W1788877992 @default.
- W2970884920 cites W1835254890 @default.
- W2970884920 cites W1966347268 @default.
- W2970884920 cites W1973039793 @default.
- W2970884920 cites W1984901446 @default.
- W2970884920 cites W2028145673 @default.
- W2970884920 cites W2039439610 @default.
- W2970884920 cites W2119567691 @default.
- W2970884920 cites W2120678009 @default.
- W2970884920 cites W2120846115 @default.
- W2970884920 cites W2122701159 @default.
- W2970884920 cites W2134293360 @default.
- W2970884920 cites W2141076336 @default.
- W2970884920 cites W2147967768 @default.
- W2970884920 cites W2149166950 @default.
- W2970884920 cites W2167471538 @default.
- W2970884920 cites W2330024298 @default.
- W2970884920 cites W2341171179 @default.
- W2970884920 cites W2575731723 @default.
- W2970884920 cites W2606656360 @default.
- W2970884920 cites W2765415241 @default.
- W2970884920 cites W2805861379 @default.
- W2970884920 cites W2948104910 @default.
- W2970884920 cites W2950299848 @default.
- W2970884920 cites W2962990479 @default.
- W2970884920 cites W2963111827 @default.
- W2970884920 cites W2963872309 @default.
- W2970884920 cites W3014860839 @default.
- W2970884920 cites W361876 @default.
- W2970884920 hasPublicationYear "2019" @default.
- W2970884920 type Work @default.
- W2970884920 sameAs 2970884920 @default.
- W2970884920 citedByCount "10" @default.
- W2970884920 countsByYear W29708849202020 @default.
- W2970884920 countsByYear W29708849202021 @default.
- W2970884920 crossrefType "posted-content" @default.
- W2970884920 hasAuthorship W2970884920A5041526408 @default.
- W2970884920 hasAuthorship W2970884920A5052683643 @default.
- W2970884920 hasAuthorship W2970884920A5072096775 @default.
- W2970884920 hasAuthorship W2970884920A5086787265 @default.
- W2970884920 hasConcept C10138342 @default.
- W2970884920 hasConcept C105795698 @default.
- W2970884920 hasConcept C106189395 @default.
- W2970884920 hasConcept C11413529 @default.
- W2970884920 hasConcept C118615104 @default.
- W2970884920 hasConcept C119857082 @default.
- W2970884920 hasConcept C121332964 @default.
- W2970884920 hasConcept C126255220 @default.
- W2970884920 hasConcept C154945302 @default.
- W2970884920 hasConcept C159886148 @default.
- W2970884920 hasConcept C162324750 @default.
- W2970884920 hasConcept C185592680 @default.
- W2970884920 hasConcept C188116033 @default.
- W2970884920 hasConcept C198531522 @default.
- W2970884920 hasConcept C2778445095 @default.
- W2970884920 hasConcept C2780791683 @default.
- W2970884920 hasConcept C33923547 @default.
- W2970884920 hasConcept C36686422 @default.
- W2970884920 hasConcept C41008148 @default.
- W2970884920 hasConcept C43617362 @default.
- W2970884920 hasConcept C45374587 @default.
- W2970884920 hasConcept C48103436 @default.
- W2970884920 hasConcept C6177178 @default.
- W2970884920 hasConcept C62520636 @default.
- W2970884920 hasConcept C72434380 @default.
- W2970884920 hasConcept C97541855 @default.
- W2970884920 hasConcept C98763669 @default.
- W2970884920 hasConceptScore W2970884920C10138342 @default.
- W2970884920 hasConceptScore W2970884920C105795698 @default.
- W2970884920 hasConceptScore W2970884920C106189395 @default.
- W2970884920 hasConceptScore W2970884920C11413529 @default.
- W2970884920 hasConceptScore W2970884920C118615104 @default.
- W2970884920 hasConceptScore W2970884920C119857082 @default.
- W2970884920 hasConceptScore W2970884920C121332964 @default.
- W2970884920 hasConceptScore W2970884920C126255220 @default.
- W2970884920 hasConceptScore W2970884920C154945302 @default.
- W2970884920 hasConceptScore W2970884920C159886148 @default.
- W2970884920 hasConceptScore W2970884920C162324750 @default.
- W2970884920 hasConceptScore W2970884920C185592680 @default.
- W2970884920 hasConceptScore W2970884920C188116033 @default.
- W2970884920 hasConceptScore W2970884920C198531522 @default.
- W2970884920 hasConceptScore W2970884920C2778445095 @default.