Matches in SemOpenAlex for { <https://semopenalex.org/work/W3084233102> ?p ?o ?g. }
- W3084233102 endingPage "4823" @default.
- W3084233102 startingPage "4816" @default.
- W3084233102 abstract "We consider the problem of two-player zero-sum games. This problem is formulated as a min-max Markov game in the literature. The solution of this game, which is the min-max payoff, starting from a given state is called the min-max value of the state. In this work, we compute the solution of the two-player zero-sum game utilizing the technique of successive relaxation that has been successfully applied in the literature to compute a faster value iteration algorithm in the context of Markov Decision Processes. We extend the concept of successive relaxation to the setting of two-player zero-sum games. We show that, under a special structure on the game, this technique facilitates faster computation of the min-max value of the states. We then derive a generalized minimax Q-learning algorithm that computes the optimal policy when the model information is not known. Finally, we prove the convergence of the proposed generalized minimax Q-learning algorithm utilizing stochastic approximation techniques, under an assumption on the boundedness of iterates. Through experiments, we demonstrate the effectiveness of our proposed algorithm." @default.
- W3084233102 created "2020-09-14" @default.
- W3084233102 creator A5002033038 @default.
- W3084233102 creator A5038163398 @default.
- W3084233102 creator A5059738367 @default.
- W3084233102 date "2022-09-01" @default.
- W3084233102 modified "2023-09-30" @default.
- W3084233102 title "A Generalized Minimax Q-Learning Algorithm for Two-Player Zero-Sum Stochastic Games" @default.
- W3084233102 cites W1485630385 @default.
- W3084233102 cites W1542941925 @default.
- W3084233102 cites W2082860493 @default.
- W3084233102 cites W2099618002 @default.
- W3084233102 cites W2904455790 @default.
- W3084233102 cites W2991046523 @default.
- W3084233102 cites W3105579180 @default.
- W3084233102 cites W4243772471 @default.
- W3084233102 cites W4254547512 @default.
- W3084233102 doi "https://doi.org/10.1109/tac.2022.3159453" @default.
- W3084233102 hasPublicationYear "2022" @default.
- W3084233102 type Work @default.
- W3084233102 sameAs 3084233102 @default.
- W3084233102 citedByCount "1" @default.
- W3084233102 countsByYear W30842331022023 @default.
- W3084233102 crossrefType "journal-article" @default.
- W3084233102 hasAuthorship W3084233102A5002033038 @default.
- W3084233102 hasAuthorship W3084233102A5038163398 @default.
- W3084233102 hasAuthorship W3084233102A5059738367 @default.
- W3084233102 hasBestOaLocation W30842331022 @default.
- W3084233102 hasConcept C105795698 @default.
- W3084233102 hasConcept C106189395 @default.
- W3084233102 hasConcept C11413529 @default.
- W3084233102 hasConcept C126255220 @default.
- W3084233102 hasConcept C134306372 @default.
- W3084233102 hasConcept C136356330 @default.
- W3084233102 hasConcept C138328387 @default.
- W3084233102 hasConcept C138885662 @default.
- W3084233102 hasConcept C140479938 @default.
- W3084233102 hasConcept C144237770 @default.
- W3084233102 hasConcept C149728462 @default.
- W3084233102 hasConcept C151730666 @default.
- W3084233102 hasConcept C155930848 @default.
- W3084233102 hasConcept C15744967 @default.
- W3084233102 hasConcept C159886148 @default.
- W3084233102 hasConcept C162324750 @default.
- W3084233102 hasConcept C177142836 @default.
- W3084233102 hasConcept C22171661 @default.
- W3084233102 hasConcept C2776029896 @default.
- W3084233102 hasConcept C2777303404 @default.
- W3084233102 hasConcept C2779343474 @default.
- W3084233102 hasConcept C2780813799 @default.
- W3084233102 hasConcept C33923547 @default.
- W3084233102 hasConcept C41008148 @default.
- W3084233102 hasConcept C41895202 @default.
- W3084233102 hasConcept C46814582 @default.
- W3084233102 hasConcept C50522688 @default.
- W3084233102 hasConcept C73795354 @default.
- W3084233102 hasConcept C77805123 @default.
- W3084233102 hasConcept C86803240 @default.
- W3084233102 hasConcept C98763669 @default.
- W3084233102 hasConceptScore W3084233102C105795698 @default.
- W3084233102 hasConceptScore W3084233102C106189395 @default.
- W3084233102 hasConceptScore W3084233102C11413529 @default.
- W3084233102 hasConceptScore W3084233102C126255220 @default.
- W3084233102 hasConceptScore W3084233102C134306372 @default.
- W3084233102 hasConceptScore W3084233102C136356330 @default.
- W3084233102 hasConceptScore W3084233102C138328387 @default.
- W3084233102 hasConceptScore W3084233102C138885662 @default.
- W3084233102 hasConceptScore W3084233102C140479938 @default.
- W3084233102 hasConceptScore W3084233102C144237770 @default.
- W3084233102 hasConceptScore W3084233102C149728462 @default.
- W3084233102 hasConceptScore W3084233102C151730666 @default.
- W3084233102 hasConceptScore W3084233102C155930848 @default.
- W3084233102 hasConceptScore W3084233102C15744967 @default.
- W3084233102 hasConceptScore W3084233102C159886148 @default.
- W3084233102 hasConceptScore W3084233102C162324750 @default.
- W3084233102 hasConceptScore W3084233102C177142836 @default.
- W3084233102 hasConceptScore W3084233102C22171661 @default.
- W3084233102 hasConceptScore W3084233102C2776029896 @default.
- W3084233102 hasConceptScore W3084233102C2777303404 @default.
- W3084233102 hasConceptScore W3084233102C2779343474 @default.
- W3084233102 hasConceptScore W3084233102C2780813799 @default.
- W3084233102 hasConceptScore W3084233102C33923547 @default.
- W3084233102 hasConceptScore W3084233102C41008148 @default.
- W3084233102 hasConceptScore W3084233102C41895202 @default.
- W3084233102 hasConceptScore W3084233102C46814582 @default.
- W3084233102 hasConceptScore W3084233102C50522688 @default.
- W3084233102 hasConceptScore W3084233102C73795354 @default.
- W3084233102 hasConceptScore W3084233102C77805123 @default.
- W3084233102 hasConceptScore W3084233102C86803240 @default.
- W3084233102 hasConceptScore W3084233102C98763669 @default.
- W3084233102 hasFunder F4320310071 @default.
- W3084233102 hasFunder F4320334035 @default.
- W3084233102 hasIssue "9" @default.
- W3084233102 hasLocation W30842331021 @default.
- W3084233102 hasLocation W30842331022 @default.
- W3084233102 hasLocation W30842331023 @default.
- W3084233102 hasOpenAccess W3084233102 @default.
- W3084233102 hasPrimaryLocation W30842331021 @default.
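
Illustrative note on the abstract above: the successive relaxation idea it mentions can be sketched for the model-based case (value iteration, not the paper's Q-learning algorithm). The sketch below is not taken from the paper. It assumes a hypothetical two-state game with two actions per player, a closed-form solver for 2x2 matrix games, and the usual relaxation bound w* = 1 / (1 - gamma * min self-transition probability); all names (`matrix_game_value`, `relaxed_value_iteration`, `r`, `p`, `w_star`) are assumptions made for illustration. Setting w = 1 recovers ordinary min-max value iteration, and because the relaxation term is constant across actions, both settings should share the same fixed point.

```python
def matrix_game_value(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))   # best guaranteed payoff for the row player
    minimax = min(max(a, c), max(b, d))   # best guaranteed cap for the column player
    if maximin >= minimax:                # pure-strategy saddle point exists
        return maximin
    # No saddle point: closed-form mixed-strategy value of a 2x2 game.
    return (a * d - b * c) / (a + d - b - c)


def relaxed_value_iteration(r, p, gamma, w, sweeps=500):
    """Min-max value iteration with relaxation parameter w (w = 1 is the standard scheme).

    r[i][u][x]    : reward in state i for row action u and column action x
    p[i][u][x][j] : probability of moving from state i to state j
    """
    n = len(r)
    v = [0.0] * n
    for _ in range(sweeps):
        new_v = []
        for i in range(n):
            # Relaxed stage game: w * (one-step lookahead) + (1 - w) * current value.
            game = [
                [
                    w * (r[i][u][x] + gamma * sum(p[i][u][x][j] * v[j] for j in range(n)))
                    + (1.0 - w) * v[i]
                    for x in range(2)
                ]
                for u in range(2)
            ]
            new_v.append(matrix_game_value(game))
        v = new_v
    return v


# Hypothetical two-state game (made-up numbers, not from the paper).
r = [
    [[1.0, -1.0], [-1.0, 1.0]],
    [[0.0, 2.0], [3.0, 1.0]],
]
p = [
    [[[0.6, 0.4], [0.5, 0.5]], [[0.5, 0.5], [0.7, 0.3]]],
    [[[0.2, 0.8], [0.5, 0.5]], [[0.4, 0.6], [0.3, 0.7]]],
]
gamma = 0.9

# Assumed relaxation bound: requires every self-transition probability to be positive.
min_self = min(p[i][u][x][i] for i in range(2) for u in range(2) for x in range(2))
w_star = 1.0 / (1.0 - gamma * min_self)

v_standard = relaxed_value_iteration(r, p, gamma, 1.0)
v_relaxed = relaxed_value_iteration(r, p, gamma, w_star)
print(v_standard, v_relaxed)
```

Under these assumptions the relaxed operator is a sup-norm contraction with modulus 1 - w + w * gamma, which is smaller than gamma when w > 1; this is the sense in which the relaxed scheme can be faster.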