Matches in SemOpenAlex for { <https://semopenalex.org/work/W1490782493> ?p ?o ?g. }
- W1490782493 endingPage "292" @default.
- W1490782493 startingPage "283" @default.
- W1490782493 abstract "This paper investigates value function approximation in the context of zero-sum Markov games, which can be viewed as a generalization of the Markov decision process (MDP) framework to the two-agent case. We generalize error bounds from MDPs to Markov games and describe generalizations of reinforcement learning algorithms to Markov games. We present a generalization of the optimal stopping problem to a two-player simultaneous-move Markov game. For this special problem, we provide stronger bounds and can guarantee convergence for LSTD and temporal difference learning with linear value function approximation. We demonstrate the viability of value function approximation for Markov games by using the least-squares policy iteration (LSPI) algorithm to learn good policies for a soccer domain and a flow control problem." @default.
- W1490782493 created "2016-06-24" @default.
- W1490782493 creator A5021682235 @default.
- W1490782493 creator A5084693234 @default.
- W1490782493 date "2002-08-01" @default.
- W1490782493 modified "2023-09-23" @default.
- W1490782493 title "Value function approximation in zero-sum Markov games" @default.
- W1490782493 cites W1502893368 @default.
- W1490782493 cites W1512407760 @default.
- W1490782493 cites W1542941925 @default.
- W1490782493 cites W1547105496 @default.
- W1490782493 cites W1576452626 @default.
- W1490782493 cites W2072931156 @default.
- W1490782493 cites W2089415692 @default.
- W1490782493 cites W2099833070 @default.
- W1490782493 cites W2100677568 @default.
- W1490782493 cites W2102562355 @default.
- W1490782493 cites W2914474525 @default.
- W1490782493 cites W361876 @default.
- W1490782493 hasPublicationYear "2002" @default.
- W1490782493 type Work @default.
- W1490782493 sameAs 1490782493 @default.
- W1490782493 citedByCount "33" @default.
- W1490782493 countsByYear W14907824932013 @default.
- W1490782493 countsByYear W14907824932015 @default.
- W1490782493 countsByYear W14907824932016 @default.
- W1490782493 countsByYear W14907824932017 @default.
- W1490782493 countsByYear W14907824932019 @default.
- W1490782493 countsByYear W14907824932020 @default.
- W1490782493 countsByYear W14907824932021 @default.
- W1490782493 crossrefType "proceedings-article" @default.
- W1490782493 hasAuthorship W1490782493A5021682235 @default.
- W1490782493 hasAuthorship W1490782493A5084693234 @default.
- W1490782493 hasConcept C105795698 @default.
- W1490782493 hasConcept C106189395 @default.
- W1490782493 hasConcept C119857082 @default.
- W1490782493 hasConcept C126255220 @default.
- W1490782493 hasConcept C134306372 @default.
- W1490782493 hasConcept C136356330 @default.
- W1490782493 hasConcept C14646407 @default.
- W1490782493 hasConcept C148764684 @default.
- W1490782493 hasConcept C151730666 @default.
- W1490782493 hasConcept C154945302 @default.
- W1490782493 hasConcept C159886148 @default.
- W1490782493 hasConcept C162324750 @default.
- W1490782493 hasConcept C163836022 @default.
- W1490782493 hasConcept C17098449 @default.
- W1490782493 hasConcept C177148314 @default.
- W1490782493 hasConcept C188116033 @default.
- W1490782493 hasConcept C196340769 @default.
- W1490782493 hasConcept C2777303404 @default.
- W1490782493 hasConcept C2779343474 @default.
- W1490782493 hasConcept C33923547 @default.
- W1490782493 hasConcept C41008148 @default.
- W1490782493 hasConcept C46814582 @default.
- W1490782493 hasConcept C50522688 @default.
- W1490782493 hasConcept C86803240 @default.
- W1490782493 hasConcept C97541855 @default.
- W1490782493 hasConcept C98763669 @default.
- W1490782493 hasConceptScore W1490782493C105795698 @default.
- W1490782493 hasConceptScore W1490782493C106189395 @default.
- W1490782493 hasConceptScore W1490782493C119857082 @default.
- W1490782493 hasConceptScore W1490782493C126255220 @default.
- W1490782493 hasConceptScore W1490782493C134306372 @default.
- W1490782493 hasConceptScore W1490782493C136356330 @default.
- W1490782493 hasConceptScore W1490782493C14646407 @default.
- W1490782493 hasConceptScore W1490782493C148764684 @default.
- W1490782493 hasConceptScore W1490782493C151730666 @default.
- W1490782493 hasConceptScore W1490782493C154945302 @default.
- W1490782493 hasConceptScore W1490782493C159886148 @default.
- W1490782493 hasConceptScore W1490782493C162324750 @default.
- W1490782493 hasConceptScore W1490782493C163836022 @default.
- W1490782493 hasConceptScore W1490782493C17098449 @default.
- W1490782493 hasConceptScore W1490782493C177148314 @default.
- W1490782493 hasConceptScore W1490782493C188116033 @default.
- W1490782493 hasConceptScore W1490782493C196340769 @default.
- W1490782493 hasConceptScore W1490782493C2777303404 @default.
- W1490782493 hasConceptScore W1490782493C2779343474 @default.
- W1490782493 hasConceptScore W1490782493C33923547 @default.
- W1490782493 hasConceptScore W1490782493C41008148 @default.
- W1490782493 hasConceptScore W1490782493C46814582 @default.
- W1490782493 hasConceptScore W1490782493C50522688 @default.
- W1490782493 hasConceptScore W1490782493C86803240 @default.
- W1490782493 hasConceptScore W1490782493C97541855 @default.
- W1490782493 hasConceptScore W1490782493C98763669 @default.
- W1490782493 hasLocation W14907824931 @default.
- W1490782493 hasOpenAccess W1490782493 @default.
- W1490782493 hasPrimaryLocation W14907824931 @default.
- W1490782493 hasRelatedWork W1513468570 @default.
- W1490782493 hasRelatedWork W1515851193 @default.
- W1490782493 hasRelatedWork W1519783625 @default.
- W1490782493 hasRelatedWork W1542941925 @default.
- W1490782493 hasRelatedWork W1788877992 @default.
- W1490782493 hasRelatedWork W1973039793 @default.
- W1490782493 hasRelatedWork W2119567691 @default.
- W1490782493 hasRelatedWork W2120846115 @default.
- W1490782493 hasRelatedWork W2121863487 @default.
- W1490782493 hasRelatedWork W2130005627 @default.