Matches in SemOpenAlex for { <https://semopenalex.org/work/W361876> ?p ?o ?g. }
- W361876 endingPage "1026" @default.
- W361876 startingPage "1021" @default.
- W361876 abstract "This paper investigates the problem of policy learning in multiagent environments using the stochastic game framework, which we briefly overview. We introduce two properties as desirable for a learning agent when in the presence of other learning agents, namely rationality and convergence. We examine existing reinforcement learning algorithms according to these two properties and notice that they fail to simultaneously meet both criteria. We then contribute a new learning algorithm, WoLF policy hillclimbing, that is based on a simple principle: “learn quickly while losing, slowly while winning.” The algorithm is proven to be rational and we present empirical results for a number of stochastic games showing the algorithm converges." @default.
- W361876 created "2016-06-24" @default.
- W361876 creator A5081163135 @default.
- W361876 creator A5088276691 @default.
- W361876 date "2001-08-04" @default.
- W361876 modified "2023-09-25" @default.
- W361876 title "Rational and convergent learning in stochastic games" @default.
- W361876 cites W1513468570 @default.
- W361876 cites W1515851193 @default.
- W361876 cites W1530487017 @default.
- W361876 cites W1542941925 @default.
- W361876 cites W1607392272 @default.
- W361876 cites W1907356258 @default.
- W361876 cites W1967608200 @default.
- W361876 cites W2011845369 @default.
- W361876 cites W2067050450 @default.
- W361876 cites W2104602264 @default.
- W361876 cites W2147492008 @default.
- W361876 cites W2150339816 @default.
- W361876 cites W2160067530 @default.
- W361876 cites W2164056559 @default.
- W361876 cites W2575731723 @default.
- W361876 cites W2797585760 @default.
- W361876 cites W2911283634 @default.
- W361876 cites W2914656440 @default.
- W361876 hasPublicationYear "2001" @default.
- W361876 type Work @default.
- W361876 sameAs 361876 @default.
- W361876 citedByCount "166" @default.
- W361876 countsByYear W3618762012 @default.
- W361876 countsByYear W3618762013 @default.
- W361876 countsByYear W3618762014 @default.
- W361876 countsByYear W3618762015 @default.
- W361876 countsByYear W3618762016 @default.
- W361876 countsByYear W3618762017 @default.
- W361876 countsByYear W3618762018 @default.
- W361876 countsByYear W3618762019 @default.
- W361876 countsByYear W3618762020 @default.
- W361876 countsByYear W3618762021 @default.
- W361876 crossrefType "proceedings-article" @default.
- W361876 hasAuthorship W361876A5081163135 @default.
- W361876 hasAuthorship W361876A5088276691 @default.
- W361876 hasConcept C111472728 @default.
- W361876 hasConcept C119857082 @default.
- W361876 hasConcept C126255220 @default.
- W361876 hasConcept C138885662 @default.
- W361876 hasConcept C154945302 @default.
- W361876 hasConcept C162324750 @default.
- W361876 hasConcept C17744445 @default.
- W361876 hasConcept C199539241 @default.
- W361876 hasConcept C201717286 @default.
- W361876 hasConcept C2777303404 @default.
- W361876 hasConcept C2779913896 @default.
- W361876 hasConcept C2780586882 @default.
- W361876 hasConcept C33923547 @default.
- W361876 hasConcept C41008148 @default.
- W361876 hasConcept C50522688 @default.
- W361876 hasConcept C58694771 @default.
- W361876 hasConcept C97541855 @default.
- W361876 hasConceptScore W361876C111472728 @default.
- W361876 hasConceptScore W361876C119857082 @default.
- W361876 hasConceptScore W361876C126255220 @default.
- W361876 hasConceptScore W361876C138885662 @default.
- W361876 hasConceptScore W361876C154945302 @default.
- W361876 hasConceptScore W361876C162324750 @default.
- W361876 hasConceptScore W361876C17744445 @default.
- W361876 hasConceptScore W361876C199539241 @default.
- W361876 hasConceptScore W361876C201717286 @default.
- W361876 hasConceptScore W361876C2777303404 @default.
- W361876 hasConceptScore W361876C2779913896 @default.
- W361876 hasConceptScore W361876C2780586882 @default.
- W361876 hasConceptScore W361876C33923547 @default.
- W361876 hasConceptScore W361876C41008148 @default.
- W361876 hasConceptScore W361876C50522688 @default.
- W361876 hasConceptScore W361876C58694771 @default.
- W361876 hasConceptScore W361876C97541855 @default.
- W361876 hasLocation W3618761 @default.
- W361876 hasOpenAccess W361876 @default.
- W361876 hasPrimaryLocation W3618761 @default.
- W361876 hasRelatedWork W1513468570 @default.
- W361876 hasRelatedWork W1519783625 @default.
- W361876 hasRelatedWork W1542941925 @default.
- W361876 hasRelatedWork W1557517019 @default.
- W361876 hasRelatedWork W1605188341 @default.
- W361876 hasRelatedWork W1607392272 @default.
- W361876 hasRelatedWork W1641379095 @default.
- W361876 hasRelatedWork W2099618002 @default.
- W361876 hasRelatedWork W2103437045 @default.
- W361876 hasRelatedWork W2103561211 @default.
- W361876 hasRelatedWork W2104602264 @default.
- W361876 hasRelatedWork W2107726111 @default.
- W361876 hasRelatedWork W2109100253 @default.
- W361876 hasRelatedWork W2120327309 @default.
- W361876 hasRelatedWork W2120846115 @default.
- W361876 hasRelatedWork W2121863487 @default.
- W361876 hasRelatedWork W2138362680 @default.
- W361876 hasRelatedWork W2145339207 @default.
- W361876 hasRelatedWork W2164637474 @default.
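
Each entry above follows the shape `- subject predicate object @graph.`, mirroring the `?p ?o ?g` variables in the pattern on the first line. As a minimal sketch only, assuming that shape holds for every entry, the lines can be split into tuples with a regular expression. The names `Match`, `LINE_RE`, and `parse_match` are hypothetical helpers introduced for illustration and are not part of SemOpenAlex or any library.

```python
import re
from typing import NamedTuple, Optional


class Match(NamedTuple):
    """One listed entry: subject, predicate, object, and named graph."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed line shape (inferred from the listing, not an official grammar):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\w+)\.$')


def parse_match(line: str) -> Optional[Match]:
    """Parse one listing line; return None for lines that do not fit the shape."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Literal values are shown in double quotes; strip them for convenience.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return Match(subject, predicate, obj, graph)


# Usage: a few lines copied from the listing above.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W361876> ?p ?o ?g. }',
    '- W361876 endingPage "1026" @default.',
    '- W361876 title "Rational and convergent learning in stochastic games" @default.',
    '- W361876 cites W1513468570 @default.',
    '- W361876 cites W1515851193 @default.',
]
rows = [r for r in map(parse_match, sample) if r is not None]
cited = [r.obj for r in rows if r.predicate == "cites"]
```

Here the header line is skipped because it does not start with `- `, and `cited` collects the object of every `cites` entry. The greedy object group is a deliberate choice so that quoted values containing spaces, such as the title or abstract, stay in one field.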