Matches in SemOpenAlex for { <https://semopenalex.org/work/W3093732151> ?p ?o ?g. }
- W3093732151 abstract "In this work, we study a system of two interacting non-cooperative Q-learning agents, where one agent has the privilege of observing the other's actions. We show that this information asymmetry can lead to a stable outcome of population learning, which generally does not occur in an environment of general independent learners. The resulting post-learning policies are almost optimal in the underlying game sense, i.e., they form a Nash equilibrium. Furthermore, we propose a Q-learning algorithm that requires predictive observation of two subsequent actions of the opponent and yields an optimal strategy provided that the opponent applies a stationary strategy, and we discuss the existence of a Nash equilibrium in the underlying information-asymmetric game." @default.
- W3093732151 created "2020-10-29" @default.
- W3093732151 creator A5000732219 @default.
- W3093732151 creator A5017324775 @default.
- W3093732151 creator A5056019047 @default.
- W3093732151 date "2020-10-21" @default.
- W3093732151 modified "2023-09-27" @default.
- W3093732151 title "On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality." @default.
- W3093732151 cites W1030237925 @default.
- W3093732151 cites W1542941925 @default.
- W3093732151 cites W1592657892 @default.
- W3093732151 cites W1605188341 @default.
- W3093732151 cites W1641379095 @default.
- W3093732151 cites W1907356258 @default.
- W3093732151 cites W2011971614 @default.
- W3093732151 cites W2047399268 @default.
- W3093732151 cites W2056666371 @default.
- W3093732151 cites W2070963703 @default.
- W3093732151 cites W2077611535 @default.
- W3093732151 cites W2099618002 @default.
- W3093732151 cites W2103437045 @default.
- W3093732151 cites W2104602264 @default.
- W3093732151 cites W2107726111 @default.
- W3093732151 cites W2109401393 @default.
- W3093732151 cites W2119567691 @default.
- W3093732151 cites W2120846115 @default.
- W3093732151 cites W2121863487 @default.
- W3093732151 cites W2123117904 @default.
- W3093732151 cites W2134243390 @default.
- W3093732151 cites W2151699494 @default.
- W3093732151 cites W2159813604 @default.
- W3093732151 cites W2169747811 @default.
- W3093732151 cites W2194892124 @default.
- W3093732151 cites W2313875766 @default.
- W3093732151 cites W2402328699 @default.
- W3093732151 cites W2490752264 @default.
- W3093732151 cites W2552014313 @default.
- W3093732151 cites W2575731723 @default.
- W3093732151 cites W2620378547 @default.
- W3093732151 cites W2739826048 @default.
- W3093732151 cites W2766614170 @default.
- W3093732151 cites W2891171329 @default.
- W3093732151 cites W2907907364 @default.
- W3093732151 cites W2947610068 @default.
- W3093732151 cites W2962792006 @default.
- W3093732151 cites W2962990479 @default.
- W3093732151 cites W2963688947 @default.
- W3093732151 cites W2963939962 @default.
- W3093732151 cites W2971610430 @default.
- W3093732151 cites W2972782056 @default.
- W3093732151 cites W2991046523 @default.
- W3093732151 cites W2997581807 @default.
- W3093732151 cites W3000818637 @default.
- W3093732151 cites W3006109470 @default.
- W3093732151 cites W3015983928 @default.
- W3093732151 cites W3028158896 @default.
- W3093732151 cites W3094897513 @default.
- W3093732151 cites W3111119344 @default.
- W3093732151 cites W3114937401 @default.
- W3093732151 cites W609261231 @default.
- W3093732151 cites W3023116600 @default.
- W3093732151 hasPublicationYear "2020" @default.
- W3093732151 type Work @default.
- W3093732151 sameAs 3093732151 @default.
- W3093732151 citedByCount "0" @default.
- W3093732151 crossrefType "posted-content" @default.
- W3093732151 hasAuthorship W3093732151A5000732219 @default.
- W3093732151 hasAuthorship W3093732151A5017324775 @default.
- W3093732151 hasAuthorship W3093732151A5056019047 @default.
- W3093732151 hasConcept C113336015 @default.
- W3093732151 hasConcept C137577040 @default.
- W3093732151 hasConcept C138885662 @default.
- W3093732151 hasConcept C141824439 @default.
- W3093732151 hasConcept C144024400 @default.
- W3093732151 hasConcept C144237770 @default.
- W3093732151 hasConcept C145071142 @default.
- W3093732151 hasConcept C148220186 @default.
- W3093732151 hasConcept C149923435 @default.
- W3093732151 hasConcept C154945302 @default.
- W3093732151 hasConcept C162324750 @default.
- W3093732151 hasConcept C163630976 @default.
- W3093732151 hasConcept C164407509 @default.
- W3093732151 hasConcept C175444787 @default.
- W3093732151 hasConcept C177142836 @default.
- W3093732151 hasConcept C202556891 @default.
- W3093732151 hasConcept C22171661 @default.
- W3093732151 hasConcept C2777303404 @default.
- W3093732151 hasConcept C2779954242 @default.
- W3093732151 hasConcept C2908647359 @default.
- W3093732151 hasConcept C32407928 @default.
- W3093732151 hasConcept C33923547 @default.
- W3093732151 hasConcept C41008148 @default.
- W3093732151 hasConcept C41895202 @default.
- W3093732151 hasConcept C46814582 @default.
- W3093732151 hasConcept C50522688 @default.
- W3093732151 hasConcept C97541855 @default.
- W3093732151 hasConceptScore W3093732151C113336015 @default.
- W3093732151 hasConceptScore W3093732151C137577040 @default.
- W3093732151 hasConceptScore W3093732151C138885662 @default.
- W3093732151 hasConceptScore W3093732151C141824439 @default.