Matches in SemOpenAlex for { <https://semopenalex.org/work/W2898567809> ?p ?o ?g. }
- W2898567809 abstract "Cooperative multi-agent reinforcement learning often requires decentralised policies, which severely limit the agents' ability to coordinate their behaviour. In this paper, we show that common knowledge between agents allows for complex decentralised coordination. Common knowledge arises naturally in a large number of decentralised cooperative multi-agent tasks, for example, when agents can reconstruct parts of each other's observations. Since agents can independently agree on their common knowledge, they can execute complex coordinated policies that condition on this knowledge in a fully decentralised fashion. We propose multi-agent common knowledge reinforcement learning (MACKRL), a novel stochastic actor-critic algorithm that learns a hierarchical policy tree. Higher levels in the hierarchy coordinate groups of agents by conditioning on their common knowledge, or delegate to lower levels with smaller subgroups but potentially richer common knowledge. The entire policy tree can be executed in a fully decentralised fashion. As the lowest policy tree level consists of independent policies for each agent, MACKRL reduces to independently learnt decentralised policies as a special case. We demonstrate that our method can exploit common knowledge for superior performance on complex decentralised coordination tasks, including a stochastic matrix game and challenging problems in StarCraft II unit micromanagement." @default.
- W2898567809 created "2018-11-02" @default.
- W2898567809 creator A5016496356 @default.
- W2898567809 creator A5038213642 @default.
- W2898567809 creator A5042899882 @default.
- W2898567809 creator A5045468603 @default.
- W2898567809 creator A5056879203 @default.
- W2898567809 creator A5059094093 @default.
- W2898567809 date "2018-10-27" @default.
- W2898567809 modified "2023-10-01" @default.
- W2898567809 title "Multi-Agent Common Knowledge Reinforcement Learning" @default.
- W2898567809 cites W1484740474 @default.
- W2898567809 cites W1506769828 @default.
- W2898567809 cites W1560400919 @default.
- W2898567809 cites W1606829969 @default.
- W2898567809 cites W1641379095 @default.
- W2898567809 cites W1968340857 @default.
- W2898567809 cites W1982678075 @default.
- W2898567809 cites W1991888757 @default.
- W2898567809 cites W2012812921 @default.
- W2898567809 cites W2022205872 @default.
- W2898567809 cites W2049916932 @default.
- W2898567809 cites W2096622112 @default.
- W2898567809 cites W2099618002 @default.
- W2898567809 cites W2104602264 @default.
- W2898567809 cites W2108684595 @default.
- W2898567809 cites W2109100253 @default.
- W2898567809 cites W2112794046 @default.
- W2898567809 cites W2119717200 @default.
- W2898567809 cites W2121517924 @default.
- W2898567809 cites W2134779831 @default.
- W2898567809 cites W2138965998 @default.
- W2898567809 cites W2139993574 @default.
- W2898567809 cites W2142880189 @default.
- W2898567809 cites W2155027007 @default.
- W2898567809 cites W2156737235 @default.
- W2898567809 cites W2159182476 @default.
- W2898567809 cites W2235512564 @default.
- W2898567809 cites W2292533394 @default.
- W2898567809 cites W2294192315 @default.
- W2898567809 cites W2395575420 @default.
- W2898567809 cites W2518713116 @default.
- W2898567809 cites W2547416798 @default.
- W2898567809 cites W2549542446 @default.
- W2898567809 cites W2560310599 @default.
- W2898567809 cites W2562637642 @default.
- W2898567809 cites W2592798481 @default.
- W2898567809 cites W2604283518 @default.
- W2898567809 cites W2623431351 @default.
- W2898567809 cites W2626637010 @default.
- W2898567809 cites W2736465728 @default.
- W2898567809 cites W2749807327 @default.
- W2898567809 cites W2756196406 @default.
- W2898567809 cites W2768629321 @default.
- W2898567809 cites W2781238083 @default.
- W2898567809 cites W2793398421 @default.
- W2898567809 cites W2794643322 @default.
- W2898567809 cites W2806824292 @default.
- W2898567809 cites W2913964242 @default.
- W2898567809 cites W2949267040 @default.
- W2898567809 cites W2950633365 @default.
- W2898567809 cites W2951896791 @default.
- W2898567809 cites W2951984055 @default.
- W2898567809 cites W2952606116 @default.
- W2898567809 cites W2962938168 @default.
- W2898567809 cites W2963000099 @default.
- W2898567809 cites W2963201472 @default.
- W2898567809 cites W2963502082 @default.
- W2898567809 cites W2964338167 @default.
- W2898567809 cites W2966896140 @default.
- W2898567809 cites W3093287223 @default.
- W2898567809 cites W56117469 @default.
- W2898567809 hasPublicationYear "2018" @default.
- W2898567809 type Work @default.
- W2898567809 sameAs 2898567809 @default.
- W2898567809 citedByCount "8" @default.
- W2898567809 countsByYear W28985678092018 @default.
- W2898567809 countsByYear W28985678092019 @default.
- W2898567809 countsByYear W28985678092020 @default.
- W2898567809 crossrefType "posted-content" @default.
- W2898567809 hasAuthorship W2898567809A5016496356 @default.
- W2898567809 hasAuthorship W2898567809A5038213642 @default.
- W2898567809 hasAuthorship W2898567809A5042899882 @default.
- W2898567809 hasAuthorship W2898567809A5045468603 @default.
- W2898567809 hasAuthorship W2898567809A5056879203 @default.
- W2898567809 hasAuthorship W2898567809A5059094093 @default.
- W2898567809 hasConcept C102993220 @default.
- W2898567809 hasConcept C107247716 @default.
- W2898567809 hasConcept C113174947 @default.
- W2898567809 hasConcept C134306372 @default.
- W2898567809 hasConcept C143273055 @default.
- W2898567809 hasConcept C154945302 @default.
- W2898567809 hasConcept C162324750 @default.
- W2898567809 hasConcept C199360897 @default.
- W2898567809 hasConcept C203659156 @default.
- W2898567809 hasConcept C31084985 @default.
- W2898567809 hasConcept C31170391 @default.
- W2898567809 hasConcept C33923547 @default.
- W2898567809 hasConcept C34447519 @default.
- W2898567809 hasConcept C41008148 @default.