Matches in SemOpenAlex for { <https://semopenalex.org/work/W6043852> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W6043852 endingPage "1108" @default.
- W6043852 startingPage "1101" @default.
- W6043852 abstract "Coordinated multi-agent reinforcement learning (MARL) provides a promising approach to scaling learning in large cooperative multi-agent systems. Distributed constraint optimization (DCOP) techniques have been used to coordinate action selection among agents during both the learning phase and the policy execution phase (if learning is off-line) to ensure good overall system performance. However, running DCOP algorithms for each action selection through the whole system results in significant communication among agents, which is not practical for most applications with limited communication bandwidth. In this paper, we develop a learning approach that generalizes previous coordinated MARL approaches that use DCOP algorithms and enables MARL to be conducted over a spectrum from independent learning (without communication) to fully coordinated learning depending on agents' communication bandwidth. Our approach defines an interaction measure that allows agents to dynamically identify their beneficial coordination set (i.e., whom to coordinate with) in different situations and to trade off its performance and communication cost. By limiting their coordination set, agents dynamically decompose the coordination network in a distributed way, resulting in dramatically reduced communication for DCOP algorithms without significantly affecting overall learning performance. Essentially, our learning approach conducts co-adaptation of agents' policy learning and coordination set identification, which outperforms approaches that sequence them." @default.
- W6043852 created "2016-06-24" @default.
- W6043852 creator A5010176958 @default.
- W6043852 creator A5020207982 @default.
- W6043852 date "2013-05-06" @default.
- W6043852 modified "2023-09-26" @default.
- W6043852 title "Coordinating multi-agent reinforcement learning with limited communication" @default.
- W6043852 cites W102626069 @default.
- W6043852 cites W1540755814 @default.
- W6043852 cites W1570690983 @default.
- W6043852 cites W1588304026 @default.
- W6043852 cites W1749432972 @default.
- W6043852 cites W1769870091 @default.
- W6043852 cites W2088956500 @default.
- W6043852 cites W2104602264 @default.
- W6043852 cites W2110906765 @default.
- W6043852 cites W2112632840 @default.
- W6043852 cites W2118318536 @default.
- W6043852 cites W2134779831 @default.
- W6043852 cites W2152430560 @default.
- W6043852 cites W2159142421 @default.
- W6043852 cites W2238976967 @default.
- W6043852 cites W2404646363 @default.
- W6043852 doi "https://doi.org/10.5555/2484920.2485093" @default.
- W6043852 hasPublicationYear "2013" @default.
- W6043852 type Work @default.
- W6043852 sameAs 6043852 @default.
- W6043852 citedByCount "55" @default.
- W6043852 countsByYear W60438522014 @default.
- W6043852 countsByYear W60438522015 @default.
- W6043852 countsByYear W60438522016 @default.
- W6043852 countsByYear W60438522017 @default.
- W6043852 countsByYear W60438522018 @default.
- W6043852 countsByYear W60438522019 @default.
- W6043852 countsByYear W60438522020 @default.
- W6043852 countsByYear W60438522021 @default.
- W6043852 crossrefType "proceedings-article" @default.
- W6043852 hasAuthorship W6043852A5010176958 @default.
- W6043852 hasAuthorship W6043852A5020207982 @default.
- W6043852 hasConcept C119857082 @default.
- W6043852 hasConcept C120314980 @default.
- W6043852 hasConcept C154945302 @default.
- W6043852 hasConcept C166109690 @default.
- W6043852 hasConcept C169760540 @default.
- W6043852 hasConcept C26760741 @default.
- W6043852 hasConcept C41008148 @default.
- W6043852 hasConcept C86803240 @default.
- W6043852 hasConcept C97541855 @default.
- W6043852 hasConceptScore W6043852C119857082 @default.
- W6043852 hasConceptScore W6043852C120314980 @default.
- W6043852 hasConceptScore W6043852C154945302 @default.
- W6043852 hasConceptScore W6043852C166109690 @default.
- W6043852 hasConceptScore W6043852C169760540 @default.
- W6043852 hasConceptScore W6043852C26760741 @default.
- W6043852 hasConceptScore W6043852C41008148 @default.
- W6043852 hasConceptScore W6043852C86803240 @default.
- W6043852 hasConceptScore W6043852C97541855 @default.
- W6043852 hasLocation W60438521 @default.
- W6043852 hasOpenAccess W6043852 @default.
- W6043852 hasPrimaryLocation W60438521 @default.
- W6043852 hasRelatedWork W1515851193 @default.
- W6043852 hasRelatedWork W1542941925 @default.
- W6043852 hasRelatedWork W1560074431 @default.
- W6043852 hasRelatedWork W1641379095 @default.
- W6043852 hasRelatedWork W2099618002 @default.
- W6043852 hasRelatedWork W2107544712 @default.
- W6043852 hasRelatedWork W2110906765 @default.
- W6043852 hasRelatedWork W2118318536 @default.
- W6043852 hasRelatedWork W2121863487 @default.
- W6043852 hasRelatedWork W2134779831 @default.
- W6043852 hasRelatedWork W2145339207 @default.
- W6043852 hasRelatedWork W2173248099 @default.
- W6043852 hasRelatedWork W2404646363 @default.
- W6043852 hasRelatedWork W2756196406 @default.
- W6043852 hasRelatedWork W2962938168 @default.
- W6043852 hasRelatedWork W2963000099 @default.
- W6043852 hasRelatedWork W2963407617 @default.
- W6043852 hasRelatedWork W2963717208 @default.
- W6043852 hasRelatedWork W2964338167 @default.
- W6043852 hasRelatedWork W3093287223 @default.
- W6043852 isParatext "false" @default.
- W6043852 isRetracted "false" @default.
- W6043852 magId "6043852" @default.
- W6043852 workType "article" @default.