Matches in SemOpenAlex for { <https://semopenalex.org/work/W751028141> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W751028141 endingPage "e0127129" @default.
- W751028141 startingPage "e0127129" @default.
- W751028141 abstract "Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out by the agents concurrently. In this paper we formalize and prove the convergence of a Distributed Round Robin Q-learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out a round-robin scheduling of the action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs which lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent's local state-action value function learning is an independent process, including the MSAV policies. Coordination of locally optimal policies to obtain the global optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots." @default.
- W751028141 created "2016-06-24" @default.
- W751028141 creator A5013959907 @default.
- W751028141 creator A5036851536 @default.
- W751028141 creator A5045365589 @default.
- W751028141 date "2015-07-09" @default.
- W751028141 modified "2023-10-18" @default.
- W751028141 title "Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning" @default.
- W751028141 cites W1515436326 @default.
- W751028141 cites W1972825041 @default.
- W751028141 cites W1973039793 @default.
- W751028141 cites W1985215440 @default.
- W751028141 cites W2001494341 @default.
- W751028141 cites W2032626653 @default.
- W751028141 cites W2041872606 @default.
- W751028141 cites W2049423956 @default.
- W751028141 cites W2074500080 @default.
- W751028141 cites W2099618002 @default.
- W751028141 cites W2107726111 @default.
- W751028141 cites W2108892923 @default.
- W751028141 cites W2153204852 @default.
- W751028141 cites W2623057586 @default.
- W751028141 cites W4214717370 @default.
- W751028141 doi "https://doi.org/10.1371/journal.pone.0127129" @default.
- W751028141 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/4497621" @default.
- W751028141 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/26158587" @default.
- W751028141 hasPublicationYear "2015" @default.
- W751028141 type Work @default.
- W751028141 sameAs 751028141 @default.
- W751028141 citedByCount "15" @default.
- W751028141 countsByYear W7510281412015 @default.
- W751028141 countsByYear W7510281412016 @default.
- W751028141 countsByYear W7510281412018 @default.
- W751028141 countsByYear W7510281412019 @default.
- W751028141 countsByYear W7510281412020 @default.
- W751028141 countsByYear W7510281412022 @default.
- W751028141 countsByYear W7510281412023 @default.
- W751028141 crossrefType "journal-article" @default.
- W751028141 hasAuthorship W751028141A5013959907 @default.
- W751028141 hasAuthorship W751028141A5036851536 @default.
- W751028141 hasAuthorship W751028141A5045365589 @default.
- W751028141 hasBestOaLocation W7510281411 @default.
- W751028141 hasConcept C111030470 @default.
- W751028141 hasConcept C120314980 @default.
- W751028141 hasConcept C126255220 @default.
- W751028141 hasConcept C154945302 @default.
- W751028141 hasConcept C188116033 @default.
- W751028141 hasConcept C206729178 @default.
- W751028141 hasConcept C33923547 @default.
- W751028141 hasConcept C41008148 @default.
- W751028141 hasConcept C97541855 @default.
- W751028141 hasConceptScore W751028141C111030470 @default.
- W751028141 hasConceptScore W751028141C120314980 @default.
- W751028141 hasConceptScore W751028141C126255220 @default.
- W751028141 hasConceptScore W751028141C154945302 @default.
- W751028141 hasConceptScore W751028141C188116033 @default.
- W751028141 hasConceptScore W751028141C206729178 @default.
- W751028141 hasConceptScore W751028141C33923547 @default.
- W751028141 hasConceptScore W751028141C41008148 @default.
- W751028141 hasConceptScore W751028141C97541855 @default.
- W751028141 hasIssue "7" @default.
- W751028141 hasLocation W7510281411 @default.
- W751028141 hasLocation W7510281412 @default.
- W751028141 hasLocation W7510281413 @default.
- W751028141 hasLocation W7510281414 @default.
- W751028141 hasLocation W7510281415 @default.
- W751028141 hasLocation W7510281416 @default.
- W751028141 hasOpenAccess W751028141 @default.
- W751028141 hasPrimaryLocation W7510281411 @default.
- W751028141 hasRelatedWork W1583080569 @default.
- W751028141 hasRelatedWork W1882733036 @default.
- W751028141 hasRelatedWork W1992741870 @default.
- W751028141 hasRelatedWork W2109998134 @default.
- W751028141 hasRelatedWork W2140924946 @default.
- W751028141 hasRelatedWork W2160425906 @default.
- W751028141 hasRelatedWork W2546696010 @default.
- W751028141 hasRelatedWork W3154622627 @default.
- W751028141 hasRelatedWork W4206669594 @default.
- W751028141 hasRelatedWork W4382050307 @default.
- W751028141 hasVolume "10" @default.
- W751028141 isParatext "false" @default.
- W751028141 isRetracted "false" @default.
- W751028141 magId "751028141" @default.
- W751028141 workType "article" @default.