Matches in SemOpenAlex for { <https://semopenalex.org/work/W4382045768> ?p ?o ?g. }
Showing items 1 to 97 of 97, with 100 items per page.
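The triple pattern in the header can be rerun as an ordinary SPARQL query. Below is a minimal sketch in Python using SPARQLWrapper, assuming a public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql (the endpoint URL is an assumption, not part of this listing); the graph variable ?g from the pattern is dropped for simplicity.

```python
# Minimal sketch: list all (predicate, object) pairs for W4382045768.
# Assumes the SemOpenAlex SPARQL endpoint is reachable at the URL below
# and that the SPARQLWrapper package is installed (pip install sparqlwrapper).
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://semopenalex.org/sparql"  # assumed endpoint URL
WORK = "https://semopenalex.org/work/W4382045768"

query = f"""
SELECT ?p ?o WHERE {{
  <{WORK}> ?p ?o .
}}
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    # Each binding corresponds to one "- W4382045768 <p> <o>" row in the listing below.
    print(binding["p"]["value"], binding["o"]["value"])
```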
- W4382045768 endingPage "76" @default.
- W4382045768 startingPage "59" @default.
- W4382045768 abstract "Multi-Agent Reinforcement Learning (MARL) has been used to solve sequential decision problems by a collection of intelligent agents interacting in a shared environment. However, the design complexity of MARL strategies increases with the complexity of the task specifications. In addition, current MARL approaches suffer from slow convergence and reward sparsity when dealing with multi-task specifications. Linear temporal logic works as one of the software engineering practices to describe non-Markovian task specifications, whose synthesized strategies can be used as a priori knowledge to train the multi-agents to interact with the environment more efficiently. In this paper, we consider multi-agents that react to each other with a high-level reactive temporal logic specification called Generalized Reactivity of rank 1 (GR(1)). We first decompose the synthesized strategy of GR(1) into a set of potential-based reward machines for individual agents. We prove that the parallel composition of the reward machines forward simulates the original reward machine, which satisfies the GR(1) specification. We then extend the Markov Decision Process (MDP) with the synchronized reward machines. A value-iteration-based approach is developed to compute the potential values of the reward machine based on the strategy structure. We also propose a decentralized Q-learning algorithm to train the multi-agents with the extended MDP. Experiments on multi-agent learning under different reactive temporal logic specifications demonstrate the effectiveness of the proposed method, showing a superior learning curve and optimal rewards." @default.
- W4382045768 created "2023-06-27" @default.
- W4382045768 creator A5002661071 @default.
- W4382045768 creator A5006165246 @default.
- W4382045768 creator A5042268563 @default.
- W4382045768 creator A5053294704 @default.
- W4382045768 date "2023-01-01" @default.
- W4382045768 modified "2023-10-16" @default.
- W4382045768 title "Decomposing Synthesized Strategies for Reactive Multi-agent Reinforcement Learning" @default.
- W4382045768 cites W1531321671 @default.
- W4382045768 cites W1641379095 @default.
- W4382045768 cites W2016206563 @default.
- W4382045768 cites W2023808162 @default.
- W4382045768 cites W2059470663 @default.
- W4382045768 cites W2082784990 @default.
- W4382045768 cites W2107726111 @default.
- W4382045768 cites W2157807654 @default.
- W4382045768 cites W2328819335 @default.
- W4382045768 cites W2509841691 @default.
- W4382045768 cites W2895196950 @default.
- W4382045768 cites W2913326990 @default.
- W4382045768 cites W2931553127 @default.
- W4382045768 cites W2966537673 @default.
- W4382045768 cites W2999973955 @default.
- W4382045768 cites W3011250830 @default.
- W4382045768 cites W3023554972 @default.
- W4382045768 cites W3092156990 @default.
- W4382045768 cites W3156295478 @default.
- W4382045768 cites W3213843225 @default.
- W4382045768 cites W32403112 @default.
- W4382045768 cites W4297374029 @default.
- W4382045768 cites W4302774970 @default.
- W4382045768 cites W4309013619 @default.
- W4382045768 cites W4309626784 @default.
- W4382045768 cites W4321062009 @default.
- W4382045768 doi "https://doi.org/10.1007/978-3-031-35257-7_4" @default.
- W4382045768 hasPublicationYear "2023" @default.
- W4382045768 type Work @default.
- W4382045768 citedByCount "1" @default.
- W4382045768 countsByYear W43820457682023 @default.
- W4382045768 crossrefType "book-chapter" @default.
- W4382045768 hasAuthorship W4382045768A5002661071 @default.
- W4382045768 hasAuthorship W4382045768A5006165246 @default.
- W4382045768 hasAuthorship W4382045768A5042268563 @default.
- W4382045768 hasAuthorship W4382045768A5053294704 @default.
- W4382045768 hasConcept C105795698 @default.
- W4382045768 hasConcept C106189395 @default.
- W4382045768 hasConcept C111472728 @default.
- W4382045768 hasConcept C119857082 @default.
- W4382045768 hasConcept C138885662 @default.
- W4382045768 hasConcept C154945302 @default.
- W4382045768 hasConcept C159886148 @default.
- W4382045768 hasConcept C162324750 @default.
- W4382045768 hasConcept C177264268 @default.
- W4382045768 hasConcept C187736073 @default.
- W4382045768 hasConcept C199360897 @default.
- W4382045768 hasConcept C2780451532 @default.
- W4382045768 hasConcept C33923547 @default.
- W4382045768 hasConcept C41008148 @default.
- W4382045768 hasConcept C75553542 @default.
- W4382045768 hasConcept C80444323 @default.
- W4382045768 hasConcept C97541855 @default.
- W4382045768 hasConceptScore W4382045768C105795698 @default.
- W4382045768 hasConceptScore W4382045768C106189395 @default.
- W4382045768 hasConceptScore W4382045768C111472728 @default.
- W4382045768 hasConceptScore W4382045768C119857082 @default.
- W4382045768 hasConceptScore W4382045768C138885662 @default.
- W4382045768 hasConceptScore W4382045768C154945302 @default.
- W4382045768 hasConceptScore W4382045768C159886148 @default.
- W4382045768 hasConceptScore W4382045768C162324750 @default.
- W4382045768 hasConceptScore W4382045768C177264268 @default.
- W4382045768 hasConceptScore W4382045768C187736073 @default.
- W4382045768 hasConceptScore W4382045768C199360897 @default.
- W4382045768 hasConceptScore W4382045768C2780451532 @default.
- W4382045768 hasConceptScore W4382045768C33923547 @default.
- W4382045768 hasConceptScore W4382045768C41008148 @default.
- W4382045768 hasConceptScore W4382045768C75553542 @default.
- W4382045768 hasConceptScore W4382045768C80444323 @default.
- W4382045768 hasConceptScore W4382045768C97541855 @default.
- W4382045768 hasLocation W43820457681 @default.
- W4382045768 hasOpenAccess W4382045768 @default.
- W4382045768 hasPrimaryLocation W43820457681 @default.
- W4382045768 hasRelatedWork W2101748387 @default.
- W4382045768 hasRelatedWork W2949964922 @default.
- W4382045768 hasRelatedWork W2952448454 @default.
- W4382045768 hasRelatedWork W3074294383 @default.
- W4382045768 hasRelatedWork W3096874164 @default.
- W4382045768 hasRelatedWork W3171755056 @default.
- W4382045768 hasRelatedWork W3201878770 @default.
- W4382045768 hasRelatedWork W4226437174 @default.
- W4382045768 hasRelatedWork W4287123794 @default.
- W4382045768 hasRelatedWork W4319083788 @default.
- W4382045768 isParatext "false" @default.
- W4382045768 isRetracted "false" @default.
- W4382045768 workType "book-chapter" @default.