Matches in SemOpenAlex for { <https://semopenalex.org/work/W3000554410> ?p ?o ?g. }
- W3000554410 endingPage "11799" @default.
- W3000554410 startingPage "11782" @default.
- W3000554410 abstract "Reinforcement learning (RL) techniques, while often powerful, can suffer from slow learning speeds, particularly in high dimensional spaces or in environments with sparse rewards. The decomposition of tasks into a hierarchical structure holds the potential to significantly speed up learning, generalization, and transfer learning. However, the current task decomposition techniques often cannot extract hierarchical task structures without relying on high-level knowledge provided by an expert (e.g., using dynamic Bayesian networks (DBNs) in factored Markov decision processes), which is not necessarily available in autonomous systems. In this paper, we propose a novel method based on Sequential Association Rule Mining that can extract Hierarchical Structure of Tasks in Reinforcement Learning (SARM-HSTRL) in an autonomous manner for both Markov decision processes (MDPs) and factored MDPs. The proposed method leverages association rule mining to discover the causal and temporal relationships among states in different trajectories and extracts a task hierarchy that captures these relationships among sub-goals as termination conditions of different sub-tasks. We prove that the extracted hierarchical policy offers a hierarchically optimal policy in MDPs and factored MDPs. It should be noted that SARM-HSTRL extracts this hierarchically optimal policy without having dynamic Bayesian networks in scenarios with a single task trajectory and also with multiple tasks' trajectories. Furthermore, we show theoretically and empirically that the extracted hierarchical task structure is consistent with trajectories and provides the most efficient, reliable, and compact structure under appropriate assumptions. The numerical results compare the performance of the proposed SARM-HSTRL method with conventional HRL algorithms in terms of the accuracy in detecting the sub-goals, the validity of the extracted hierarchies, and the speed of learning in several testbeds.
The key capabilities of SARM-HSTRL, including handling multiple tasks and autonomous hierarchical task extraction, can lead to the application of this HRL method in reusing, transferring, and generalizing knowledge across different domains." @default.
- W3000554410 created "2020-01-23" @default.
- W3000554410 creator A5035395012 @default.
- W3000554410 creator A5042116950 @default.
- W3000554410 creator A5070914351 @default.
- W3000554410 date "2020-01-01" @default.
- W3000554410 modified "2023-09-26" @default.
- W3000554410 title "Sequential Association Rule Mining for Autonomously Extracting Hierarchical Task Structures in Reinforcement Learning" @default.
- W3000554410 cites W1505837856 @default.
- W3000554410 cites W1536990779 @default.
- W3000554410 cites W1561843061 @default.
- W3000554410 cites W1574241439 @default.
- W3000554410 cites W1591713425 @default.
- W3000554410 cites W1963873191 @default.
- W3000554410 cites W2010134474 @default.
- W3000554410 cites W2035836571 @default.
- W3000554410 cites W2090170171 @default.
- W3000554410 cites W2105489771 @default.
- W3000554410 cites W2121517924 @default.
- W3000554410 cites W2160808139 @default.
- W3000554410 cites W2161252410 @default.
- W3000554410 cites W2272841450 @default.
- W3000554410 cites W2460299708 @default.
- W3000554410 cites W2497090499 @default.
- W3000554410 cites W2735260544 @default.
- W3000554410 cites W2739330054 @default.
- W3000554410 cites W2963523627 @default.
- W3000554410 cites W3103256699 @default.
- W3000554410 cites W4249441547 @default.
- W3000554410 cites W59183349 @default.
- W3000554410 doi "https://doi.org/10.1109/access.2020.2965930" @default.
- W3000554410 hasPublicationYear "2020" @default.
- W3000554410 type Work @default.
- W3000554410 sameAs 3000554410 @default.
- W3000554410 citedByCount "3" @default.
- W3000554410 countsByYear W30005544102020 @default.
- W3000554410 countsByYear W30005544102021 @default.
- W3000554410 countsByYear W30005544102022 @default.
- W3000554410 crossrefType "journal-article" @default.
- W3000554410 hasAuthorship W3000554410A5035395012 @default.
- W3000554410 hasAuthorship W3000554410A5042116950 @default.
- W3000554410 hasAuthorship W3000554410A5070914351 @default.
- W3000554410 hasBestOaLocation W30005544101 @default.
- W3000554410 hasConcept C105795698 @default.
- W3000554410 hasConcept C106189395 @default.
- W3000554410 hasConcept C119857082 @default.
- W3000554410 hasConcept C121332964 @default.
- W3000554410 hasConcept C124681953 @default.
- W3000554410 hasConcept C1276947 @default.
- W3000554410 hasConcept C134306372 @default.
- W3000554410 hasConcept C13662910 @default.
- W3000554410 hasConcept C154945302 @default.
- W3000554410 hasConcept C159886148 @default.
- W3000554410 hasConcept C162324750 @default.
- W3000554410 hasConcept C177148314 @default.
- W3000554410 hasConcept C187736073 @default.
- W3000554410 hasConcept C18903297 @default.
- W3000554410 hasConcept C193524817 @default.
- W3000554410 hasConcept C2780451532 @default.
- W3000554410 hasConcept C31170391 @default.
- W3000554410 hasConcept C33724603 @default.
- W3000554410 hasConcept C33923547 @default.
- W3000554410 hasConcept C34447519 @default.
- W3000554410 hasConcept C41008148 @default.
- W3000554410 hasConcept C82142266 @default.
- W3000554410 hasConcept C86803240 @default.
- W3000554410 hasConcept C97541855 @default.
- W3000554410 hasConceptScore W3000554410C105795698 @default.
- W3000554410 hasConceptScore W3000554410C106189395 @default.
- W3000554410 hasConceptScore W3000554410C119857082 @default.
- W3000554410 hasConceptScore W3000554410C121332964 @default.
- W3000554410 hasConceptScore W3000554410C124681953 @default.
- W3000554410 hasConceptScore W3000554410C1276947 @default.
- W3000554410 hasConceptScore W3000554410C134306372 @default.
- W3000554410 hasConceptScore W3000554410C13662910 @default.
- W3000554410 hasConceptScore W3000554410C154945302 @default.
- W3000554410 hasConceptScore W3000554410C159886148 @default.
- W3000554410 hasConceptScore W3000554410C162324750 @default.
- W3000554410 hasConceptScore W3000554410C177148314 @default.
- W3000554410 hasConceptScore W3000554410C187736073 @default.
- W3000554410 hasConceptScore W3000554410C18903297 @default.
- W3000554410 hasConceptScore W3000554410C193524817 @default.
- W3000554410 hasConceptScore W3000554410C2780451532 @default.
- W3000554410 hasConceptScore W3000554410C31170391 @default.
- W3000554410 hasConceptScore W3000554410C33724603 @default.
- W3000554410 hasConceptScore W3000554410C33923547 @default.
- W3000554410 hasConceptScore W3000554410C34447519 @default.
- W3000554410 hasConceptScore W3000554410C41008148 @default.
- W3000554410 hasConceptScore W3000554410C82142266 @default.
- W3000554410 hasConceptScore W3000554410C86803240 @default.
- W3000554410 hasConceptScore W3000554410C97541855 @default.
- W3000554410 hasLocation W30005544101 @default.
- W3000554410 hasOpenAccess W3000554410 @default.
- W3000554410 hasPrimaryLocation W30005544101 @default.
- W3000554410 hasRelatedWork W1554015367 @default.
- W3000554410 hasRelatedWork W1606131070 @default.
- W3000554410 hasRelatedWork W1987108535 @default.
- W3000554410 hasRelatedWork W2034866487 @default.