Matches in SemOpenAlex for { <https://semopenalex.org/work/W2808492790> ?p ?o ?g. }
- W2808492790 abstract "In complex environments, the learning efficiency of reinforcement learning methods decreases when problems involve large-scale or continuous spaces, which can cause the well-known curse of dimensionality. To address this problem and improve learning efficiency, this paper introduces a framework of sample aggregation based on the Chinese restaurant process (CRP), named FSA-CRP, to cluster experience samples, each represented as a quadruple of the current state, action, next state, and obtained reward. The proposed algorithm uses a similarity estimation method, the MinHash method, to calculate the similarity between samples. In addition, an experience-sharing Dyna learning algorithm based on a sample/cluster prediction method is proposed. While an agent learns the value function of the current state, it acquires the clustering results, and the value function of the sample is merged with the original to form the updated value function of the cluster. In indirect learning (planning) for Dyna-Q, the learning agent searches the most likely branches of the constructed FSA-CRP model to raise learning efficiency. These branches are selected by an improved action/sample selection algorithm, which uses the probability that a sample appears in a cluster to select simulated experiences for indirect learning. To verify the validity and applicability of the proposed method, experiments are conducted on a simulated maze and a cart-pole system. The results demonstrate that the proposed method can effectively accelerate the learning process." @default.
- W2808492790 created "2018-06-21" @default.
- W2808492790 creator A5008475651 @default.
- W2808492790 creator A5013665827 @default.
- W2808492790 creator A5014247469 @default.
- W2808492790 creator A5049492183 @default.
- W2808492790 creator A5061189209 @default.
- W2808492790 creator A5084917118 @default.
- W2808492790 date "2018-01-01" @default.
- W2808492790 modified "2023-10-03" @default.
- W2808492790 title "A Sample Aggregation Approach to Experiences Replay of Dyna-Q Learning" @default.
- W2808492790 cites W1492518272 @default.
- W2808492790 cites W1912469130 @default.
- W2808492790 cites W1965969360 @default.
- W2808492790 cites W1980035368 @default.
- W2808492790 cites W1997880753 @default.
- W2808492790 cites W2001643966 @default.
- W2808492790 cites W2028145673 @default.
- W2808492790 cites W2052688942 @default.
- W2808492790 cites W2077671246 @default.
- W2808492790 cites W2098412395 @default.
- W2808492790 cites W2105156548 @default.
- W2808492790 cites W2107726111 @default.
- W2808492790 cites W2110422826 @default.
- W2808492790 cites W2112483970 @default.
- W2808492790 cites W2119567691 @default.
- W2808492790 cites W2120499270 @default.
- W2808492790 cites W2121863487 @default.
- W2808492790 cites W2136706845 @default.
- W2808492790 cites W2147131707 @default.
- W2808492790 cites W2386591939 @default.
- W2808492790 cites W2396836028 @default.
- W2808492790 cites W2491167697 @default.
- W2808492790 cites W2508473194 @default.
- W2808492790 cites W2731659850 @default.
- W2808492790 cites W2789901741 @default.
- W2808492790 cites W2963302368 @default.
- W2808492790 doi "https://doi.org/10.1109/access.2018.2847048" @default.
- W2808492790 hasPublicationYear "2018" @default.
- W2808492790 type Work @default.
- W2808492790 sameAs 2808492790 @default.
- W2808492790 citedByCount "5" @default.
- W2808492790 countsByYear W28084927902019 @default.
- W2808492790 countsByYear W28084927902020 @default.
- W2808492790 countsByYear W28084927902021 @default.
- W2808492790 crossrefType "journal-article" @default.
- W2808492790 hasAuthorship W2808492790A5008475651 @default.
- W2808492790 hasAuthorship W2808492790A5013665827 @default.
- W2808492790 hasAuthorship W2808492790A5014247469 @default.
- W2808492790 hasAuthorship W2808492790A5049492183 @default.
- W2808492790 hasAuthorship W2808492790A5061189209 @default.
- W2808492790 hasAuthorship W2808492790A5084917118 @default.
- W2808492790 hasBestOaLocation W28084927901 @default.
- W2808492790 hasConcept C111030470 @default.
- W2808492790 hasConcept C119857082 @default.
- W2808492790 hasConcept C124101348 @default.
- W2808492790 hasConcept C154945302 @default.
- W2808492790 hasConcept C185592680 @default.
- W2808492790 hasConcept C188116033 @default.
- W2808492790 hasConcept C197129107 @default.
- W2808492790 hasConcept C198531522 @default.
- W2808492790 hasConcept C199190896 @default.
- W2808492790 hasConcept C23123220 @default.
- W2808492790 hasConcept C41008148 @default.
- W2808492790 hasConcept C43617362 @default.
- W2808492790 hasConcept C73555534 @default.
- W2808492790 hasConcept C8038995 @default.
- W2808492790 hasConcept C97541855 @default.
- W2808492790 hasConceptScore W2808492790C111030470 @default.
- W2808492790 hasConceptScore W2808492790C119857082 @default.
- W2808492790 hasConceptScore W2808492790C124101348 @default.
- W2808492790 hasConceptScore W2808492790C154945302 @default.
- W2808492790 hasConceptScore W2808492790C185592680 @default.
- W2808492790 hasConceptScore W2808492790C188116033 @default.
- W2808492790 hasConceptScore W2808492790C197129107 @default.
- W2808492790 hasConceptScore W2808492790C198531522 @default.
- W2808492790 hasConceptScore W2808492790C199190896 @default.
- W2808492790 hasConceptScore W2808492790C23123220 @default.
- W2808492790 hasConceptScore W2808492790C41008148 @default.
- W2808492790 hasConceptScore W2808492790C43617362 @default.
- W2808492790 hasConceptScore W2808492790C73555534 @default.
- W2808492790 hasConceptScore W2808492790C8038995 @default.
- W2808492790 hasConceptScore W2808492790C97541855 @default.
- W2808492790 hasFunder F4320322857 @default.
- W2808492790 hasFunder F4320335777 @default.
- W2808492790 hasFunder F4320335787 @default.
- W2808492790 hasLocation W28084927901 @default.
- W2808492790 hasOpenAccess W2808492790 @default.
- W2808492790 hasPrimaryLocation W28084927901 @default.
- W2808492790 hasRelatedWork W109652693 @default.
- W2808492790 hasRelatedWork W1493147646 @default.
- W2808492790 hasRelatedWork W1507087299 @default.
- W2808492790 hasRelatedWork W1517639318 @default.
- W2808492790 hasRelatedWork W1813810678 @default.
- W2808492790 hasRelatedWork W184546935 @default.
- W2808492790 hasRelatedWork W1981165078 @default.
- W2808492790 hasRelatedWork W2039044448 @default.
- W2808492790 hasRelatedWork W2043700026 @default.
- W2808492790 hasRelatedWork W2098723043 @default.
- W2808492790 hasRelatedWork W2133516300 @default.