Matches in SemOpenAlex for { <https://semopenalex.org/work/W2515676083> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W2515676083 abstract "Nowadays, transfer learning (TL) has become a crucial technique to accelerate the slow optimization procedure of reinforcement learning (RL) by re-utilizing knowledge acquired in a previous related task. Nevertheless, most of the current relevant research acquires knowledge through RL training in the source task, which would be too time-consuming. In view of this situation, in this paper, we propose a novel TL framework where the agent extracts knowledge from human-demonstration trajectories of the source task and reuses the knowledge in RL in the target task. As for what to transfer, two forms of knowledge deduced from the demonstration trajectories, which are the k-nearest neighbour of the current state in source samples and visit frequency of homologous states, are adopted. For how to transfer, the two forms of knowledge are respectively used to recommend a preferred action when random exploration is needed and to shape an instantaneous reward for RL. Simulation experiments of balancing Cart-Poles with different difficulties suggest that both the two forms of knowledge accelerate the learning process of RL obviously. What is more, the effect is even more significant when they are used in combination. In this case, the experimental results manifest the positive role of our framework in RL." @default.
- W2515676083 created "2016-09-16" @default.
- W2515676083 creator A5048127669 @default.
- W2515676083 creator A5054085447 @default.
- W2515676083 creator A5069407513 @default.
- W2515676083 creator A5084846906 @default.
- W2515676083 date "2016-09-05" @default.
- W2515676083 modified "2023-10-17" @default.
- W2515676083 title "Transferring knowledge from human-demonstration trajectories to reinforcement learning" @default.
- W2515676083 cites W1949804828 @default.
- W2515676083 cites W1977655452 @default.
- W2515676083 cites W1980620643 @default.
- W2515676083 cites W1986014385 @default.
- W2515676083 cites W2021055657 @default.
- W2515676083 cites W2113921460 @default.
- W2515676083 cites W2114580749 @default.
- W2515676083 cites W2129427976 @default.
- W2515676083 cites W2165698076 @default.
- W2515676083 cites W2172968643 @default.
- W2515676083 cites W4214717370 @default.
- W2515676083 doi "https://doi.org/10.1177/0142331216649655" @default.
- W2515676083 hasPublicationYear "2016" @default.
- W2515676083 type Work @default.
- W2515676083 sameAs 2515676083 @default.
- W2515676083 citedByCount "6" @default.
- W2515676083 countsByYear W25156760832018 @default.
- W2515676083 countsByYear W25156760832019 @default.
- W2515676083 countsByYear W25156760832020 @default.
- W2515676083 countsByYear W25156760832021 @default.
- W2515676083 countsByYear W25156760832022 @default.
- W2515676083 crossrefType "journal-article" @default.
- W2515676083 hasAuthorship W2515676083A5048127669 @default.
- W2515676083 hasAuthorship W2515676083A5054085447 @default.
- W2515676083 hasAuthorship W2515676083A5069407513 @default.
- W2515676083 hasAuthorship W2515676083A5084846906 @default.
- W2515676083 hasBestOaLocation W25156760831 @default.
- W2515676083 hasConcept C107457646 @default.
- W2515676083 hasConcept C127413603 @default.
- W2515676083 hasConcept C154945302 @default.
- W2515676083 hasConcept C41008148 @default.
- W2515676083 hasConcept C47932503 @default.
- W2515676083 hasConcept C66938386 @default.
- W2515676083 hasConcept C67203356 @default.
- W2515676083 hasConcept C97541855 @default.
- W2515676083 hasConceptScore W2515676083C107457646 @default.
- W2515676083 hasConceptScore W2515676083C127413603 @default.
- W2515676083 hasConceptScore W2515676083C154945302 @default.
- W2515676083 hasConceptScore W2515676083C41008148 @default.
- W2515676083 hasConceptScore W2515676083C47932503 @default.
- W2515676083 hasConceptScore W2515676083C66938386 @default.
- W2515676083 hasConceptScore W2515676083C67203356 @default.
- W2515676083 hasConceptScore W2515676083C97541855 @default.
- W2515676083 hasFunder F4320321001 @default.
- W2515676083 hasFunder F4320338464 @default.
- W2515676083 hasLocation W25156760831 @default.
- W2515676083 hasOpenAccess W2515676083 @default.
- W2515676083 hasPrimaryLocation W25156760831 @default.
- W2515676083 hasRelatedWork W115717799 @default.
- W2515676083 hasRelatedWork W1457482454 @default.
- W2515676083 hasRelatedWork W1487478277 @default.
- W2515676083 hasRelatedWork W1497976081 @default.
- W2515676083 hasRelatedWork W1511570998 @default.
- W2515676083 hasRelatedWork W1882507001 @default.
- W2515676083 hasRelatedWork W1935091844 @default.
- W2515676083 hasRelatedWork W2130711276 @default.
- W2515676083 hasRelatedWork W2355284511 @default.
- W2515676083 hasRelatedWork W2365393372 @default.
- W2515676083 hasRelatedWork W2386329118 @default.
- W2515676083 hasRelatedWork W2615565422 @default.
- W2515676083 hasRelatedWork W2733762270 @default.
- W2515676083 hasRelatedWork W2808546214 @default.
- W2515676083 hasRelatedWork W2957624498 @default.
- W2515676083 hasRelatedWork W2978070926 @default.
- W2515676083 hasRelatedWork W3095083155 @default.
- W2515676083 hasRelatedWork W3127561923 @default.
- W2515676083 hasRelatedWork W3208037667 @default.
- W2515676083 hasRelatedWork W2190150267 @default.
- W2515676083 isParatext "false" @default.
- W2515676083 isRetracted "false" @default.
- W2515676083 magId "2515676083" @default.
- W2515676083 workType "article" @default.
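
The abstract above names two forms of knowledge taken from demonstration trajectories: the k-nearest neighbours of the current state (used to recommend an action when exploration is needed) and the visit frequency of homologous states (used to shape the instantaneous reward). The following is a minimal Python sketch of that idea, not the authors' implementation. The Euclidean distance, the majority vote, the grid discretization used to group "homologous" states, the additive shaping term, and every function and parameter name (`knn_recommended_action`, `visit_frequency`, `shaped_reward`, `choose_action`, `bin_width`, `weight`) are assumptions made for illustration; the record itself does not specify them.

```python
import math
import random
from collections import Counter

# A demonstration sample is (state, action); states are tuples of floats.

def knn_recommended_action(state, demos, k=3):
    """Return the majority action among the k demo states nearest to `state`.
    Euclidean distance and majority voting are assumptions."""
    nearest = sorted(demos, key=lambda d: math.dist(state, d[0]))[:k]
    votes = Counter(action for _, action in nearest)
    return votes.most_common(1)[0][0]

def discretize(state, bin_width=0.5):
    """Map a continuous state to a grid cell (assumed notion of 'homologous')."""
    return tuple(math.floor(x / bin_width) for x in state)

def visit_frequency(demos, bin_width=0.5):
    """Fraction of demonstration samples that fall in each grid cell."""
    counts = Counter(discretize(s, bin_width) for s, _ in demos)
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}

def shaped_reward(reward, state, freq, weight=1.0, bin_width=0.5):
    """Add a bonus proportional to how often the demonstrator visited this cell.
    The additive form and `weight` are assumptions."""
    return reward + weight * freq.get(discretize(state, bin_width), 0.0)

def choose_action(state, q_values, demos, epsilon=0.1, k=3, rng=random):
    """Epsilon-greedy selection in which the exploratory branch follows the
    k-NN recommendation instead of picking a uniformly random action."""
    if rng.random() < epsilon:
        return knn_recommended_action(state, demos, k)
    return max(q_values, key=q_values.get)

# Hypothetical two-dimensional demonstration data, for illustration only.
demos = [((0.0, 0.0), 0), ((0.1, 0.0), 0), ((0.2, 0.1), 0),
         ((1.0, 1.0), 1), ((1.1, 0.9), 1)]
freq = visit_frequency(demos)
```

With `epsilon=0.0` the agent acts greedily on its own value estimates, and with `epsilon=1.0` it always follows the demonstration-based recommendation, so `epsilon` controls how strongly the transferred knowledge steers exploration in this sketch.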