Matches in SemOpenAlex for { <https://semopenalex.org/work/W3091628900> ?p ?o ?g. }
Showing items 1 to 99 of 99 with 100 items per page.
- W3091628900 abstract "Reward shaping is a crucial method for speeding up reinforcement learning (RL). However, designing reward shaping functions usually requires many expert demonstrations and much hand-engineering. Moreover, by using a potential function to shape the training rewards, an RL agent running Q-learning can converge the associated Q-table faster without using expert data, but in deep reinforcement learning (DRL), which is RL using neural networks, Q-learning is sometimes slow to learn the network parameters, especially in long-horizon, sparse-reward environments. In this paper, we propose a reward model that shapes the training rewards for DRL in real time to learn the agent’s motions in a discrete action space. The model and reward shaping method combine agent self-demonstrations with potential-based reward shaping to make the neural networks converge faster in every task, and can be used with both deep Q-learning and actor-critic methods. We show experimentally that the proposed method can speed up DRL on classic control problems in various environments." @default.
- W3091628900 created "2020-10-08" @default.
- W3091628900 creator A5059827048 @default.
- W3091628900 creator A5082205147 @default.
- W3091628900 date "2020-07-01" @default.
- W3091628900 modified "2023-09-25" @default.
- W3091628900 title "Meta-Reward Model Based on Trajectory Data with k-Nearest Neighbors Method" @default.
- W3091628900 cites W1130790960 @default.
- W3091628900 cites W13374848 @default.
- W3091628900 cites W1777239053 @default.
- W3091628900 cites W1999874108 @default.
- W3091628900 cites W2061562262 @default.
- W3091628900 cites W2098774185 @default.
- W3091628900 cites W2121863487 @default.
- W3091628900 cites W2122111042 @default.
- W3091628900 cites W2145339207 @default.
- W3091628900 cites W2151382427 @default.
- W3091628900 cites W2164419340 @default.
- W3091628900 cites W2173564293 @default.
- W3091628900 cites W2198041288 @default.
- W3091628900 cites W2294422333 @default.
- W3091628900 cites W2397581010 @default.
- W3091628900 cites W2404616412 @default.
- W3091628900 cites W2472819217 @default.
- W3091628900 cites W2491675558 @default.
- W3091628900 cites W2604763608 @default.
- W3091628900 cites W2620974420 @default.
- W3091628900 cites W2751516180 @default.
- W3091628900 cites W2788455270 @default.
- W3091628900 cites W2895531857 @default.
- W3091628900 cites W2911718261 @default.
- W3091628900 cites W2962943921 @default.
- W3091628900 cites W2963025296 @default.
- W3091628900 cites W2963341924 @default.
- W3091628900 cites W2963376229 @default.
- W3091628900 cites W2964043796 @default.
- W3091628900 cites W2964263543 @default.
- W3091628900 cites W2964273112 @default.
- W3091628900 doi "https://doi.org/10.1109/ijcnn48605.2020.9207388" @default.
- W3091628900 hasPublicationYear "2020" @default.
- W3091628900 type Work @default.
- W3091628900 sameAs 3091628900 @default.
- W3091628900 citedByCount "0" @default.
- W3091628900 crossrefType "proceedings-article" @default.
- W3091628900 hasAuthorship W3091628900A5059827048 @default.
- W3091628900 hasAuthorship W3091628900A5082205147 @default.
- W3091628900 hasConcept C111919701 @default.
- W3091628900 hasConcept C119857082 @default.
- W3091628900 hasConcept C121332964 @default.
- W3091628900 hasConcept C127413603 @default.
- W3091628900 hasConcept C1276947 @default.
- W3091628900 hasConcept C13662910 @default.
- W3091628900 hasConcept C14036430 @default.
- W3091628900 hasConcept C154945302 @default.
- W3091628900 hasConcept C201995342 @default.
- W3091628900 hasConcept C2780451532 @default.
- W3091628900 hasConcept C2780791683 @default.
- W3091628900 hasConcept C41008148 @default.
- W3091628900 hasConcept C50644808 @default.
- W3091628900 hasConcept C62520636 @default.
- W3091628900 hasConcept C78458016 @default.
- W3091628900 hasConcept C86803240 @default.
- W3091628900 hasConcept C97541855 @default.
- W3091628900 hasConcept C98045186 @default.
- W3091628900 hasConceptScore W3091628900C111919701 @default.
- W3091628900 hasConceptScore W3091628900C119857082 @default.
- W3091628900 hasConceptScore W3091628900C121332964 @default.
- W3091628900 hasConceptScore W3091628900C127413603 @default.
- W3091628900 hasConceptScore W3091628900C1276947 @default.
- W3091628900 hasConceptScore W3091628900C13662910 @default.
- W3091628900 hasConceptScore W3091628900C14036430 @default.
- W3091628900 hasConceptScore W3091628900C154945302 @default.
- W3091628900 hasConceptScore W3091628900C201995342 @default.
- W3091628900 hasConceptScore W3091628900C2780451532 @default.
- W3091628900 hasConceptScore W3091628900C2780791683 @default.
- W3091628900 hasConceptScore W3091628900C41008148 @default.
- W3091628900 hasConceptScore W3091628900C50644808 @default.
- W3091628900 hasConceptScore W3091628900C62520636 @default.
- W3091628900 hasConceptScore W3091628900C78458016 @default.
- W3091628900 hasConceptScore W3091628900C86803240 @default.
- W3091628900 hasConceptScore W3091628900C97541855 @default.
- W3091628900 hasConceptScore W3091628900C98045186 @default.
- W3091628900 hasLocation W30916289001 @default.
- W3091628900 hasOpenAccess W3091628900 @default.
- W3091628900 hasPrimaryLocation W30916289001 @default.
- W3091628900 hasRelatedWork W10379689 @default.
- W3091628900 hasRelatedWork W12291563 @default.
- W3091628900 hasRelatedWork W2235786 @default.
- W3091628900 hasRelatedWork W2683128 @default.
- W3091628900 hasRelatedWork W4085024 @default.
- W3091628900 hasRelatedWork W4412456 @default.
- W3091628900 hasRelatedWork W5835750 @default.
- W3091628900 hasRelatedWork W5991403 @default.
- W3091628900 hasRelatedWork W868042 @default.
- W3091628900 hasRelatedWork W9948832 @default.
- W3091628900 isParatext "false" @default.
- W3091628900 isRetracted "false" @default.
- W3091628900 magId "3091628900" @default.
- W3091628900 workType "article" @default.
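
The abstract above refers to potential-based reward shaping. As a rough illustration only, the standard form of that technique adds gamma * phi(s') - phi(s) to the environment reward. The sketch below is not the reward model from W3091628900: the function names, the k-nearest-neighbour potential, and the sample demonstration states are all hypothetical, chosen only because the title mentions trajectory data and k-nearest neighbors.

```python
import math


def shaped_reward(reward, potential, state, next_state, gamma=0.99, done=False):
    """Return reward + gamma * phi(next_state) - phi(state).

    The potential of a terminal next state is taken to be 0.
    """
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


def knn_potential(demo_states, k=3):
    """Hypothetical potential (an assumption, not the paper's model):
    the negative mean Euclidean distance from a state to its k nearest
    stored demonstration states. demo_states must be non-empty.
    """
    def potential(state):
        distances = sorted(math.dist(state, d) for d in demo_states)
        nearest = distances[:k]
        return -sum(nearest) / len(nearest)

    return potential


# Made-up demonstration states, for illustration only.
demo_states = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
phi = knn_potential(demo_states, k=1)

# Moving from (0, 1) onto the demonstrated state (0, 0) earns a shaping bonus.
bonus = shaped_reward(0.0, phi, (0.0, 1.0), (0.0, 0.0), gamma=1.0)
print(bonus)
```

Because the shaping term is a difference of potentials, it rewards transitions that move toward stored states and penalizes transitions that move away, while the environment reward itself is passed through unchanged.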