Matches in SemOpenAlex for { <https://semopenalex.org/work/W3196831457> ?p ?o ?g. }
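The listing below is the result of a triple-pattern query over the work's IRI. A query of the same shape can be prepared for a SPARQL endpoint; the sketch below only builds the request URL (the endpoint address `https://semopenalex.org/sparql` is an assumption — check the SemOpenAlex documentation before sending):

```python
from urllib.parse import urlencode

WORK_IRI = "https://semopenalex.org/work/W3196831457"
# Assumed endpoint address; verify against the SemOpenAlex docs.
ENDPOINT = "https://semopenalex.org/sparql"

# Same triple pattern as the listing header: all (?p, ?o, ?g) for the work.
query = f"""SELECT ?p ?o ?g WHERE {{
  GRAPH ?g {{ <{WORK_IRI}> ?p ?o . }}
}}"""

# Build a GET request URL; not sent here — execute it with any HTTP client.
url = ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
print(url.split("?")[0])  # -> https://semopenalex.org/sparql
```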
Showing items 1 to 94 of 94, with 100 items per page.
- W3196831457 abstract "Dealing with continuous robotic control problems under sparse rewards is a longstanding challenge in deep reinforcement learning (DRL). While existing DRL algorithms have demonstrated great progress in learning policies from visual observations, learning effective policies still requires an impractical number of real-world data samples. Moreover, some robotic tasks are naturally specified with sparse rewards, which makes the precious data inefficient, slows down the learning process, and can make DRL infeasible. In addition, manually shaping reward functions is complex work, as it requires specific domain knowledge and human intervention. To alleviate these issues, this paper proposes a model-free, off-policy RL approach named TD3MHER to learn manipulation policies for continuous robotic tasks with sparse rewards. Specifically, TD3MHER combines the Twin Delayed Deep Deterministic policy gradient algorithm (TD3) with Model-driven Hindsight Experience Replay (MHER) to achieve highly sample-efficient training. While the agent is learning the policy, TD3MHER also helps it learn a physical model of the robot that is useful for solving the task, without requiring any additional robot-environment interactions. The performance of TD3MHER is assessed on a simulated robotic task using a 7-DOF manipulator, comparing the proposed technique to a previous DRL algorithm and verifying the usefulness of our method. Results of the experiments on the simulated robotic task show that the proposed approach successfully utilizes previously stored samples with sparse rewards and obtains a faster learning speed." @default.
- W3196831457 created "2021-09-13" @default.
- W3196831457 creator A5003590227 @default.
- W3196831457 creator A5003653635 @default.
- W3196831457 creator A5008397618 @default.
- W3196831457 creator A5073231890 @default.
- W3196831457 date "2021-07-15" @default.
- W3196831457 modified "2023-09-23" @default.
- W3196831457 title "Data-efficient Deep Reinforcement Learning Method Toward Scaling Continuous Robotic Task with Sparse Rewards" @default.
- W3196831457 cites W1757796397 @default.
- W3196831457 cites W2082511574 @default.
- W3196831457 cites W2116386744 @default.
- W3196831457 cites W2257979135 @default.
- W3196831457 cites W2296073425 @default.
- W3196831457 cites W2347074400 @default.
- W3196831457 cites W2575705757 @default.
- W3196831457 cites W2738318237 @default.
- W3196831457 cites W2789008106 @default.
- W3196831457 cites W2954718200 @default.
- W3196831457 cites W2960876848 @default.
- W3196831457 cites W2962902376 @default.
- W3196831457 cites W2963311874 @default.
- W3196831457 cites W2963401755 @default.
- W3196831457 cites W2963508354 @default.
- W3196831457 cites W2963669336 @default.
- W3196831457 cites W2963864421 @default.
- W3196831457 cites W2963923407 @default.
- W3196831457 cites W2964001908 @default.
- W3196831457 cites W2964043796 @default.
- W3196831457 cites W2964120017 @default.
- W3196831457 cites W2996037775 @default.
- W3196831457 cites W3005971017 @default.
- W3196831457 cites W3007769740 @default.
- W3196831457 cites W3122688976 @default.
- W3196831457 cites W3156829097 @default.
- W3196831457 doi "https://doi.org/10.1109/rcar52367.2021.9517647" @default.
- W3196831457 hasPublicationYear "2021" @default.
- W3196831457 type Work @default.
- W3196831457 sameAs 3196831457 @default.
- W3196831457 citedByCount "0" @default.
- W3196831457 crossrefType "proceedings-article" @default.
- W3196831457 hasAuthorship W3196831457A5003590227 @default.
- W3196831457 hasAuthorship W3196831457A5003653635 @default.
- W3196831457 hasAuthorship W3196831457A5008397618 @default.
- W3196831457 hasAuthorship W3196831457A5073231890 @default.
- W3196831457 hasConcept C10347200 @default.
- W3196831457 hasConcept C111472728 @default.
- W3196831457 hasConcept C111919701 @default.
- W3196831457 hasConcept C119857082 @default.
- W3196831457 hasConcept C127413603 @default.
- W3196831457 hasConcept C138885662 @default.
- W3196831457 hasConcept C154945302 @default.
- W3196831457 hasConcept C15744967 @default.
- W3196831457 hasConcept C180747234 @default.
- W3196831457 hasConcept C189950617 @default.
- W3196831457 hasConcept C201995342 @default.
- W3196831457 hasConcept C2780451532 @default.
- W3196831457 hasConcept C41008148 @default.
- W3196831457 hasConcept C90509273 @default.
- W3196831457 hasConcept C97541855 @default.
- W3196831457 hasConcept C98045186 @default.
- W3196831457 hasConceptScore W3196831457C10347200 @default.
- W3196831457 hasConceptScore W3196831457C111472728 @default.
- W3196831457 hasConceptScore W3196831457C111919701 @default.
- W3196831457 hasConceptScore W3196831457C119857082 @default.
- W3196831457 hasConceptScore W3196831457C127413603 @default.
- W3196831457 hasConceptScore W3196831457C138885662 @default.
- W3196831457 hasConceptScore W3196831457C154945302 @default.
- W3196831457 hasConceptScore W3196831457C15744967 @default.
- W3196831457 hasConceptScore W3196831457C180747234 @default.
- W3196831457 hasConceptScore W3196831457C189950617 @default.
- W3196831457 hasConceptScore W3196831457C201995342 @default.
- W3196831457 hasConceptScore W3196831457C2780451532 @default.
- W3196831457 hasConceptScore W3196831457C41008148 @default.
- W3196831457 hasConceptScore W3196831457C90509273 @default.
- W3196831457 hasConceptScore W3196831457C97541855 @default.
- W3196831457 hasConceptScore W3196831457C98045186 @default.
- W3196831457 hasLocation W31968314571 @default.
- W3196831457 hasOpenAccess W3196831457 @default.
- W3196831457 hasPrimaryLocation W31968314571 @default.
- W3196831457 hasRelatedWork W10379689 @default.
- W3196831457 hasRelatedWork W12291563 @default.
- W3196831457 hasRelatedWork W2235786 @default.
- W3196831457 hasRelatedWork W4085024 @default.
- W3196831457 hasRelatedWork W4412456 @default.
- W3196831457 hasRelatedWork W5081013 @default.
- W3196831457 hasRelatedWork W5991403 @default.
- W3196831457 hasRelatedWork W7084024 @default.
- W3196831457 hasRelatedWork W868042 @default.
- W3196831457 hasRelatedWork W929682 @default.
- W3196831457 isParatext "false" @default.
- W3196831457 isRetracted "false" @default.
- W3196831457 magId "3196831457" @default.
- W3196831457 workType "article" @default.
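For working with this dump offline, the per-line format (`- <subject> <predicate> <object> @default.`) is regular enough to fold into a predicate-to-objects map; a minimal sketch over a few sample lines taken from the listing above:

```python
from collections import defaultdict

# Sample lines copied from the listing; the full dump has 94 such lines.
lines = [
    '- W3196831457 created "2021-09-13" @default.',
    '- W3196831457 cites W1757796397 @default.',
    '- W3196831457 cites W2082511574 @default.',
    '- W3196831457 citedByCount "0" @default.',
]

triples = defaultdict(list)
for line in lines:
    # Strip the list bullet and the trailing graph marker.
    body = line.removeprefix("- ").removesuffix(" @default.")
    # maxsplit=2 keeps multi-word quoted objects (titles, dates) intact.
    subject, predicate, obj = body.split(" ", 2)
    triples[predicate].append(obj.strip('"'))

print(sorted(triples))  # -> ['citedByCount', 'cites', 'created']
```

The `maxsplit=2` split is what lets quoted literals containing spaces (the abstract, the title) survive as a single object field.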