Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386598146> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W4386598146 abstract "Multi-goal reinforcement learning has emerged as a powerful approach for planning and robot manipulation tasks, but it faces challenges such as sparse rewards and sample inefficiency. Hindsight Experience Replay (HER) has been proposed as a solution to these challenges by relabeling goals, but it still requires a large number of samples and significant computation. To address these issues, we propose Multi-step Hindsight Experience Replay (MHER), which incorporates multi-step relabeling to improve sample efficiency. Despite the advantages of $n$-step relabeling, we theoretically and experimentally prove the off-policy $n$-step bias introduced by $n$-step relabeling may lead to poor performance in many environments. To address this issue, two bias-reduced MHER algorithms, MHER($\lambda$) and Model-based MHER (MMHER) are presented. MHER($\lambda$) exploits the $\lambda$-return while MMHER benefits from model-based value expansions. Experimental results on numerous multi-goal robotic tasks show that our solutions can successfully alleviate the off-policy $n$-step bias and achieve significantly higher sample efficiency than previous multi-goal RL baselines with little additional computation beyond HER." @default.
- W4386598146 created "2023-09-12" @default.
- W4386598146 creator A5013716631 @default.
- W4386598146 creator A5051292342 @default.
- W4386598146 creator A5052345420 @default.
- W4386598146 creator A5055663663 @default.
- W4386598146 creator A5064508233 @default.
- W4386598146 creator A5067232825 @default.
- W4386598146 creator A5067453635 @default.
- W4386598146 creator A5075993453 @default.
- W4386598146 date "2023-06-01" @default.
- W4386598146 modified "2023-09-30" @default.
- W4386598146 title "Multi-Step Hindsight Experience Replay with Bias Reduction for Efficient Multi-Goal Reinforcement Learning" @default.
- W4386598146 cites W1980035368 @default.
- W4386598146 cites W2145339207 @default.
- W4386598146 cites W2761873684 @default.
- W4386598146 cites W2962872206 @default.
- W4386598146 cites W3041202696 @default.
- W4386598146 cites W3170872007 @default.
- W4386598146 cites W4200547053 @default.
- W4386598146 doi "https://doi.org/10.1109/frse58934.2023.00028" @default.
- W4386598146 hasPublicationYear "2023" @default.
- W4386598146 type Work @default.
- W4386598146 citedByCount "0" @default.
- W4386598146 crossrefType "proceedings-article" @default.
- W4386598146 hasAuthorship W4386598146A5013716631 @default.
- W4386598146 hasAuthorship W4386598146A5051292342 @default.
- W4386598146 hasAuthorship W4386598146A5052345420 @default.
- W4386598146 hasAuthorship W4386598146A5055663663 @default.
- W4386598146 hasAuthorship W4386598146A5064508233 @default.
- W4386598146 hasAuthorship W4386598146A5067232825 @default.
- W4386598146 hasAuthorship W4386598146A5067453635 @default.
- W4386598146 hasAuthorship W4386598146A5075993453 @default.
- W4386598146 hasConcept C10347200 @default.
- W4386598146 hasConcept C11413529 @default.
- W4386598146 hasConcept C119857082 @default.
- W4386598146 hasConcept C154945302 @default.
- W4386598146 hasConcept C15744967 @default.
- W4386598146 hasConcept C162324750 @default.
- W4386598146 hasConcept C165696696 @default.
- W4386598146 hasConcept C175444787 @default.
- W4386598146 hasConcept C180747234 @default.
- W4386598146 hasConcept C185592680 @default.
- W4386598146 hasConcept C198531522 @default.
- W4386598146 hasConcept C2778869765 @default.
- W4386598146 hasConcept C38652104 @default.
- W4386598146 hasConcept C41008148 @default.
- W4386598146 hasConcept C43617362 @default.
- W4386598146 hasConcept C45374587 @default.
- W4386598146 hasConcept C97541855 @default.
- W4386598146 hasConceptScore W4386598146C10347200 @default.
- W4386598146 hasConceptScore W4386598146C11413529 @default.
- W4386598146 hasConceptScore W4386598146C119857082 @default.
- W4386598146 hasConceptScore W4386598146C154945302 @default.
- W4386598146 hasConceptScore W4386598146C15744967 @default.
- W4386598146 hasConceptScore W4386598146C162324750 @default.
- W4386598146 hasConceptScore W4386598146C165696696 @default.
- W4386598146 hasConceptScore W4386598146C175444787 @default.
- W4386598146 hasConceptScore W4386598146C180747234 @default.
- W4386598146 hasConceptScore W4386598146C185592680 @default.
- W4386598146 hasConceptScore W4386598146C198531522 @default.
- W4386598146 hasConceptScore W4386598146C2778869765 @default.
- W4386598146 hasConceptScore W4386598146C38652104 @default.
- W4386598146 hasConceptScore W4386598146C41008148 @default.
- W4386598146 hasConceptScore W4386598146C43617362 @default.
- W4386598146 hasConceptScore W4386598146C45374587 @default.
- W4386598146 hasConceptScore W4386598146C97541855 @default.
- W4386598146 hasLocation W43865981461 @default.
- W4386598146 hasOpenAccess W4386598146 @default.
- W4386598146 hasPrimaryLocation W43865981461 @default.
- W4386598146 hasRelatedWork W2493506738 @default.
- W4386598146 hasRelatedWork W2890406131 @default.
- W4386598146 hasRelatedWork W2997512100 @default.
- W4386598146 hasRelatedWork W3012552522 @default.
- W4386598146 hasRelatedWork W3173051288 @default.
- W4386598146 hasRelatedWork W3197854638 @default.
- W4386598146 hasRelatedWork W4226336685 @default.
- W4386598146 hasRelatedWork W4287827094 @default.
- W4386598146 hasRelatedWork W4295352814 @default.
- W4386598146 hasRelatedWork W4319083788 @default.
- W4386598146 isParatext "false" @default.
- W4386598146 isRetracted "false" @default.
- W4386598146 workType "article" @default.
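
The abstract above mentions hindsight goal relabeling, $n$-step targets, and the $\lambda$-return without giving formulas. The following is a minimal sketch of those generic ideas only, not the paper's algorithm. Everything in it is an assumption made for illustration: the function names (`sparse_reward`, `relabel_rewards`, `n_step_target`, `lambda_return`), the 0/-1 sparse reward with a distance threshold `eps`, and the use of the textbook truncated $\lambda$-return, which may differ from the weighting used in MHER($\lambda$).

```python
import math
from typing import List, Sequence


def sparse_reward(achieved: Sequence[float], goal: Sequence[float],
                  eps: float = 0.05) -> float:
    # Assumed convention: 0 when the goal is reached within eps, else -1.
    return 0.0 if math.dist(achieved, goal) <= eps else -1.0


def relabel_rewards(achieved_goals: Sequence[Sequence[float]],
                    hindsight_goal: Sequence[float],
                    eps: float = 0.05) -> List[float]:
    # Recompute the rewards of n consecutive transitions against a goal
    # chosen in hindsight (for example, a state reached later on).
    return [sparse_reward(a, hindsight_goal, eps) for a in achieved_goals]


def n_step_target(rewards: Sequence[float], bootstrap_value: float,
                  gamma: float) -> float:
    # r_0 + gamma * r_1 + ... + gamma^(n-1) * r_(n-1) + gamma^n * bootstrap.
    target = bootstrap_value
    for r in reversed(rewards):
        target = r + gamma * target
    return target


def lambda_return(rewards: Sequence[float], bootstrap_values: Sequence[float],
                  gamma: float, lam: float) -> float:
    # Textbook truncated lambda-return: the k-step targets for k < n get
    # weight (1 - lam) * lam^(k-1), and the n-step target gets lam^(n-1).
    # bootstrap_values[k-1] is the value estimate at the state reached
    # after k steps.
    n = len(rewards)
    targets = [n_step_target(rewards[:k], bootstrap_values[k - 1], gamma)
               for k in range(1, n + 1)]
    mixed = sum((1.0 - lam) * lam ** (k - 1) * targets[k - 1]
                for k in range(1, n))
    return mixed + lam ** (n - 1) * targets[-1]
```

In this sketch, `lam = 0` recovers the one-step target and `lam = 1` recovers the full $n$-step target, so intermediate values trade off between relying on the value estimate and relying on rewards collected under an older policy, which is where off-policy $n$-step bias can enter.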