Matches in SemOpenAlex for { <https://semopenalex.org/work/W3133214370> ?p ?o ?g. }
Showing items 1 to 66 of 66 with 100 items per page.
- W3133214370 abstract "Recent advances in deep reinforcement learning (RL) have demonstrated its potential to learn complex robotic manipulation tasks. However, RL still requires the robot to collect a large amount of real-world experience. To address this problem, recent works have proposed learning from expert demonstrations (LfD), particularly via inverse reinforcement learning (IRL), given its ability to achieve robust performance with only a small number of expert demonstrations. Nevertheless, deploying IRL on real robots is still challenging due to the large number of robot experiences it requires. This paper aims to address this scalability challenge with a robust, sample-efficient, and general meta-IRL algorithm, SQUIRL, that performs a new but related long-horizon task robustly given only a single video demonstration. First, this algorithm bootstraps the learning of a task encoder and a task-conditioned policy using behavioral cloning (BC). It then collects real-robot experiences and bypasses reward learning by directly recovering a Q-function from the combined robot and expert trajectories. Next, this algorithm uses the Q-function to re-evaluate all cumulative experiences collected by the robot to improve the policy quickly. In the end, the policy performs more robustly (90%+ success) than BC on new tasks while requiring no trial-and-errors at test time. Finally, our real-robot and simulated experiments demonstrate our algorithm's generality across different state spaces, action spaces, and vision-based manipulation tasks, e.g., pick-pour-place and pick-carry-drop." @default.
- W3133214370 created "2021-03-01" @default.
- W3133214370 creator A5014210568 @default.
- W3133214370 creator A5029944911 @default.
- W3133214370 creator A5034725374 @default.
- W3133214370 creator A5035609529 @default.
- W3133214370 creator A5036043911 @default.
- W3133214370 date "2020-10-24" @default.
- W3133214370 modified "2023-10-16" @default.
- W3133214370 title "SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks" @default.
- W3133214370 cites W1986014385 @default.
- W3133214370 cites W1999874108 @default.
- W3133214370 cites W2071841410 @default.
- W3133214370 cites W2161395589 @default.
- W3133214370 cites W2169498096 @default.
- W3133214370 cites W2769112066 @default.
- W3133214370 cites W2963669336 @default.
- W3133214370 cites W2963703448 @default.
- W3133214370 cites W2964055695 @default.
- W3133214370 cites W2979490629 @default.
- W3133214370 cites W3089739707 @default.
- W3133214370 doi "https://doi.org/10.1109/iros45743.2020.9340915" @default.
- W3133214370 hasPublicationYear "2020" @default.
- W3133214370 type Work @default.
- W3133214370 sameAs 3133214370 @default.
- W3133214370 citedByCount "3" @default.
- W3133214370 countsByYear W31332143702022 @default.
- W3133214370 crossrefType "proceedings-article" @default.
- W3133214370 hasAuthorship W3133214370A5014210568 @default.
- W3133214370 hasAuthorship W3133214370A5029944911 @default.
- W3133214370 hasAuthorship W3133214370A5034725374 @default.
- W3133214370 hasAuthorship W3133214370A5035609529 @default.
- W3133214370 hasAuthorship W3133214370A5036043911 @default.
- W3133214370 hasBestOaLocation W31332143702 @default.
- W3133214370 hasConcept C107457646 @default.
- W3133214370 hasConcept C154945302 @default.
- W3133214370 hasConcept C159176650 @default.
- W3133214370 hasConcept C2524010 @default.
- W3133214370 hasConcept C31972630 @default.
- W3133214370 hasConcept C33923547 @default.
- W3133214370 hasConcept C41008148 @default.
- W3133214370 hasConceptScore W3133214370C107457646 @default.
- W3133214370 hasConceptScore W3133214370C154945302 @default.
- W3133214370 hasConceptScore W3133214370C159176650 @default.
- W3133214370 hasConceptScore W3133214370C2524010 @default.
- W3133214370 hasConceptScore W3133214370C31972630 @default.
- W3133214370 hasConceptScore W3133214370C33923547 @default.
- W3133214370 hasConceptScore W3133214370C41008148 @default.
- W3133214370 hasLocation W31332143701 @default.
- W3133214370 hasLocation W31332143702 @default.
- W3133214370 hasOpenAccess W3133214370 @default.
- W3133214370 hasPrimaryLocation W31332143701 @default.
- W3133214370 hasRelatedWork W1891287906 @default.
- W3133214370 hasRelatedWork W1969923398 @default.
- W3133214370 hasRelatedWork W2036807459 @default.
- W3133214370 hasRelatedWork W2058170566 @default.
- W3133214370 hasRelatedWork W2170022336 @default.
- W3133214370 hasRelatedWork W2229312674 @default.
- W3133214370 hasRelatedWork W258625772 @default.
- W3133214370 hasRelatedWork W2755342338 @default.
- W3133214370 hasRelatedWork W2772917594 @default.
- W3133214370 hasRelatedWork W3116076068 @default.
- W3133214370 isParatext "false" @default.
- W3133214370 isRetracted "false" @default.
- W3133214370 magId "3133214370" @default.
- W3133214370 workType "article" @default.
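
Each entry above has the shape `- SUBJECT PREDICATE OBJECT @GRAPH.`, matching the `?p ?o ?g` variables in the query pattern. As a minimal sketch of how such lines could be split back into quads, the Python below may help. It assumes the flattened text layout shown in this listing, not the underlying RDF serialization, and the names `Quad`, `LINE_RE`, and `parse_line` are hypothetical helpers introduced only for illustration.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One listing entry: subject, predicate, object, named graph."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed line layout (taken from this listing, not from any SemOpenAlex API):
#   - SUBJECT PREDICATE OBJECT @GRAPH.
# The object may contain spaces (e.g. a quoted abstract), so it is matched greedily
# and the graph is taken from the last " @..." token before the final period.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+?)\.$')


def parse_line(line: str) -> Optional[Quad]:
    """Return a Quad for a listing entry, or None for header/pagination lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Quoted values are literals: drop the surrounding quotes.
    # Bare tokens (e.g. W1986014385, Work) are kept as identifiers.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return Quad(subject, predicate, obj, graph)


if __name__ == "__main__":
    sample = [
        'Showing items 1 to 66 of 66 with 100 items per page.',
        '- W3133214370 citedByCount "3" @default.',
        '- W3133214370 cites W1986014385 @default.',
    ]
    for text in sample:
        print(parse_line(text))
```

Lines that do not start with `- `, such as the query header and the pagination line, yield `None`, so a caller can filter them out before grouping the quads by predicate.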