Matches in SemOpenAlex for { <https://semopenalex.org/work/W2896924405> ?p ?o ?g. }
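The pattern in the header above fixes the subject and leaves predicate, object, and graph open. As a minimal sketch, assuming the rows are fetched from a SPARQL endpoint rather than this browsing page, the same lookup could be written as a SELECT query. Every row in this listing sits in the default graph (`@default`), so the sketch drops the `?g` variable and uses a plain triple pattern. The function name `build_work_query` and the `limit` parameter are hypothetical helpers for illustration, not part of SemOpenAlex.

```python
def build_work_query(work_iri: str, limit: int = 100) -> str:
    """Build a SPARQL SELECT listing every predicate/object of one subject.

    Hypothetical helper: it only assembles the query text and does not
    contact any endpoint.
    """
    if not work_iri.startswith(("http://", "https://")):
        raise ValueError("expected an absolute http(s) IRI")
    # Characters that may not appear inside a SPARQL <...> IRI reference.
    if any(ch in work_iri for ch in '<>" {}|\\^`'):
        raise ValueError("IRI contains a character not allowed in an IRI reference")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return (
        "SELECT ?p ?o\n"
        f"WHERE {{ <{work_iri}> ?p ?o . }}\n"
        f"LIMIT {limit}"
    )


query = build_work_query("https://semopenalex.org/work/W2896924405")
print(query)
```

A plain triple pattern matches the default graph only. If some rows lived in named graphs, the pattern would need to be wrapped in `GRAPH ?g { ... }` to bind the graph name as well.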
- W2896924405 abstract "Humans are experts at high-fidelity imitation -- closely mimicking a demonstration, often in one attempt. Humans use this ability to quickly solve a task instance, and to bootstrap learning of new tasks. Achieving these abilities in autonomous agents is an open problem. In this paper, we introduce an off-policy RL algorithm (MetaMimic) to narrow this gap. MetaMimic can learn both (i) policies for high-fidelity one-shot imitation of diverse novel skills, and (ii) policies that enable the agent to solve tasks more efficiently than the demonstrators. MetaMimic relies on the principle of storing all experiences in a memory and replaying these to learn massive deep neural network policies by off-policy RL. This paper introduces, to the best of our knowledge, the largest existing neural networks for deep RL and shows that larger networks with normalization are needed to achieve one-shot high-fidelity imitation on a challenging manipulation task. The results also show that both types of policy can be learned from vision, in spite of the task rewards being sparse, and without access to demonstrator actions." @default.
- W2896924405 created "2018-10-26" @default.
- W2896924405 creator A5003238807 @default.
- W2896924405 creator A5007394281 @default.
- W2896924405 creator A5056174952 @default.
- W2896924405 creator A5056410851 @default.
- W2896924405 creator A5059110895 @default.
- W2896924405 creator A5061277347 @default.
- W2896924405 creator A5064627383 @default.
- W2896924405 creator A5069219684 @default.
- W2896924405 creator A5072199759 @default.
- W2896924405 creator A5073651612 @default.
- W2896924405 creator A5082304130 @default.
- W2896924405 date "2018-09-27" @default.
- W2896924405 modified "2023-09-27" @default.
- W2896924405 title "One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL" @default.
- W2896924405 cites W1522301498 @default.
- W2896924405 cites W1540685400 @default.
- W2896924405 cites W1757796397 @default.
- W2896924405 cites W1999874108 @default.
- W2896924405 cites W2061562262 @default.
- W2896924405 cites W2098774185 @default.
- W2896924405 cites W2103462950 @default.
- W2896924405 cites W2167224731 @default.
- W2896924405 cites W2176412452 @default.
- W2896924405 cites W2194775991 @default.
- W2896924405 cites W2257979135 @default.
- W2896924405 cites W2290104316 @default.
- W2896924405 cites W2434014514 @default.
- W2896924405 cites W2481567506 @default.
- W2896924405 cites W2583137229 @default.
- W2896924405 cites W2610395436 @default.
- W2896924405 cites W2729615412 @default.
- W2896924405 cites W2734325473 @default.
- W2896924405 cites W2735025040 @default.
- W2896924405 cites W2735668228 @default.
- W2896924405 cites W2739473244 @default.
- W2896924405 cites W2741122588 @default.
- W2896924405 cites W2755546070 @default.
- W2896924405 cites W2757631751 @default.
- W2896924405 cites W2769112066 @default.
- W2896924405 cites W2770884134 @default.
- W2896924405 cites W2785962646 @default.
- W2896924405 cites W2786036274 @default.
- W2896924405 cites W2786487443 @default.
- W2896924405 cites W2786928559 @default.
- W2896924405 cites W2788781499 @default.
- W2896924405 cites W2788862220 @default.
- W2896924405 cites W2795456585 @default.
- W2896924405 cites W2796290181 @default.
- W2896924405 cites W2798705390 @default.
- W2896924405 cites W2802726207 @default.
- W2896924405 cites W2810785043 @default.
- W2896924405 cites W2823112946 @default.
- W2896924405 cites W2886265050 @default.
- W2896924405 cites W2949117887 @default.
- W2896924405 cites W2962715211 @default.
- W2896924405 cites W2962749646 @default.
- W2896924405 cites W2962871243 @default.
- W2896924405 cites W2963300719 @default.
- W2896924405 hasPublicationYear "2018" @default.
- W2896924405 type Work @default.
- W2896924405 sameAs 2896924405 @default.
- W2896924405 citedByCount "13" @default.
- W2896924405 countsByYear W28969244052019 @default.
- W2896924405 countsByYear W28969244052020 @default.
- W2896924405 countsByYear W28969244052021 @default.
- W2896924405 crossrefType "posted-content" @default.
- W2896924405 hasAuthorship W2896924405A5003238807 @default.
- W2896924405 hasAuthorship W2896924405A5007394281 @default.
- W2896924405 hasAuthorship W2896924405A5056174952 @default.
- W2896924405 hasAuthorship W2896924405A5056410851 @default.
- W2896924405 hasAuthorship W2896924405A5059110895 @default.
- W2896924405 hasAuthorship W2896924405A5061277347 @default.
- W2896924405 hasAuthorship W2896924405A5064627383 @default.
- W2896924405 hasAuthorship W2896924405A5069219684 @default.
- W2896924405 hasAuthorship W2896924405A5072199759 @default.
- W2896924405 hasAuthorship W2896924405A5073651612 @default.
- W2896924405 hasAuthorship W2896924405A5082304130 @default.
- W2896924405 hasConcept C113364801 @default.
- W2896924405 hasConcept C119599485 @default.
- W2896924405 hasConcept C119857082 @default.
- W2896924405 hasConcept C126388530 @default.
- W2896924405 hasConcept C127413603 @default.
- W2896924405 hasConcept C136886441 @default.
- W2896924405 hasConcept C144024400 @default.
- W2896924405 hasConcept C154945302 @default.
- W2896924405 hasConcept C15744967 @default.
- W2896924405 hasConcept C19165224 @default.
- W2896924405 hasConcept C201995342 @default.
- W2896924405 hasConcept C2776459999 @default.
- W2896924405 hasConcept C2780451532 @default.
- W2896924405 hasConcept C2984842247 @default.
- W2896924405 hasConcept C41008148 @default.
- W2896924405 hasConcept C50644808 @default.
- W2896924405 hasConcept C76155785 @default.
- W2896924405 hasConcept C77805123 @default.
- W2896924405 hasConceptScore W2896924405C113364801 @default.
- W2896924405 hasConceptScore W2896924405C119599485 @default.
- W2896924405 hasConceptScore W2896924405C119857082 @default.