Matches in SemOpenAlex for { <https://semopenalex.org/work/W2938421504> ?p ?o ?g. }
- W2938421504 abstract "The combination of deep neural network models and reinforcement learning algorithms can make it possible to learn policies for robotic behaviors that directly read in raw sensory inputs, such as camera images, effectively subsuming both estimation and control into one model. However, real-world applications of reinforcement learning must specify the goal of the task by means of a manually programmed reward function, which in practice requires either designing the very same perception pipeline that end-to-end reinforcement learning promises to avoid, or else instrumenting the environment with additional sensors to determine if the task has been performed successfully. In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task. While requesting labels for every single state would amount to asking the user to manually provide the reward signal, our method requires labels for only a tiny fraction of the states seen during training, making it an efficient and practical approach for learning skills without manually engineered rewards. We evaluate our method on real-world robotic manipulation tasks where the observations consist of images viewed by the robot's camera. In our experiments, our method effectively learns to arrange objects, place books, and drape cloth, directly from images and without any manually specified reward functions, and with only 1-4 hours of interaction with the real world." @default.
- W2938421504 created "2019-04-25" @default.
- W2938421504 creator A5005431772 @default.
- W2938421504 creator A5026322200 @default.
- W2938421504 creator A5071640510 @default.
- W2938421504 creator A5078419629 @default.
- W2938421504 creator A5078558551 @default.
- W2938421504 date "2019-06-22" @default.
- W2938421504 modified "2023-10-01" @default.
- W2938421504 title "End-To-End Robotic Reinforcement Learning without Reward Engineering" @default.
- W2938421504 cites W1771410628 @default.
- W2938421504 cites W1929981607 @default.
- W2938421504 cites W1999874108 @default.
- W2938421504 cites W2016765487 @default.
- W2938421504 cites W2057134775 @default.
- W2938421504 cites W2061562262 @default.
- W2938421504 cites W2098774185 @default.
- W2938421504 cites W2123694750 @default.
- W2938421504 cites W2156163138 @default.
- W2938421504 cites W2169498096 @default.
- W2938421504 cites W218896052 @default.
- W2938421504 cites W2201912979 @default.
- W2938421504 cites W2210408922 @default.
- W2938421504 cites W2228499913 @default.
- W2938421504 cites W2529601334 @default.
- W2938421504 cites W2529658650 @default.
- W2938421504 cites W2534269850 @default.
- W2938421504 cites W2552241273 @default.
- W2938421504 cites W2594103415 @default.
- W2938421504 cites W2626804490 @default.
- W2938421504 cites W2734346390 @default.
- W2938421504 cites W2781726626 @default.
- W2938421504 cites W2799151646 @default.
- W2938421504 cites W2885163910 @default.
- W2938421504 cites W2889990052 @default.
- W2938421504 cites W2890026535 @default.
- W2938421504 cites W2902125520 @default.
- W2938421504 cites W2904246096 @default.
- W2938421504 cites W2949608212 @default.
- W2938421504 cites W2963099438 @default.
- W2938421504 cites W2963277051 @default.
- W2938421504 cites W2963399829 @default.
- W2938421504 cites W2963411833 @default.
- W2938421504 cites W2963484919 @default.
- W2938421504 cites W2963504951 @default.
- W2938421504 cites W2963508354 @default.
- W2938421504 cites W2963516265 @default.
- W2938421504 cites W2963590100 @default.
- W2938421504 cites W2963823230 @default.
- W2938421504 cites W2964121744 @default.
- W2938421504 cites W2964153729 @default.
- W2938421504 cites W2964237810 @default.
- W2938421504 cites W2967355195 @default.
- W2938421504 cites W64088143 @default.
- W2938421504 doi "https://doi.org/10.15607/rss.2019.xv.073" @default.
- W2938421504 hasPublicationYear "2019" @default.
- W2938421504 type Work @default.
- W2938421504 sameAs 2938421504 @default.
- W2938421504 citedByCount "125" @default.
- W2938421504 countsByYear W29384215042019 @default.
- W2938421504 countsByYear W29384215042020 @default.
- W2938421504 countsByYear W29384215042021 @default.
- W2938421504 countsByYear W29384215042022 @default.
- W2938421504 countsByYear W29384215042023 @default.
- W2938421504 crossrefType "proceedings-article" @default.
- W2938421504 hasAuthorship W2938421504A5005431772 @default.
- W2938421504 hasAuthorship W2938421504A5026322200 @default.
- W2938421504 hasAuthorship W2938421504A5071640510 @default.
- W2938421504 hasAuthorship W2938421504A5078419629 @default.
- W2938421504 hasAuthorship W2938421504A5078558551 @default.
- W2938421504 hasBestOaLocation W29384215041 @default.
- W2938421504 hasConcept C154945302 @default.
- W2938421504 hasConcept C15744967 @default.
- W2938421504 hasConcept C41008148 @default.
- W2938421504 hasConcept C67203356 @default.
- W2938421504 hasConcept C74296488 @default.
- W2938421504 hasConcept C77805123 @default.
- W2938421504 hasConcept C97541855 @default.
- W2938421504 hasConceptScore W2938421504C154945302 @default.
- W2938421504 hasConceptScore W2938421504C15744967 @default.
- W2938421504 hasConceptScore W2938421504C41008148 @default.
- W2938421504 hasConceptScore W2938421504C67203356 @default.
- W2938421504 hasConceptScore W2938421504C74296488 @default.
- W2938421504 hasConceptScore W2938421504C77805123 @default.
- W2938421504 hasConceptScore W2938421504C97541855 @default.
- W2938421504 hasLocation W29384215041 @default.
- W2938421504 hasLocation W29384215042 @default.
- W2938421504 hasOpenAccess W2938421504 @default.
- W2938421504 hasPrimaryLocation W29384215041 @default.
- W2938421504 hasRelatedWork W2185410470 @default.
- W2938421504 hasRelatedWork W260766989 @default.
- W2938421504 hasRelatedWork W2909304650 @default.
- W2938421504 hasRelatedWork W2959276766 @default.
- W2938421504 hasRelatedWork W3074294383 @default.
- W2938421504 hasRelatedWork W3111983280 @default.
- W2938421504 hasRelatedWork W3139193008 @default.
- W2938421504 hasRelatedWork W3164468573 @default.
- W2938421504 hasRelatedWork W4206669594 @default.
- W2938421504 hasRelatedWork W4295941380 @default.
- W2938421504 isParatext "false" @default.