Matches in SemOpenAlex for { <https://semopenalex.org/work/W3044290186> ?p ?o ?g. }
- W3044290186 abstract "We present a data-efficient framework for solving visuomotor sequential decision-making problems which exploits the combination of reinforcement learning (RL) and latent variable generative models. Our framework trains deep visuomotor policies by introducing an action latent variable such that the feed-forward policy search can be divided into three parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable, and (iii) supervised training of the deep visuomotor policy in an end-to-end fashion. Our approach enables safe exploration and alleviates the data-inefficiency problem as it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluation of generative models such that we are able to predict the performance of the RL policy training prior to the actual training on a physical robot. We define two novel measures of disentanglement and local linearity for assessing the quality of latent representations, and complement them with existing measures for assessment of the learned distribution. We experimentally determine the characteristics of different generative models that have the most influence on performance of the final policy training on a robotic picking task." @default.
- W3044290186 created "2020-07-29" @default.
- W3044290186 creator A5005900989 @default.
- W3044290186 creator A5023792180 @default.
- W3044290186 creator A5038342432 @default.
- W3044290186 creator A5080940147 @default.
- W3044290186 creator A5082269387 @default.
- W3044290186 date "2020-07-26" @default.
- W3044290186 modified "2023-10-18" @default.
- W3044290186 title "Data-efficient visuomotor policy training using reinforcement learning and generative models." @default.
- W3044290186 cites W1630678085 @default.
- W3044290186 cites W1771410628 @default.
- W3044290186 cites W1959608418 @default.
- W3044290186 cites W2012587148 @default.
- W3044290186 cites W2098284983 @default.
- W3044290186 cites W2099471712 @default.
- W3044290186 cites W2123967136 @default.
- W3044290186 cites W2125612430 @default.
- W3044290186 cites W2127107099 @default.
- W3044290186 cites W2136719407 @default.
- W3044290186 cites W2163922914 @default.
- W3044290186 cites W2211399972 @default.
- W3044290186 cites W2212660284 @default.
- W3044290186 cites W2528489519 @default.
- W3044290186 cites W2575705757 @default.
- W3044290186 cites W2592538810 @default.
- W3044290186 cites W2593768305 @default.
- W3044290186 cites W2605102758 @default.
- W3044290186 cites W2736601468 @default.
- W3044290186 cites W2753738274 @default.
- W3044290186 cites W2766614170 @default.
- W3044290186 cites W2785519580 @default.
- W3044290186 cites W2785961484 @default.
- W3044290186 cites W2786019934 @default.
- W3044290186 cites W2787053496 @default.
- W3044290186 cites W2805984778 @default.
- W3044290186 cites W2893749619 @default.
- W3044290186 cites W2896583015 @default.
- W3044290186 cites W2902476877 @default.
- W3044290186 cites W2903538854 @default.
- W3044290186 cites W2904849495 @default.
- W3044290186 cites W2908006003 @default.
- W3044290186 cites W2911300280 @default.
- W3044290186 cites W2918049070 @default.
- W3044290186 cites W2923023063 @default.
- W3044290186 cites W2936845163 @default.
- W3044290186 cites W2945358157 @default.
- W3044290186 cites W2946169124 @default.
- W3044290186 cites W2950004691 @default.
- W3044290186 cites W2950095160 @default.
- W3044290186 cites W2962736495 @default.
- W3044290186 cites W2962759351 @default.
- W3044290186 cites W2962854145 @default.
- W3044290186 cites W2962897886 @default.
- W3044290186 cites W2962919088 @default.
- W3044290186 cites W2963045453 @default.
- W3044290186 cites W2963146015 @default.
- W3044290186 cites W2963226019 @default.
- W3044290186 cites W2963366547 @default.
- W3044290186 cites W2963373786 @default.
- W3044290186 cites W2963403593 @default.
- W3044290186 cites W2963484919 @default.
- W3044290186 cites W2963634205 @default.
- W3044290186 cites W2963689319 @default.
- W3044290186 cites W2963780790 @default.
- W3044290186 cites W2963800363 @default.
- W3044290186 cites W2963981733 @default.
- W3044290186 cites W2964097578 @default.
- W3044290186 cites W2964127395 @default.
- W3044290186 cites W2964161785 @default.
- W3044290186 cites W2971127577 @default.
- W3044290186 cites W2971128425 @default.
- W3044290186 cites W2981141426 @default.
- W3044290186 cites W2981626359 @default.
- W3044290186 cites W2987221721 @default.
- W3044290186 cites W2999811719 @default.
- W3044290186 cites W3004116079 @default.
- W3044290186 cites W3011466691 @default.
- W3044290186 cites W3012835354 @default.
- W3044290186 cites W3016943329 @default.
- W3044290186 cites W3030981716 @default.
- W3044290186 cites W3035429611 @default.
- W3044290186 cites W3090612618 @default.
- W3044290186 cites W3090943552 @default.
- W3044290186 cites W3101442004 @default.
- W3044290186 cites W3159890710 @default.
- W3044290186 hasPublicationYear "2020" @default.
- W3044290186 type Work @default.
- W3044290186 sameAs 3044290186 @default.
- W3044290186 citedByCount "3" @default.
- W3044290186 countsByYear W30442901862020 @default.
- W3044290186 countsByYear W30442901862021 @default.
- W3044290186 crossrefType "posted-content" @default.
- W3044290186 hasAuthorship W3044290186A5005900989 @default.
- W3044290186 hasAuthorship W3044290186A5023792180 @default.
- W3044290186 hasAuthorship W3044290186A5038342432 @default.
- W3044290186 hasAuthorship W3044290186A5080940147 @default.
- W3044290186 hasAuthorship W3044290186A5082269387 @default.
- W3044290186 hasConcept C119857082 @default.
- W3044290186 hasConcept C121332964 @default.