Matches in SemOpenAlex for { <https://semopenalex.org/work/W2892267807> ?p ?o ?g. }
Showing items 1 to 82 of 82 with 100 items per page.
- W2892267807 abstract "Reinforcement Learning methods are capable of solving complex problems, but resulting policies might perform poorly in environments that are even slightly different. In robotics especially, training and deployment conditions often vary and data collection is expensive, making retraining undesirable. Simulation training allows for feasible training times, but on the other hand suffers from a reality-gap when applied in real-world settings. This raises the need of efficient adaptation of policies acting in new environments. We consider this as a problem of transferring knowledge within a family of similar Markov decision processes. For this purpose we assume that Q-functions are generated by some low-dimensional latent variable. Given such a Q-function, we can find a master policy that can adapt given different values of this latent variable. Our method learns both the generative mapping and an approximate posterior of the latent variables, enabling identification of policies for new tasks by searching only in the latent space, rather than the space of all policies. The low-dimensional space, and master policy found by our method enables policies to quickly adapt to new environments. We demonstrate the method on both a pendulum swing-up task in simulation, and for simulation-to-real transfer on a pushing task." @default.
- W2892267807 created "2018-09-27" @default.
- W2892267807 creator A5023785357 @default.
- W2892267807 creator A5023792180 @default.
- W2892267807 creator A5034276322 @default.
- W2892267807 date "2018-09-10" @default.
- W2892267807 modified "2023-09-27" @default.
- W2892267807 title "VPE: Variational Policy Embedding for Transfer Reinforcement Learning" @default.
- W2892267807 hasPublicationYear "2018" @default.
- W2892267807 type Work @default.
- W2892267807 sameAs 2892267807 @default.
- W2892267807 citedByCount "1" @default.
- W2892267807 countsByYear W28922678072020 @default.
- W2892267807 crossrefType "posted-content" @default.
- W2892267807 hasAuthorship W2892267807A5023785357 @default.
- W2892267807 hasAuthorship W2892267807A5023792180 @default.
- W2892267807 hasAuthorship W2892267807A5034276322 @default.
- W2892267807 hasConcept C105795698 @default.
- W2892267807 hasConcept C106189395 @default.
- W2892267807 hasConcept C111919701 @default.
- W2892267807 hasConcept C119857082 @default.
- W2892267807 hasConcept C120665830 @default.
- W2892267807 hasConcept C121332964 @default.
- W2892267807 hasConcept C139807058 @default.
- W2892267807 hasConcept C150899416 @default.
- W2892267807 hasConcept C154945302 @default.
- W2892267807 hasConcept C159886148 @default.
- W2892267807 hasConcept C162324750 @default.
- W2892267807 hasConcept C187736073 @default.
- W2892267807 hasConcept C2778572836 @default.
- W2892267807 hasConcept C2780451532 @default.
- W2892267807 hasConcept C33923547 @default.
- W2892267807 hasConcept C41008148 @default.
- W2892267807 hasConcept C41608201 @default.
- W2892267807 hasConcept C51167844 @default.
- W2892267807 hasConcept C97541855 @default.
- W2892267807 hasConceptScore W2892267807C105795698 @default.
- W2892267807 hasConceptScore W2892267807C106189395 @default.
- W2892267807 hasConceptScore W2892267807C111919701 @default.
- W2892267807 hasConceptScore W2892267807C119857082 @default.
- W2892267807 hasConceptScore W2892267807C120665830 @default.
- W2892267807 hasConceptScore W2892267807C121332964 @default.
- W2892267807 hasConceptScore W2892267807C139807058 @default.
- W2892267807 hasConceptScore W2892267807C150899416 @default.
- W2892267807 hasConceptScore W2892267807C154945302 @default.
- W2892267807 hasConceptScore W2892267807C159886148 @default.
- W2892267807 hasConceptScore W2892267807C162324750 @default.
- W2892267807 hasConceptScore W2892267807C187736073 @default.
- W2892267807 hasConceptScore W2892267807C2778572836 @default.
- W2892267807 hasConceptScore W2892267807C2780451532 @default.
- W2892267807 hasConceptScore W2892267807C33923547 @default.
- W2892267807 hasConceptScore W2892267807C41008148 @default.
- W2892267807 hasConceptScore W2892267807C41608201 @default.
- W2892267807 hasConceptScore W2892267807C51167844 @default.
- W2892267807 hasConceptScore W2892267807C97541855 @default.
- W2892267807 hasLocation W28922678071 @default.
- W2892267807 hasOpenAccess W2892267807 @default.
- W2892267807 hasPrimaryLocation W28922678071 @default.
- W2892267807 hasRelatedWork W1973102501 @default.
- W2892267807 hasRelatedWork W2121103318 @default.
- W2892267807 hasRelatedWork W2147032798 @default.
- W2892267807 hasRelatedWork W2294805292 @default.
- W2892267807 hasRelatedWork W23935843 @default.
- W2892267807 hasRelatedWork W2513373085 @default.
- W2892267807 hasRelatedWork W2770014065 @default.
- W2892267807 hasRelatedWork W2895958971 @default.
- W2892267807 hasRelatedWork W2950004691 @default.
- W2892267807 hasRelatedWork W2968021416 @default.
- W2892267807 hasRelatedWork W2968652061 @default.
- W2892267807 hasRelatedWork W2991616621 @default.
- W2892267807 hasRelatedWork W2999490157 @default.
- W2892267807 hasRelatedWork W3039386753 @default.
- W2892267807 hasRelatedWork W3080901109 @default.
- W2892267807 hasRelatedWork W3129741749 @default.
- W2892267807 hasRelatedWork W3151079898 @default.
- W2892267807 hasRelatedWork W3156082075 @default.
- W2892267807 hasRelatedWork W3159952121 @default.
- W2892267807 hasRelatedWork W3207608125 @default.
- W2892267807 isParatext "false" @default.
- W2892267807 isRetracted "false" @default.
- W2892267807 magId "2892267807" @default.
- W2892267807 workType "article" @default.