Matches in SemOpenAlex for { <https://semopenalex.org/work/W3022183679> ?p ?o ?g. }
- W3022183679 abstract "Robot control problems are often structured with a policy function that maps state values into control values, but in many dynamic problems the observed state can have a difficult-to-characterize relationship with useful policy actions. In this paper we present a new method for learning state embeddings from plans or other forms of demonstrations such that the embedding space has a specified geometric relationship with the demonstrations. We present a novel variational framework for learning these embeddings that attempts to optimize trajectory linearity in the learned embedding space. We show how these embedding spaces can then be used as an augmentation to the robot state in reinforcement learning problems. We use kinodynamic planning to generate training trajectories for some example environments, and then train embedding spaces for these environments. We show empirically that observing a system in the learned embedding space improves the performance of policy gradient reinforcement learning algorithms, particularly by reducing the variance between training runs. Our technique is limited to environments where demonstration data is available, but places no limits on how that data is collected. Our embedding technique provides a way to transfer domain knowledge from existing technologies, such as planning and control algorithms, into more flexible policy learning algorithms by creating an abstract representation of the robot state with meaningful geometry." @default.
- W3022183679 created "2020-05-13" @default.
- W3022183679 creator A5072199290 @default.
- W3022183679 creator A5077367921 @default.
- W3022183679 date "2020-04-30" @default.
- W3022183679 modified "2023-09-27" @default.
- W3022183679 title "Plan-Space State Embeddings for Improved Reinforcement Learning." @default.
- W3022183679 cites W1771410628 @default.
- W3022183679 cites W2061562262 @default.
- W3022183679 cites W2098774185 @default.
- W3022183679 cites W2099471712 @default.
- W3022183679 cites W2110762409 @default.
- W3022183679 cites W2158782408 @default.
- W3022183679 cites W2566467060 @default.
- W3022183679 cites W2736601468 @default.
- W3022183679 cites W2785342287 @default.
- W3022183679 cites W2885010347 @default.
- W3022183679 cites W2911695803 @default.
- W3022183679 cites W2950577311 @default.
- W3022183679 cites W2951004968 @default.
- W3022183679 cites W2963277051 @default.
- W3022183679 cites W2963430173 @default.
- W3022183679 cites W2964121744 @default.
- W3022183679 hasPublicationYear "2020" @default.
- W3022183679 type Work @default.
- W3022183679 sameAs 3022183679 @default.
- W3022183679 citedByCount "0" @default.
- W3022183679 crossrefType "posted-content" @default.
- W3022183679 hasAuthorship W3022183679A5072199290 @default.
- W3022183679 hasAuthorship W3022183679A5077367921 @default.
- W3022183679 hasConcept C105795698 @default.
- W3022183679 hasConcept C111919701 @default.
- W3022183679 hasConcept C11413529 @default.
- W3022183679 hasConcept C119857082 @default.
- W3022183679 hasConcept C121332964 @default.
- W3022183679 hasConcept C1276947 @default.
- W3022183679 hasConcept C13662910 @default.
- W3022183679 hasConcept C14036430 @default.
- W3022183679 hasConcept C154945302 @default.
- W3022183679 hasConcept C166957645 @default.
- W3022183679 hasConcept C17744445 @default.
- W3022183679 hasConcept C199539241 @default.
- W3022183679 hasConcept C2775924081 @default.
- W3022183679 hasConcept C2776359362 @default.
- W3022183679 hasConcept C2776505523 @default.
- W3022183679 hasConcept C2778572836 @default.
- W3022183679 hasConcept C33923547 @default.
- W3022183679 hasConcept C41008148 @default.
- W3022183679 hasConcept C41608201 @default.
- W3022183679 hasConcept C48103436 @default.
- W3022183679 hasConcept C72434380 @default.
- W3022183679 hasConcept C78458016 @default.
- W3022183679 hasConcept C86803240 @default.
- W3022183679 hasConcept C90509273 @default.
- W3022183679 hasConcept C94625758 @default.
- W3022183679 hasConcept C95457728 @default.
- W3022183679 hasConcept C97541855 @default.
- W3022183679 hasConceptScore W3022183679C105795698 @default.
- W3022183679 hasConceptScore W3022183679C111919701 @default.
- W3022183679 hasConceptScore W3022183679C11413529 @default.
- W3022183679 hasConceptScore W3022183679C119857082 @default.
- W3022183679 hasConceptScore W3022183679C121332964 @default.
- W3022183679 hasConceptScore W3022183679C1276947 @default.
- W3022183679 hasConceptScore W3022183679C13662910 @default.
- W3022183679 hasConceptScore W3022183679C14036430 @default.
- W3022183679 hasConceptScore W3022183679C154945302 @default.
- W3022183679 hasConceptScore W3022183679C166957645 @default.
- W3022183679 hasConceptScore W3022183679C17744445 @default.
- W3022183679 hasConceptScore W3022183679C199539241 @default.
- W3022183679 hasConceptScore W3022183679C2775924081 @default.
- W3022183679 hasConceptScore W3022183679C2776359362 @default.
- W3022183679 hasConceptScore W3022183679C2776505523 @default.
- W3022183679 hasConceptScore W3022183679C2778572836 @default.
- W3022183679 hasConceptScore W3022183679C33923547 @default.
- W3022183679 hasConceptScore W3022183679C41008148 @default.
- W3022183679 hasConceptScore W3022183679C41608201 @default.
- W3022183679 hasConceptScore W3022183679C48103436 @default.
- W3022183679 hasConceptScore W3022183679C72434380 @default.
- W3022183679 hasConceptScore W3022183679C78458016 @default.
- W3022183679 hasConceptScore W3022183679C86803240 @default.
- W3022183679 hasConceptScore W3022183679C90509273 @default.
- W3022183679 hasConceptScore W3022183679C94625758 @default.
- W3022183679 hasConceptScore W3022183679C95457728 @default.
- W3022183679 hasConceptScore W3022183679C97541855 @default.
- W3022183679 hasLocation W30221836791 @default.
- W3022183679 hasOpenAccess W3022183679 @default.
- W3022183679 hasPrimaryLocation W30221836791 @default.
- W3022183679 hasRelatedWork W106517363 @default.
- W3022183679 hasRelatedWork W1967736575 @default.
- W3022183679 hasRelatedWork W2012587148 @default.
- W3022183679 hasRelatedWork W2050090126 @default.
- W3022183679 hasRelatedWork W212906305 @default.
- W3022183679 hasRelatedWork W2912722426 @default.
- W3022183679 hasRelatedWork W2959488596 @default.
- W3022183679 hasRelatedWork W2963572779 @default.
- W3022183679 hasRelatedWork W2963619650 @default.
- W3022183679 hasRelatedWork W3023096123 @default.
- W3022183679 hasRelatedWork W3029855818 @default.
- W3022183679 hasRelatedWork W3110876951 @default.
- W3022183679 hasRelatedWork W3129456247 @default.
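
Each row above follows the shape of the `?p ?o ?g` pattern in the header: subject, predicate, object, then the graph name after `@`. A minimal sketch of reading such rows into tuples and grouping the objects by predicate might look like the following. The row layout is an assumption inferred from this listing rather than a documented export format, and `parse_line` and `group_by_predicate` are hypothetical helper names.

```python
import re
from collections import defaultdict

# Assumed row shape (inferred from the listing, not from any specification):
#   - <subject> <predicate> <object> @<graph>.
# where <object> is either a bare identifier or a double-quoted literal.
LINE_RE = re.compile(
    r'^-\s+(?P<s>\S+)\s+(?P<p>\S+)\s+(?P<o>".*"|\S+)\s+@(?P<g>\S+?)\.$'
)


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the row does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    obj = match.group("o")
    # Strip the surrounding quotes from literal values.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return (match.group("s"), match.group("p"), obj, match.group("g"))


def group_by_predicate(lines):
    """Collect the objects of all matching rows under their predicate, in order."""
    grouped = defaultdict(list)
    for line in lines:
        quad = parse_line(line)
        if quad is not None:
            _, predicate, obj, _ = quad
            grouped[predicate].append(obj)
    return dict(grouped)


# A few rows copied from the listing above.
sample = [
    '- W3022183679 created "2020-05-13" @default.',
    '- W3022183679 cites W1771410628 @default.',
    '- W3022183679 cites W2061562262 @default.',
    '- W3022183679 title "Plan-Space State Embeddings for Improved Reinforcement Learning." @default.',
]
grouped = group_by_predicate(sample)
```

Rows that do not fit the assumed shape, such as the "Matches in SemOpenAlex" header, return `None` from `parse_line` and are skipped by `group_by_predicate`.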