Matches in SemOpenAlex for { <https://semopenalex.org/work/W3021796787> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W3021796787 endingPage "453" @default.
- W3021796787 startingPage "437" @default.
- W3021796787 abstract "Model-free policy learning has enabled good performance on complex tasks that were previously intractable with traditional control techniques. However, this comes at the cost of requiring a perfectly accurate model for training. This is infeasible due to the very high sample complexity of model-free methods preventing training on the target system. This renders such methods unsuitable for physical systems. Model mismatch due to dynamics parameter differences and unmodeled dynamics error may cause suboptimal or unsafe behavior upon direct transfer. We introduce the Adaptive Policy Transfer for Stochastic Dynamics (AdaPT) algorithm that achieves provably safe and robust, dynamically-feasible zero-shot transfer of RL-policies to new domains with dynamics error. AdaPT combines the strengths of offline policy learning in a black-box source simulator with online tube-based MPC to attenuate bounded dynamics mismatch between the source and target dynamics. AdaPT allows online transfer of policies, trained solely in a simulation offline, to a family of unknown targets without fine-tuning. We also formally show that (i) AdaPT guarantees bounded state and control deviation through state-action tubes under relatively weak technical assumptions and, (ii) AdaPT results in a bounded loss of reward accumulation relative to a policy trained and evaluated in the source environment. We evaluate AdaPT on 2 continuous, non-holonomic simulated dynamical systems with 4 different disturbance models, and find that AdaPT performs between 50 and 300% better on mean reward accrual than direct policy transfer." @default.
- W3021796787 created "2020-05-13" @default.
- W3021796787 creator A5030826237 @default.
- W3021796787 creator A5031821299 @default.
- W3021796787 creator A5042646536 @default.
- W3021796787 creator A5050003000 @default.
- W3021796787 creator A5061193324 @default.
- W3021796787 creator A5082946919 @default.
- W3021796787 creator A5091869385 @default.
- W3021796787 date "2019-11-28" @default.
- W3021796787 modified "2023-09-26" @default.
- W3021796787 title "AdaPT: Zero-Shot Adaptive Policy Transfer for Stochastic Dynamical Systems" @default.
- W3021796787 cites W1603365166 @default.
- W3021796787 cites W2054585537 @default.
- W3021796787 cites W2066425650 @default.
- W3021796787 cites W2087617385 @default.
- W3021796787 cites W2132602063 @default.
- W3021796787 cites W2145339207 @default.
- W3021796787 cites W2205975260 @default.
- W3021796787 cites W23576351 @default.
- W3021796787 cites W2418368699 @default.
- W3021796787 cites W2737223130 @default.
- W3021796787 cites W2773691349 @default.
- W3021796787 doi "https://doi.org/10.1007/978-3-030-28619-4_34" @default.
- W3021796787 hasPublicationYear "2019" @default.
- W3021796787 type Work @default.
- W3021796787 sameAs 3021796787 @default.
- W3021796787 citedByCount "9" @default.
- W3021796787 countsByYear W30217967872020 @default.
- W3021796787 countsByYear W30217967872021 @default.
- W3021796787 countsByYear W30217967872022 @default.
- W3021796787 countsByYear W30217967872023 @default.
- W3021796787 crossrefType "book-chapter" @default.
- W3021796787 hasAuthorship W3021796787A5030826237 @default.
- W3021796787 hasAuthorship W3021796787A5031821299 @default.
- W3021796787 hasAuthorship W3021796787A5042646536 @default.
- W3021796787 hasAuthorship W3021796787A5050003000 @default.
- W3021796787 hasAuthorship W3021796787A5061193324 @default.
- W3021796787 hasAuthorship W3021796787A5082946919 @default.
- W3021796787 hasAuthorship W3021796787A5091869385 @default.
- W3021796787 hasBestOaLocation W30217967872 @default.
- W3021796787 hasConcept C105795698 @default.
- W3021796787 hasConcept C106189395 @default.
- W3021796787 hasConcept C126255220 @default.
- W3021796787 hasConcept C134306372 @default.
- W3021796787 hasConcept C154945302 @default.
- W3021796787 hasConcept C159886148 @default.
- W3021796787 hasConcept C183356978 @default.
- W3021796787 hasConcept C2775924081 @default.
- W3021796787 hasConcept C33923547 @default.
- W3021796787 hasConcept C34388435 @default.
- W3021796787 hasConcept C41008148 @default.
- W3021796787 hasConcept C47446073 @default.
- W3021796787 hasConcept C77405623 @default.
- W3021796787 hasConceptScore W3021796787C105795698 @default.
- W3021796787 hasConceptScore W3021796787C106189395 @default.
- W3021796787 hasConceptScore W3021796787C126255220 @default.
- W3021796787 hasConceptScore W3021796787C134306372 @default.
- W3021796787 hasConceptScore W3021796787C154945302 @default.
- W3021796787 hasConceptScore W3021796787C159886148 @default.
- W3021796787 hasConceptScore W3021796787C183356978 @default.
- W3021796787 hasConceptScore W3021796787C2775924081 @default.
- W3021796787 hasConceptScore W3021796787C33923547 @default.
- W3021796787 hasConceptScore W3021796787C34388435 @default.
- W3021796787 hasConceptScore W3021796787C41008148 @default.
- W3021796787 hasConceptScore W3021796787C47446073 @default.
- W3021796787 hasConceptScore W3021796787C77405623 @default.
- W3021796787 hasLocation W30217967871 @default.
- W3021796787 hasLocation W30217967872 @default.
- W3021796787 hasOpenAccess W3021796787 @default.
- W3021796787 hasPrimaryLocation W30217967871 @default.
- W3021796787 hasRelatedWork W1548704398 @default.
- W3021796787 hasRelatedWork W2003865124 @default.
- W3021796787 hasRelatedWork W2036803292 @default.
- W3021796787 hasRelatedWork W2056209398 @default.
- W3021796787 hasRelatedWork W2082034027 @default.
- W3021796787 hasRelatedWork W2094067382 @default.
- W3021796787 hasRelatedWork W2183877482 @default.
- W3021796787 hasRelatedWork W3033909816 @default.
- W3021796787 hasRelatedWork W4281744853 @default.
- W3021796787 hasRelatedWork W4312190622 @default.
- W3021796787 isParatext "false" @default.
- W3021796787 isRetracted "false" @default.
- W3021796787 magId "3021796787" @default.
- W3021796787 workType "book-chapter" @default.