Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287979078> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4287979078 abstract "Autonomous agents can learn by imitating teacher demonstrations of the intended behavior. Hierarchical control policies are ubiquitously useful for such learning, having the potential to break down structured tasks into simpler sub-tasks, thereby improving data efficiency and generalization. In this paper, we propose a variational inference method for imitation learning of a control policy represented by parametrized hierarchical procedures (PHP), a program-like structure in which procedures can invoke sub-procedures to perform sub-tasks. Our method discovers the hierarchical structure in a dataset of observation-action traces of teacher demonstrations, by learning an approximate posterior distribution over the latent sequence of procedure calls and terminations. Samples from this learned distribution then guide the training of the hierarchical control policy. We identify and demonstrate a novel benefit of variational inference in the context of hierarchical imitation learning: in decomposing the policy into simpler procedures, inference can leverage acausal information that is unused by other methods. Training PHP with variational inference outperforms LSTM baselines in terms of data efficiency and generalization, requiring less than half as much data to achieve a 24% error rate in executing the bubble sort algorithm, and to achieve no error in executing Karel programs." @default.
- W4287979078 created "2022-07-26" @default.
- W4287979078 creator A5004266620 @default.
- W4287979078 creator A5019426968 @default.
- W4287979078 creator A5034429817 @default.
- W4287979078 creator A5041920173 @default.
- W4287979078 creator A5049349154 @default.
- W4287979078 creator A5050068749 @default.
- W4287979078 creator A5050342525 @default.
- W4287979078 creator A5057865970 @default.
- W4287979078 date "2019-12-29" @default.
- W4287979078 modified "2023-09-27" @default.
- W4287979078 title "Hierarchical Variational Imitation Learning of Control Programs" @default.
- W4287979078 doi "https://doi.org/10.48550/arxiv.1912.12612" @default.
- W4287979078 hasPublicationYear "2019" @default.
- W4287979078 type Work @default.
- W4287979078 citedByCount "0" @default.
- W4287979078 crossrefType "posted-content" @default.
- W4287979078 hasAuthorship W4287979078A5004266620 @default.
- W4287979078 hasAuthorship W4287979078A5019426968 @default.
- W4287979078 hasAuthorship W4287979078A5034429817 @default.
- W4287979078 hasAuthorship W4287979078A5041920173 @default.
- W4287979078 hasAuthorship W4287979078A5049349154 @default.
- W4287979078 hasAuthorship W4287979078A5050068749 @default.
- W4287979078 hasAuthorship W4287979078A5050342525 @default.
- W4287979078 hasAuthorship W4287979078A5057865970 @default.
- W4287979078 hasBestOaLocation W42879790781 @default.
- W4287979078 hasConcept C119857082 @default.
- W4287979078 hasConcept C126388530 @default.
- W4287979078 hasConcept C134306372 @default.
- W4287979078 hasConcept C153083717 @default.
- W4287979078 hasConcept C154945302 @default.
- W4287979078 hasConcept C15744967 @default.
- W4287979078 hasConcept C177148314 @default.
- W4287979078 hasConcept C22367795 @default.
- W4287979078 hasConcept C2775924081 @default.
- W4287979078 hasConcept C2776214188 @default.
- W4287979078 hasConcept C2778112365 @default.
- W4287979078 hasConcept C33923547 @default.
- W4287979078 hasConcept C41008148 @default.
- W4287979078 hasConcept C54355233 @default.
- W4287979078 hasConcept C77805123 @default.
- W4287979078 hasConcept C86803240 @default.
- W4287979078 hasConceptScore W4287979078C119857082 @default.
- W4287979078 hasConceptScore W4287979078C126388530 @default.
- W4287979078 hasConceptScore W4287979078C134306372 @default.
- W4287979078 hasConceptScore W4287979078C153083717 @default.
- W4287979078 hasConceptScore W4287979078C154945302 @default.
- W4287979078 hasConceptScore W4287979078C15744967 @default.
- W4287979078 hasConceptScore W4287979078C177148314 @default.
- W4287979078 hasConceptScore W4287979078C22367795 @default.
- W4287979078 hasConceptScore W4287979078C2775924081 @default.
- W4287979078 hasConceptScore W4287979078C2776214188 @default.
- W4287979078 hasConceptScore W4287979078C2778112365 @default.
- W4287979078 hasConceptScore W4287979078C33923547 @default.
- W4287979078 hasConceptScore W4287979078C41008148 @default.
- W4287979078 hasConceptScore W4287979078C54355233 @default.
- W4287979078 hasConceptScore W4287979078C77805123 @default.
- W4287979078 hasConceptScore W4287979078C86803240 @default.
- W4287979078 hasLocation W42879790781 @default.
- W4287979078 hasOpenAccess W4287979078 @default.
- W4287979078 hasPrimaryLocation W42879790781 @default.
- W4287979078 hasRelatedWork W1781547478 @default.
- W4287979078 hasRelatedWork W2953238046 @default.
- W4287979078 hasRelatedWork W2963058055 @default.
- W4287979078 hasRelatedWork W2997732735 @default.
- W4287979078 hasRelatedWork W3038853105 @default.
- W4287979078 hasRelatedWork W3126080939 @default.
- W4287979078 hasRelatedWork W3135007252 @default.
- W4287979078 hasRelatedWork W4287979078 @default.
- W4287979078 hasRelatedWork W4300631627 @default.
- W4287979078 hasRelatedWork W4323521277 @default.
- W4287979078 isParatext "false" @default.
- W4287979078 isRetracted "false" @default.
- W4287979078 workType "article" @default.
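The rows above follow a regular shape: a subject, a predicate, an object (either a quoted literal or a bare identifier), and a graph name after `@`. As a minimal sketch, assuming that shape holds for every row, the listing could be read into tuples as below. The names `LINE_RE` and `parse_line` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re

# Hypothetical pattern for one listing row: "- SUBJECT PREDICATE OBJECT @GRAPH."
# The object is either a double-quoted literal or a bare identifier.
# This is an assumption about the page layout, not a SemOpenAlex format spec.
LINE_RE = re.compile(r'^- (\S+) (\S+) (?:"(.*)"|(\S+)) @(\S+?)\.?$', re.DOTALL)


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the row does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, literal, bare, graph = m.groups()
    # Exactly one of the two object groups is set, depending on the quoting.
    obj = literal if literal is not None else bare
    return subject, predicate, obj, graph


# Three rows copied from the listing above.
sample = [
    '- W4287979078 date "2019-12-29" @default.',
    '- W4287979078 type Work @default.',
    '- W4287979078 hasRelatedWork W1781547478 @default.',
]
quads = [parse_line(row) for row in sample]
for quad in quads:
    print(quad)
```

Rows that do not fit the assumed shape come back as `None`, so they can be inspected by hand instead of being silently misread.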