Matches in SemOpenAlex for { <https://semopenalex.org/work/W4372269771> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W4372269771 abstract "We study the problem of imitation learning in automated decision systems, in which a learner is trained to imitate an expert demonstrator. A widely used method is adversarial imitation learning that alternately optimizes a generator (learner) and a discriminator (reward function). However, the discriminator is biased during the initial and intermediate training stages. Consequently, the gradient descent direction of the learner is misguided, which leads to unstable training and sample complexity. In this paper, we propose deep imitation learning through a guidance-based policy distillation (GIL) algorithm. First, GIL proposes a teacher model, the guidance-based variational autoencoder, which is pre-trained with expert demonstrations. Then, GIL proposes a perturbation-based policy distillation method that uses the teacher model to guide the learner in the correct optimization direction, enabling the learner to imitate the expert policy with fewer detours. The experimental results show that our approach achieves higher sample efficiency compared with multiple baselines." @default.
- W4372269771 created "2023-05-07" @default.
- W4372269771 creator A5026654212 @default.
- W4372269771 creator A5060107207 @default.
- W4372269771 creator A5060264631 @default.
- W4372269771 creator A5089057317 @default.
- W4372269771 date "2023-06-04" @default.
- W4372269771 modified "2023-10-14" @default.
- W4372269771 title "A Perturbation-Based Policy Distillation Framework with Generative Adversarial Nets" @default.
- W4372269771 cites W2911087563 @default.
- W4372269771 cites W2945559256 @default.
- W4372269771 cites W2963802910 @default.
- W4372269771 cites W2985455643 @default.
- W4372269771 cites W3000681444 @default.
- W4372269771 cites W3034368386 @default.
- W4372269771 cites W3138984732 @default.
- W4372269771 cites W3191808899 @default.
- W4372269771 doi "https://doi.org/10.1109/icassp49357.2023.10096272" @default.
- W4372269771 hasPublicationYear "2023" @default.
- W4372269771 type Work @default.
- W4372269771 citedByCount "0" @default.
- W4372269771 crossrefType "proceedings-article" @default.
- W4372269771 hasAuthorship W4372269771A5026654212 @default.
- W4372269771 hasAuthorship W4372269771A5060107207 @default.
- W4372269771 hasAuthorship W4372269771A5060264631 @default.
- W4372269771 hasAuthorship W4372269771A5089057317 @default.
- W4372269771 hasBestOaLocation W43722697711 @default.
- W4372269771 hasConcept C101738243 @default.
- W4372269771 hasConcept C119857082 @default.
- W4372269771 hasConcept C154945302 @default.
- W4372269771 hasConcept C178790620 @default.
- W4372269771 hasConcept C185592680 @default.
- W4372269771 hasConcept C198531522 @default.
- W4372269771 hasConcept C204030448 @default.
- W4372269771 hasConcept C2779803651 @default.
- W4372269771 hasConcept C37736160 @default.
- W4372269771 hasConcept C39890363 @default.
- W4372269771 hasConcept C41008148 @default.
- W4372269771 hasConcept C43617362 @default.
- W4372269771 hasConcept C50644808 @default.
- W4372269771 hasConcept C76155785 @default.
- W4372269771 hasConcept C94915269 @default.
- W4372269771 hasConceptScore W4372269771C101738243 @default.
- W4372269771 hasConceptScore W4372269771C119857082 @default.
- W4372269771 hasConceptScore W4372269771C154945302 @default.
- W4372269771 hasConceptScore W4372269771C178790620 @default.
- W4372269771 hasConceptScore W4372269771C185592680 @default.
- W4372269771 hasConceptScore W4372269771C198531522 @default.
- W4372269771 hasConceptScore W4372269771C204030448 @default.
- W4372269771 hasConceptScore W4372269771C2779803651 @default.
- W4372269771 hasConceptScore W4372269771C37736160 @default.
- W4372269771 hasConceptScore W4372269771C39890363 @default.
- W4372269771 hasConceptScore W4372269771C41008148 @default.
- W4372269771 hasConceptScore W4372269771C43617362 @default.
- W4372269771 hasConceptScore W4372269771C50644808 @default.
- W4372269771 hasConceptScore W4372269771C76155785 @default.
- W4372269771 hasConceptScore W4372269771C94915269 @default.
- W4372269771 hasFunder F4320321001 @default.
- W4372269771 hasLocation W43722697711 @default.
- W4372269771 hasOpenAccess W4372269771 @default.
- W4372269771 hasPrimaryLocation W43722697711 @default.
- W4372269771 hasRelatedWork W2412510955 @default.
- W4372269771 hasRelatedWork W2564929713 @default.
- W4372269771 hasRelatedWork W2809882560 @default.
- W4372269771 hasRelatedWork W2952936466 @default.
- W4372269771 hasRelatedWork W3005996785 @default.
- W4372269771 hasRelatedWork W3119931323 @default.
- W4372269771 hasRelatedWork W4210395953 @default.
- W4372269771 hasRelatedWork W4280544492 @default.
- W4372269771 hasRelatedWork W4293320219 @default.
- W4372269771 hasRelatedWork W4386984417 @default.
- W4372269771 isParatext "false" @default.
- W4372269771 isRetracted "false" @default.
- W4372269771 workType "article" @default.