Matches in SemOpenAlex for { <https://semopenalex.org/work/W3048128059> ?p ?o ?g. }
- W3048128059 abstract "Many modern methods for imitation learning and inverse reinforcement learning, such as GAIL or AIRL, are based on an adversarial formulation. These methods apply GANs to match the expert's distribution over states and actions with the implicit state-action distribution induced by the agent's policy. However, by framing imitation learning as a saddle point problem, adversarial methods can suffer from unstable optimization, and convergence can only be shown for small policy updates. We address these problems by proposing a framework for non-adversarial imitation learning. The resulting algorithms are similar to their adversarial counterparts and, thus, provide insights for adversarial imitation learning methods. Most notably, we show that AIRL is an instance of our non-adversarial formulation, which enables us to greatly simplify its derivations and obtain stronger convergence guarantees. We also show that our non-adversarial formulation can be used to derive novel algorithms by presenting a method for offline imitation learning that is inspired by the recent ValueDice algorithm, but does not rely on small policy updates for convergence. In our simulated robot experiments, our offline method for non-adversarial imitation learning seems to perform best when using many updates for policy and discriminator at each iteration and outperforms behavioral cloning and ValueDice." @default.
- W3048128059 created "2020-08-13" @default.
- W3048128059 creator A5014519249 @default.
- W3048128059 creator A5088394978 @default.
- W3048128059 date "2020-08-08" @default.
- W3048128059 modified "2023-09-27" @default.
- W3048128059 title "Non-Adversarial Imitation Learning and its Connections to Adversarial Methods." @default.
- W3048128059 cites W1506806321 @default.
- W3048128059 cites W1554348720 @default.
- W3048128059 cites W1959608418 @default.
- W3048128059 cites W1965555277 @default.
- W3048128059 cites W2032558547 @default.
- W3048128059 cites W2050798338 @default.
- W3048128059 cites W2061562262 @default.
- W3048128059 cites W2062291443 @default.
- W3048128059 cites W2098774185 @default.
- W3048128059 cites W2099471712 @default.
- W3048128059 cites W2117675763 @default.
- W3048128059 cites W2133068870 @default.
- W3048128059 cites W2136144249 @default.
- W3048128059 cites W2138537392 @default.
- W3048128059 cites W2142641780 @default.
- W3048128059 cites W2155183960 @default.
- W3048128059 cites W2158782408 @default.
- W3048128059 cites W2166302491 @default.
- W3048128059 cites W2167224731 @default.
- W3048128059 cites W2169498096 @default.
- W3048128059 cites W2403419699 @default.
- W3048128059 cites W2439299270 @default.
- W3048128059 cites W2566467060 @default.
- W3048128059 cites W2566991354 @default.
- W3048128059 cites W2739748921 @default.
- W3048128059 cites W2751302235 @default.
- W3048128059 cites W2794908222 @default.
- W3048128059 cites W2803106361 @default.
- W3048128059 cites W2884247313 @default.
- W3048128059 cites W2914656440 @default.
- W3048128059 cites W2947052139 @default.
- W3048128059 cites W2949916679 @default.
- W3048128059 cites W2962879692 @default.
- W3048128059 cites W2962897886 @default.
- W3048128059 cites W2962902376 @default.
- W3048128059 cites W2963043971 @default.
- W3048128059 cites W2963277051 @default.
- W3048128059 cites W2963508354 @default.
- W3048128059 cites W2963590100 @default.
- W3048128059 cites W2963800509 @default.
- W3048128059 cites W2964052395 @default.
- W3048128059 cites W2964201867 @default.
- W3048128059 cites W2971026276 @default.
- W3048128059 cites W2979719808 @default.
- W3048128059 cites W2994977742 @default.
- W3048128059 cites W2995006730 @default.
- W3048128059 cites W3028821797 @default.
- W3048128059 cites W64088143 @default.
- W3048128059 hasPublicationYear "2020" @default.
- W3048128059 type Work @default.
- W3048128059 sameAs 3048128059 @default.
- W3048128059 citedByCount "5" @default.
- W3048128059 countsByYear W30481280592021 @default.
- W3048128059 crossrefType "posted-content" @default.
- W3048128059 hasAuthorship W3048128059A5014519249 @default.
- W3048128059 hasAuthorship W3048128059A5088394978 @default.
- W3048128059 hasConcept C119857082 @default.
- W3048128059 hasConcept C126255220 @default.
- W3048128059 hasConcept C154945302 @default.
- W3048128059 hasConcept C2779803651 @default.
- W3048128059 hasConcept C33923547 @default.
- W3048128059 hasConcept C37736160 @default.
- W3048128059 hasConcept C41008148 @default.
- W3048128059 hasConcept C76155785 @default.
- W3048128059 hasConcept C94915269 @default.
- W3048128059 hasConcept C97541855 @default.
- W3048128059 hasConceptScore W3048128059C119857082 @default.
- W3048128059 hasConceptScore W3048128059C126255220 @default.
- W3048128059 hasConceptScore W3048128059C154945302 @default.
- W3048128059 hasConceptScore W3048128059C2779803651 @default.
- W3048128059 hasConceptScore W3048128059C33923547 @default.
- W3048128059 hasConceptScore W3048128059C37736160 @default.
- W3048128059 hasConceptScore W3048128059C41008148 @default.
- W3048128059 hasConceptScore W3048128059C76155785 @default.
- W3048128059 hasConceptScore W3048128059C94915269 @default.
- W3048128059 hasConceptScore W3048128059C97541855 @default.
- W3048128059 hasLocation W30481280591 @default.
- W3048128059 hasOpenAccess W3048128059 @default.
- W3048128059 hasPrimaryLocation W30481280591 @default.
- W3048128059 hasRelatedWork W2497864078 @default.
- W3048128059 hasRelatedWork W2560678327 @default.
- W3048128059 hasRelatedWork W2785635021 @default.
- W3048128059 hasRelatedWork W2788730383 @default.
- W3048128059 hasRelatedWork W2891781407 @default.
- W3048128059 hasRelatedWork W2924566522 @default.
- W3048128059 hasRelatedWork W2949384876 @default.
- W3048128059 hasRelatedWork W2952854274 @default.
- W3048128059 hasRelatedWork W2963301010 @default.
- W3048128059 hasRelatedWork W2963597678 @default.
- W3048128059 hasRelatedWork W2963638211 @default.
- W3048128059 hasRelatedWork W2969866298 @default.
- W3048128059 hasRelatedWork W2972816625 @default.
- W3048128059 hasRelatedWork W3040111939 @default.
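
Each match above follows the query pattern in the header: a subject, a predicate (?p), an object (?o), and a graph (?g). As a minimal sketch, the listing could be split back into those four parts and grouped by predicate as shown below. The line layout `- SUBJECT PREDICATE OBJECT @GRAPH.` is inferred from this listing only, and the names `parse_match` and `group_by_predicate` are hypothetical helpers, not part of any SemOpenAlex tooling.

```python
import re
from collections import defaultdict

# Assumption: every match line looks like "- SUBJECT PREDICATE OBJECT @GRAPH."
# This layout is inferred from the listing itself, not from a documented format.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.?$', re.DOTALL)


def parse_match(line):
    """Split one listing line into (subject, predicate, object, graph).

    Returns None when the line does not fit the assumed layout.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Literal objects are wrapped in double quotes; drop the quotes.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect the objects of all parseable lines under their predicate."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_match(line)
        if parsed is not None:
            grouped[parsed[1]].append(parsed[2])
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W3048128059 cites W1506806321 @default.',
    '- W3048128059 cites W1554348720 @default.',
    '- W3048128059 date "2020-08-08" @default.',
    '- W3048128059 hasPublicationYear "2020" @default.',
]

grouped = group_by_predicate(sample)
print(grouped["cites"])
print(parse_match(sample[2]))
```

Grouping by predicate makes it easy to count, for example, how many `cites` or `hasRelatedWork` entries the record carries without reading the whole listing.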