Matches in SemOpenAlex for { <https://semopenalex.org/work/W2899041500> ?p ?o ?g. }
- W2899041500 abstract "We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework. The first problem is implicit bias present in the reward functions used in these algorithms. While these biases might work well for some environments, they can also lead to sub-optimal behavior in others. Secondly, even though these algorithms can learn from few expert demonstrations, they require a prohibitively large number of interactions with the environment in order to imitate the expert for many real-world applications. In order to address these issues, we propose a new algorithm called Discriminator-Actor-Critic that uses off-policy Reinforcement Learning to reduce policy-environment interaction sample complexity by an average factor of 10. Furthermore, since our reward function is designed to be unbiased, we can apply our algorithm to many problems without making any task-specific adjustments." @default.
- W2899041500 created "2018-11-09" @default.
- W2899041500 creator A5000412475 @default.
- W2899041500 creator A5026322200 @default.
- W2899041500 creator A5066924496 @default.
- W2899041500 creator A5086657309 @default.
- W2899041500 date "2018-09-09" @default.
- W2899041500 modified "2023-09-27" @default.
- W2899041500 title "Addressing Sample Inefficiency and Reward Bias in Inverse Reinforcement Learning." @default.
- W2899041500 cites W1771410628 @default.
- W2899041500 cites W1777239053 @default.
- W2899041500 cites W1999874108 @default.
- W2899041500 cites W2061562262 @default.
- W2899041500 cites W2098774185 @default.
- W2899041500 cites W2121863487 @default.
- W2899041500 cites W2174803659 @default.
- W2899041500 cites W2788730383 @default.
- W2899041500 cites W2803616302 @default.
- W2899041500 cites W2950735232 @default.
- W2899041500 cites W2962879692 @default.
- W2899041500 cites W2962957031 @default.
- W2899041500 cites W2963099939 @default.
- W2899041500 cites W2963277051 @default.
- W2899041500 cites W2963328631 @default.
- W2899041500 cites W2963508354 @default.
- W2899041500 cites W2963590100 @default.
- W2899041500 hasPublicationYear "2018" @default.
- W2899041500 type Work @default.
- W2899041500 sameAs 2899041500 @default.
- W2899041500 citedByCount "6" @default.
- W2899041500 countsByYear W28990415002018 @default.
- W2899041500 countsByYear W28990415002019 @default.
- W2899041500 countsByYear W28990415002020 @default.
- W2899041500 crossrefType "posted-content" @default.
- W2899041500 hasAuthorship W2899041500A5000412475 @default.
- W2899041500 hasAuthorship W2899041500A5026322200 @default.
- W2899041500 hasAuthorship W2899041500A5066924496 @default.
- W2899041500 hasAuthorship W2899041500A5086657309 @default.
- W2899041500 hasConcept C119857082 @default.
- W2899041500 hasConcept C127413603 @default.
- W2899041500 hasConcept C14036430 @default.
- W2899041500 hasConcept C154945302 @default.
- W2899041500 hasConcept C162324750 @default.
- W2899041500 hasConcept C175444787 @default.
- W2899041500 hasConcept C185592680 @default.
- W2899041500 hasConcept C198531522 @default.
- W2899041500 hasConcept C201995342 @default.
- W2899041500 hasConcept C2778869765 @default.
- W2899041500 hasConcept C2779803651 @default.
- W2899041500 hasConcept C2780451532 @default.
- W2899041500 hasConcept C37736160 @default.
- W2899041500 hasConcept C41008148 @default.
- W2899041500 hasConcept C43617362 @default.
- W2899041500 hasConcept C76155785 @default.
- W2899041500 hasConcept C78458016 @default.
- W2899041500 hasConcept C86803240 @default.
- W2899041500 hasConcept C94915269 @default.
- W2899041500 hasConcept C97541855 @default.
- W2899041500 hasConceptScore W2899041500C119857082 @default.
- W2899041500 hasConceptScore W2899041500C127413603 @default.
- W2899041500 hasConceptScore W2899041500C14036430 @default.
- W2899041500 hasConceptScore W2899041500C154945302 @default.
- W2899041500 hasConceptScore W2899041500C162324750 @default.
- W2899041500 hasConceptScore W2899041500C175444787 @default.
- W2899041500 hasConceptScore W2899041500C185592680 @default.
- W2899041500 hasConceptScore W2899041500C198531522 @default.
- W2899041500 hasConceptScore W2899041500C201995342 @default.
- W2899041500 hasConceptScore W2899041500C2778869765 @default.
- W2899041500 hasConceptScore W2899041500C2779803651 @default.
- W2899041500 hasConceptScore W2899041500C2780451532 @default.
- W2899041500 hasConceptScore W2899041500C37736160 @default.
- W2899041500 hasConceptScore W2899041500C41008148 @default.
- W2899041500 hasConceptScore W2899041500C43617362 @default.
- W2899041500 hasConceptScore W2899041500C76155785 @default.
- W2899041500 hasConceptScore W2899041500C78458016 @default.
- W2899041500 hasConceptScore W2899041500C86803240 @default.
- W2899041500 hasConceptScore W2899041500C94915269 @default.
- W2899041500 hasConceptScore W2899041500C97541855 @default.
- W2899041500 hasLocation W28990415001 @default.
- W2899041500 hasOpenAccess W2899041500 @default.
- W2899041500 hasPrimaryLocation W28990415001 @default.
- W2899041500 hasRelatedWork W1555922520 @default.
- W2899041500 hasRelatedWork W1999874108 @default.
- W2899041500 hasRelatedWork W2128786740 @default.
- W2899041500 hasRelatedWork W2183087363 @default.
- W2899041500 hasRelatedWork W2806029332 @default.
- W2899041500 hasRelatedWork W2902567911 @default.
- W2899041500 hasRelatedWork W2909588145 @default.
- W2899041500 hasRelatedWork W2952854274 @default.
- W2899041500 hasRelatedWork W2963516265 @default.
- W2899041500 hasRelatedWork W2999490157 @default.
- W2899041500 hasRelatedWork W3005607450 @default.
- W2899041500 hasRelatedWork W3007369745 @default.
- W2899041500 hasRelatedWork W3022169065 @default.
- W2899041500 hasRelatedWork W3048367007 @default.
- W2899041500 hasRelatedWork W3084024636 @default.
- W2899041500 hasRelatedWork W3125547903 @default.
- W2899041500 hasRelatedWork W3131310681 @default.
- W2899041500 hasRelatedWork W3159757479 @default.
- W2899041500 hasRelatedWork W3200996868 @default.
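
Each row above follows the shape `- subject predicate object @graph.`, matching the `?p ?o ?g` variables in the pattern at the top of the listing. As a minimal sketch, assuming this plain-text layout (not any official SemOpenAlex export format), the rows could be split into tuples as shown below. The names `Quad`, `parse_line`, and the regular expression are hypothetical helpers introduced for illustration only.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One row of the listing: subject, predicate, object, graph (hypothetical type)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed row layout: "- <subject> <predicate> <object> @<graph>."
# The object may be a quoted literal containing spaces, so it is matched greedily
# and the trailing " @graph." is taken from the end of the line.
_ROW = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.+)\s+@(\S+?)\.$')


def parse_line(line: str) -> Optional[Quad]:
    """Return a Quad for a listing row, or None for lines that are not rows."""
    match = _ROW.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Strip the surrounding double quotes from literal values such as "2018".
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return Quad(subject, predicate, obj, graph)


sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W2899041500> ?p ?o ?g. }',
    '- W2899041500 created "2018-11-09" @default.',
    '- W2899041500 cites W1771410628 @default.',
    '- W2899041500 cites W1777239053 @default.',
    '- W2899041500 citedByCount "6" @default.',
]

quads = [q for q in map(parse_line, sample) if q is not None]
cited = [q.obj for q in quads if q.predicate == "cites"]
```

With rows in this form, grouping by `predicate` gives quick counts such as the number of `cites` or `hasRelatedWork` entries for the work.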