Matches in SemOpenAlex for { <https://semopenalex.org/work/W4290802777> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4290802777 abstract "This paper addresses the problem of inverse reinforcement learning (IRL) -- inferring the reward function of an agent from observing its behavior. IRL can provide a generalizable and compact representation for apprenticeship learning, and enable accurately inferring the preferences of a human in order to assist them. However, effective IRL is challenging, because many reward functions can be compatible with an observed behavior. We focus on how prior reinforcement learning (RL) experience can be leveraged to make learning these preferences faster and more efficient. We propose the IRL algorithm BASIS (Behavior Acquisition through Successor-feature Intention inference from Samples), which leverages multi-task RL pre-training and successor features to allow an agent to build a strong basis for intentions that spans the space of possible goals in a given domain. When exposed to just a few expert demonstrations optimizing a novel goal, the agent uses its basis to quickly and effectively infer the reward function. Our experiments reveal that our method is highly effective at inferring and optimizing demonstrated reward functions, accurately inferring reward functions from fewer than 100 trajectories." @default.
- W4290802777 created "2022-08-12" @default.
- W4290802777 creator A5026322200 @default.
- W4290802777 creator A5046953322 @default.
- W4290802777 creator A5084881611 @default.
- W4290802777 date "2022-08-09" @default.
- W4290802777 modified "2023-09-24" @default.
- W4290802777 title "Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience" @default.
- W4290802777 doi "https://doi.org/10.48550/arxiv.2208.04919" @default.
- W4290802777 hasPublicationYear "2022" @default.
- W4290802777 type Work @default.
- W4290802777 citedByCount "0" @default.
- W4290802777 crossrefType "posted-content" @default.
- W4290802777 hasAuthorship W4290802777A5026322200 @default.
- W4290802777 hasAuthorship W4290802777A5046953322 @default.
- W4290802777 hasAuthorship W4290802777A5084881611 @default.
- W4290802777 hasBestOaLocation W42908027771 @default.
- W4290802777 hasConcept C119857082 @default.
- W4290802777 hasConcept C134306372 @default.
- W4290802777 hasConcept C14036430 @default.
- W4290802777 hasConcept C154945302 @default.
- W4290802777 hasConcept C15744967 @default.
- W4290802777 hasConcept C162324750 @default.
- W4290802777 hasConcept C187736073 @default.
- W4290802777 hasConcept C2776214188 @default.
- W4290802777 hasConcept C2780451532 @default.
- W4290802777 hasConcept C33923547 @default.
- W4290802777 hasConcept C41008148 @default.
- W4290802777 hasConcept C67203356 @default.
- W4290802777 hasConcept C75306776 @default.
- W4290802777 hasConcept C77805123 @default.
- W4290802777 hasConcept C78458016 @default.
- W4290802777 hasConcept C86803240 @default.
- W4290802777 hasConcept C97541855 @default.
- W4290802777 hasConceptScore W4290802777C119857082 @default.
- W4290802777 hasConceptScore W4290802777C134306372 @default.
- W4290802777 hasConceptScore W4290802777C14036430 @default.
- W4290802777 hasConceptScore W4290802777C154945302 @default.
- W4290802777 hasConceptScore W4290802777C15744967 @default.
- W4290802777 hasConceptScore W4290802777C162324750 @default.
- W4290802777 hasConceptScore W4290802777C187736073 @default.
- W4290802777 hasConceptScore W4290802777C2776214188 @default.
- W4290802777 hasConceptScore W4290802777C2780451532 @default.
- W4290802777 hasConceptScore W4290802777C33923547 @default.
- W4290802777 hasConceptScore W4290802777C41008148 @default.
- W4290802777 hasConceptScore W4290802777C67203356 @default.
- W4290802777 hasConceptScore W4290802777C75306776 @default.
- W4290802777 hasConceptScore W4290802777C77805123 @default.
- W4290802777 hasConceptScore W4290802777C78458016 @default.
- W4290802777 hasConceptScore W4290802777C86803240 @default.
- W4290802777 hasConceptScore W4290802777C97541855 @default.
- W4290802777 hasLocation W42908027771 @default.
- W4290802777 hasOpenAccess W4290802777 @default.
- W4290802777 hasPrimaryLocation W42908027771 @default.
- W4290802777 hasRelatedWork W2149123936 @default.
- W4290802777 hasRelatedWork W2734912394 @default.
- W4290802777 hasRelatedWork W2907922678 @default.
- W4290802777 hasRelatedWork W2918392679 @default.
- W4290802777 hasRelatedWork W3011591403 @default.
- W4290802777 hasRelatedWork W3022038857 @default.
- W4290802777 hasRelatedWork W3132645524 @default.
- W4290802777 hasRelatedWork W3183420623 @default.
- W4290802777 hasRelatedWork W4225554067 @default.
- W4290802777 hasRelatedWork W4281400954 @default.
- W4290802777 isParatext "false" @default.
- W4290802777 isRetracted "false" @default.
- W4290802777 workType "article" @default.