Matches in SemOpenAlex for { <https://semopenalex.org/work/W2890752237> ?p ?o ?g. }
Showing items 1 to 76 of 76 with 100 items per page.
- W2890752237 endingPage "8473" @default.
- W2890752237 startingPage "8464" @default.
- W2890752237 abstract "Learning near-optimal behaviour from an expert's demonstrations typically relies on the assumption that the learner knows the features that the true reward function depends on. In this paper, we study the problem of learning from demonstrations in the setting where this is not the case, i.e., where there is a mismatch between the worldviews of the learner and the expert. We introduce a natural quantity, the teaching risk, which measures the potential suboptimality of policies that look optimal to the learner in this setting. We show that bounds on the teaching risk guarantee that the learner is able to find a near-optimal policy using standard algorithms based on inverse reinforcement learning. Based on these findings, we suggest a teaching scheme in which the expert can decrease the teaching risk by updating the learner's worldview, and thus ultimately enable her to find a near-optimal policy." @default.
- W2890752237 created "2018-09-27" @default.
- W2890752237 creator A5008828477 @default.
- W2890752237 creator A5027711113 @default.
- W2890752237 creator A5051794439 @default.
- W2890752237 date "2018-01-01" @default.
- W2890752237 modified "2023-09-26" @default.
- W2890752237 title "Teaching Inverse Reinforcement Learners via Features and Demonstrations" @default.
- W2890752237 hasPublicationYear "2018" @default.
- W2890752237 type Work @default.
- W2890752237 sameAs 2890752237 @default.
- W2890752237 citedByCount "29" @default.
- W2890752237 countsByYear W28907522372018 @default.
- W2890752237 countsByYear W28907522372019 @default.
- W2890752237 countsByYear W28907522372020 @default.
- W2890752237 countsByYear W28907522372021 @default.
- W2890752237 crossrefType "proceedings-article" @default.
- W2890752237 hasAuthorship W2890752237A5008828477 @default.
- W2890752237 hasAuthorship W2890752237A5027711113 @default.
- W2890752237 hasAuthorship W2890752237A5051794439 @default.
- W2890752237 hasConcept C119857082 @default.
- W2890752237 hasConcept C134306372 @default.
- W2890752237 hasConcept C14036430 @default.
- W2890752237 hasConcept C145420912 @default.
- W2890752237 hasConcept C154945302 @default.
- W2890752237 hasConcept C207467116 @default.
- W2890752237 hasConcept C2524010 @default.
- W2890752237 hasConcept C33923547 @default.
- W2890752237 hasConcept C41008148 @default.
- W2890752237 hasConcept C77618280 @default.
- W2890752237 hasConcept C78458016 @default.
- W2890752237 hasConcept C86803240 @default.
- W2890752237 hasConcept C97541855 @default.
- W2890752237 hasConceptScore W2890752237C119857082 @default.
- W2890752237 hasConceptScore W2890752237C134306372 @default.
- W2890752237 hasConceptScore W2890752237C14036430 @default.
- W2890752237 hasConceptScore W2890752237C145420912 @default.
- W2890752237 hasConceptScore W2890752237C154945302 @default.
- W2890752237 hasConceptScore W2890752237C207467116 @default.
- W2890752237 hasConceptScore W2890752237C2524010 @default.
- W2890752237 hasConceptScore W2890752237C33923547 @default.
- W2890752237 hasConceptScore W2890752237C41008148 @default.
- W2890752237 hasConceptScore W2890752237C77618280 @default.
- W2890752237 hasConceptScore W2890752237C78458016 @default.
- W2890752237 hasConceptScore W2890752237C86803240 @default.
- W2890752237 hasConceptScore W2890752237C97541855 @default.
- W2890752237 hasLocation W28907522371 @default.
- W2890752237 hasOpenAccess W2890752237 @default.
- W2890752237 hasPrimaryLocation W28907522371 @default.
- W2890752237 hasRelatedWork W1777239053 @default.
- W2890752237 hasRelatedWork W2020764470 @default.
- W2890752237 hasRelatedWork W2121863487 @default.
- W2890752237 hasRelatedWork W2145339207 @default.
- W2890752237 hasRelatedWork W2146250995 @default.
- W2890752237 hasRelatedWork W2182055801 @default.
- W2890752237 hasRelatedWork W2577466617 @default.
- W2890752237 hasRelatedWork W2783793006 @default.
- W2890752237 hasRelatedWork W2786676179 @default.
- W2890752237 hasRelatedWork W2951453233 @default.
- W2890752237 hasRelatedWork W2962930238 @default.
- W2890752237 hasRelatedWork W2963289505 @default.
- W2890752237 hasRelatedWork W2963308241 @default.
- W2890752237 hasRelatedWork W2964237080 @default.
- W2890752237 hasRelatedWork W2966120739 @default.
- W2890752237 hasRelatedWork W2970259165 @default.
- W2890752237 hasRelatedWork W2970469191 @default.
- W2890752237 hasRelatedWork W2970502043 @default.
- W2890752237 hasRelatedWork W2970734210 @default.
- W2890752237 hasRelatedWork W950880443 @default.
- W2890752237 hasVolume "31" @default.
- W2890752237 isParatext "false" @default.
- W2890752237 isRetracted "false" @default.
- W2890752237 magId "2890752237" @default.
- W2890752237 workType "article" @default.
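
The listing above is the result of matching the quad pattern `{ <https://semopenalex.org/work/W2890752237> ?p ?o ?g. }`. As a minimal sketch, the same pattern could be written as a standard SPARQL query string along the following lines. The helper name `build_quad_query` and the `limit` parameter are hypothetical and not part of SemOpenAlex. The sketch only builds the query text and does not contact any endpoint. Whether `GRAPH ?g` also returns statements held in the default graph (shown as `@default` above) depends on how the store is configured, so treat this as an assumption.

```python
def build_quad_query(work_iri: str, limit: int = 100) -> str:
    """Build a SPARQL SELECT query for every (predicate, object, graph)
    attached to one subject IRI.

    Hypothetical helper for illustration only; it returns the query text
    and performs no network access.
    """
    # Reject values that would break out of the <...> IRI delimiters.
    if not work_iri.startswith("https://") or any(c in work_iri for c in "<> \n"):
        raise ValueError(f"not a usable IRI: {work_iri!r}")
    if limit <= 0:
        raise ValueError("limit must be positive")

    return (
        "SELECT ?p ?o ?g WHERE {\n"
        # Doubled braces are literal braces inside an f-string.
        f"  GRAPH ?g {{ <{work_iri}> ?p ?o . }}\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_quad_query("https://semopenalex.org/work/W2890752237")
print(query)
```

The default `limit` of 100 mirrors the "100 items per page" shown in the listing header; the resulting string can be passed to any SPARQL client of your choice.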