Matches in SemOpenAlex for { <https://semopenalex.org/work/W2964086301> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W2964086301 endingPage "3402" @default.
- W2964086301 startingPage "3394" @default.
- W2964086301 abstract "Our goal is for AI systems to correctly identify and act according to their human user's objectives. Cooperative Inverse Reinforcement Learning (CIRL) formalizes this value alignment problem as a two-player game between a human and robot, in which only the human knows the parameters of the reward function: the robot needs to learn them as the interaction unfolds. Previous work showed that CIRL can be solved as a POMDP, but with an action space size exponential in the size of the reward parameter space. In this work, we exploit a specific property of CIRL---the human is a full information agent---to derive an optimality-preserving modification to the standard Bellman update; this reduces the complexity of the problem by an exponential factor and allows us to relax CIRL's assumption of human rationality. We apply this update to a variety of POMDP solvers and find that it enables us to scale CIRL to non-trivial problems, with larger reward parameter spaces, and larger action spaces for both robot and human. In solutions to these larger problems, the human exhibits pedagogic (teaching) behavior, while the robot interprets it as such and attains higher value for the human." @default.
- W2964086301 created "2019-07-30" @default.
- W2964086301 creator A5003355483 @default.
- W2964086301 creator A5005997281 @default.
- W2964086301 creator A5007305440 @default.
- W2964086301 creator A5050710435 @default.
- W2964086301 creator A5064033326 @default.
- W2964086301 creator A5076757561 @default.
- W2964086301 date "2018-07-03" @default.
- W2964086301 modified "2023-09-25" @default.
- W2964086301 title "An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning" @default.
- W2964086301 hasPublicationYear "2018" @default.
- W2964086301 type Work @default.
- W2964086301 sameAs 2964086301 @default.
- W2964086301 citedByCount "7" @default.
- W2964086301 countsByYear W29640863012019 @default.
- W2964086301 countsByYear W29640863012020 @default.
- W2964086301 countsByYear W29640863012021 @default.
- W2964086301 crossrefType "proceedings-article" @default.
- W2964086301 hasAuthorship W2964086301A5003355483 @default.
- W2964086301 hasAuthorship W2964086301A5005997281 @default.
- W2964086301 hasAuthorship W2964086301A5007305440 @default.
- W2964086301 hasAuthorship W2964086301A5050710435 @default.
- W2964086301 hasAuthorship W2964086301A5064033326 @default.
- W2964086301 hasAuthorship W2964086301A5076757561 @default.
- W2964086301 hasConcept C111919701 @default.
- W2964086301 hasConcept C119857082 @default.
- W2964086301 hasConcept C121332964 @default.
- W2964086301 hasConcept C126255220 @default.
- W2964086301 hasConcept C134306372 @default.
- W2964086301 hasConcept C136197465 @default.
- W2964086301 hasConcept C14036430 @default.
- W2964086301 hasConcept C14646407 @default.
- W2964086301 hasConcept C151376022 @default.
- W2964086301 hasConcept C154945302 @default.
- W2964086301 hasConcept C163836022 @default.
- W2964086301 hasConcept C17098449 @default.
- W2964086301 hasConcept C2778572836 @default.
- W2964086301 hasConcept C2780791683 @default.
- W2964086301 hasConcept C33923547 @default.
- W2964086301 hasConcept C41008148 @default.
- W2964086301 hasConcept C62520636 @default.
- W2964086301 hasConcept C78458016 @default.
- W2964086301 hasConcept C86803240 @default.
- W2964086301 hasConcept C90509273 @default.
- W2964086301 hasConcept C97541855 @default.
- W2964086301 hasConcept C98763669 @default.
- W2964086301 hasConceptScore W2964086301C111919701 @default.
- W2964086301 hasConceptScore W2964086301C119857082 @default.
- W2964086301 hasConceptScore W2964086301C121332964 @default.
- W2964086301 hasConceptScore W2964086301C126255220 @default.
- W2964086301 hasConceptScore W2964086301C134306372 @default.
- W2964086301 hasConceptScore W2964086301C136197465 @default.
- W2964086301 hasConceptScore W2964086301C14036430 @default.
- W2964086301 hasConceptScore W2964086301C14646407 @default.
- W2964086301 hasConceptScore W2964086301C151376022 @default.
- W2964086301 hasConceptScore W2964086301C154945302 @default.
- W2964086301 hasConceptScore W2964086301C163836022 @default.
- W2964086301 hasConceptScore W2964086301C17098449 @default.
- W2964086301 hasConceptScore W2964086301C2778572836 @default.
- W2964086301 hasConceptScore W2964086301C2780791683 @default.
- W2964086301 hasConceptScore W2964086301C33923547 @default.
- W2964086301 hasConceptScore W2964086301C41008148 @default.
- W2964086301 hasConceptScore W2964086301C62520636 @default.
- W2964086301 hasConceptScore W2964086301C78458016 @default.
- W2964086301 hasConceptScore W2964086301C86803240 @default.
- W2964086301 hasConceptScore W2964086301C90509273 @default.
- W2964086301 hasConceptScore W2964086301C97541855 @default.
- W2964086301 hasConceptScore W2964086301C98763669 @default.
- W2964086301 hasLocation W29640863011 @default.
- W2964086301 hasOpenAccess W2964086301 @default.
- W2964086301 hasPrimaryLocation W29640863011 @default.
- W2964086301 hasRelatedWork W1982948368 @default.
- W2964086301 hasRelatedWork W1999874108 @default.
- W2964086301 hasRelatedWork W2011231614 @default.
- W2964086301 hasRelatedWork W2061562262 @default.
- W2964086301 hasRelatedWork W2097031964 @default.
- W2964086301 hasRelatedWork W2117626647 @default.
- W2964086301 hasRelatedWork W2753088790 @default.
- W2964086301 hasRelatedWork W2804542227 @default.
- W2964086301 hasRelatedWork W2892172348 @default.
- W2964086301 hasRelatedWork W2915060045 @default.
- W2964086301 hasRelatedWork W2963289505 @default.
- W2964086301 hasRelatedWork W2970517708 @default.
- W2964086301 hasRelatedWork W2984409990 @default.
- W2964086301 hasRelatedWork W3035599863 @default.
- W2964086301 hasRelatedWork W3131283938 @default.
- W2964086301 hasRelatedWork W3152815381 @default.
- W2964086301 hasRelatedWork W3197209873 @default.
- W2964086301 hasRelatedWork W3202097587 @default.
- W2964086301 hasRelatedWork W3202423398 @default.
- W2964086301 hasRelatedWork W3211915286 @default.
- W2964086301 isParatext "false" @default.
- W2964086301 isRetracted "false" @default.
- W2964086301 magId "2964086301" @default.
- W2964086301 workType "article" @default.