Matches in SemOpenAlex for { <https://semopenalex.org/work/W2803661326> ?p ?o ?g. }
- W2803661326 abstract "Inverse reinforcement learning (IRL) infers a reward function from demonstrations, allowing for policy improvement and generalization. However, despite much recent interest in IRL, little work has been done to understand the minimum set of demonstrations needed to teach a specific sequential decision-making task. We formalize the problem of finding maximally informative demonstrations for IRL as a machine teaching problem where the goal is to find the minimum number of demonstrations needed to specify the reward equivalence class of the demonstrator. We extend previous work on algorithmic teaching for sequential decision-making tasks by showing a reduction to the set cover problem which enables an efficient approximation algorithm for determining the set of maximally-informative demonstrations. We apply our proposed machine teaching algorithm to two novel applications: providing a lower bound on the number of queries needed to learn a policy using active IRL and developing a novel IRL algorithm that can learn more efficiently from informative demonstrations than a standard IRL approach." @default.
- W2803661326 created "2018-06-01" @default.
- W2803661326 creator A5043572737 @default.
- W2803661326 creator A5057137533 @default.
- W2803661326 date "2018-05-20" @default.
- W2803661326 modified "2023-09-27" @default.
- W2803661326 title "Machine Teaching for Inverse Reinforcement Learning: Algorithms and Applications" @default.
- W2803661326 cites W102873864 @default.
- W2803661326 cites W1520461958 @default.
- W2803661326 cites W1583136812 @default.
- W2803661326 cites W1591675293 @default.
- W2803661326 cites W1606056663 @default.
- W2803661326 cites W1633675443 @default.
- W2803661326 cites W1680189815 @default.
- W2803661326 cites W1986014385 @default.
- W2803661326 cites W1999874108 @default.
- W2803661326 cites W2006912660 @default.
- W2803661326 cites W2018623774 @default.
- W2803661326 cites W2020764470 @default.
- W2803661326 cites W2029828838 @default.
- W2803661326 cites W2056354534 @default.
- W2803661326 cites W2061562262 @default.
- W2803661326 cites W2062525454 @default.
- W2803661326 cites W2098774185 @default.
- W2803661326 cites W2103104707 @default.
- W2803661326 cites W2105156548 @default.
- W2803661326 cites W2116442740 @default.
- W2803661326 cites W2125299871 @default.
- W2803661326 cites W2146250995 @default.
- W2803661326 cites W2156163138 @default.
- W2803661326 cites W2163037893 @default.
- W2803661326 cites W2181849516 @default.
- W2803661326 cites W2182055801 @default.
- W2803661326 cites W2283935042 @default.
- W2803661326 cites W2293844262 @default.
- W2803661326 cites W2410842990 @default.
- W2803661326 cites W2465040775 @default.
- W2803661326 cites W2557026499 @default.
- W2803661326 cites W2562989799 @default.
- W2803661326 cites W2565370028 @default.
- W2803661326 cites W2605076822 @default.
- W2803661326 cites W2735318784 @default.
- W2803661326 cites W2783793006 @default.
- W2803661326 cites W2798750840 @default.
- W2803661326 cites W2809134166 @default.
- W2803661326 cites W2809461852 @default.
- W2803661326 cites W2902909714 @default.
- W2803661326 cites W2914331073 @default.
- W2803661326 cites W2951122980 @default.
- W2803661326 cites W2962717849 @default.
- W2803661326 cites W2963099438 @default.
- W2803661326 cites W2963208223 @default.
- W2803661326 cites W2963289505 @default.
- W2803661326 cites W2963508354 @default.
- W2803661326 cites W2963670910 @default.
- W2803661326 cites W950880443 @default.
- W2803661326 hasPublicationYear "2018" @default.
- W2803661326 type Work @default.
- W2803661326 sameAs 2803661326 @default.
- W2803661326 citedByCount "6" @default.
- W2803661326 countsByYear W28036613262018 @default.
- W2803661326 countsByYear W28036613262019 @default.
- W2803661326 countsByYear W28036613262020 @default.
- W2803661326 crossrefType "posted-content" @default.
- W2803661326 hasAuthorship W2803661326A5043572737 @default.
- W2803661326 hasAuthorship W2803661326A5057137533 @default.
- W2803661326 hasConcept C11413529 @default.
- W2803661326 hasConcept C118615104 @default.
- W2803661326 hasConcept C119857082 @default.
- W2803661326 hasConcept C134306372 @default.
- W2803661326 hasConcept C154945302 @default.
- W2803661326 hasConcept C162324750 @default.
- W2803661326 hasConcept C177148314 @default.
- W2803661326 hasConcept C177264268 @default.
- W2803661326 hasConcept C187736073 @default.
- W2803661326 hasConcept C199360897 @default.
- W2803661326 hasConcept C207467116 @default.
- W2803661326 hasConcept C2524010 @default.
- W2803661326 hasConcept C2777212361 @default.
- W2803661326 hasConcept C2780069185 @default.
- W2803661326 hasConcept C2780451532 @default.
- W2803661326 hasConcept C33923547 @default.
- W2803661326 hasConcept C41008148 @default.
- W2803661326 hasConcept C97541855 @default.
- W2803661326 hasConceptScore W2803661326C11413529 @default.
- W2803661326 hasConceptScore W2803661326C118615104 @default.
- W2803661326 hasConceptScore W2803661326C119857082 @default.
- W2803661326 hasConceptScore W2803661326C134306372 @default.
- W2803661326 hasConceptScore W2803661326C154945302 @default.
- W2803661326 hasConceptScore W2803661326C162324750 @default.
- W2803661326 hasConceptScore W2803661326C177148314 @default.
- W2803661326 hasConceptScore W2803661326C177264268 @default.
- W2803661326 hasConceptScore W2803661326C187736073 @default.
- W2803661326 hasConceptScore W2803661326C199360897 @default.
- W2803661326 hasConceptScore W2803661326C207467116 @default.
- W2803661326 hasConceptScore W2803661326C2524010 @default.
- W2803661326 hasConceptScore W2803661326C2777212361 @default.
- W2803661326 hasConceptScore W2803661326C2780069185 @default.
- W2803661326 hasConceptScore W2803661326C2780451532 @default.
- W2803661326 hasConceptScore W2803661326C33923547 @default.