Matches in SemOpenAlex for { <https://semopenalex.org/work/W2948807003> ?p ?o ?g. }
- W2948807003 abstract "In artificial intelligence, we often specify tasks through a reward function. While this works well in some settings, many tasks are hard to specify this way. In deep reinforcement learning, for example, directly specifying a reward as a function of a high-dimensional observation is challenging. Instead, we present an interface for specifying tasks interactively using demonstrations. Our approach defines a set of increasingly complex policies. The interface allows the user to switch between these policies at fixed intervals to generate demonstrations of novel, more complex, tasks. We train new policies based on these demonstrations and repeat the process. We present a case study of our approach in the Lunar Lander domain, and show that this simple approach can quickly learn a successful landing policy and outperforms an existing comparison-based deep RL method." @default.
- W2948807003 created "2019-06-14" @default.
- W2948807003 creator A5005997281 @default.
- W2948807003 creator A5026740203 @default.
- W2948807003 creator A5045163381 @default.
- W2948807003 creator A5076757561 @default.
- W2948807003 date "2019-06-06" @default.
- W2948807003 modified "2023-09-23" @default.
- W2948807003 title "An Extensible Interactive Interface for Agent Design." @default.
- W2948807003 cites W2051228319 @default.
- W2948807003 cites W2109910161 @default.
- W2948807003 cites W2627585944 @default.
- W2948807003 cites W2735318784 @default.
- W2948807003 cites W2736601468 @default.
- W2948807003 cites W2786110872 @default.
- W2948807003 cites W2805465728 @default.
- W2948807003 cites W2895560838 @default.
- W2948807003 cites W2901509721 @default.
- W2948807003 cites W2938421504 @default.
- W2948807003 cites W2962943921 @default.
- W2948807003 cites W2962957031 @default.
- W2948807003 cites W2963516265 @default.
- W2948807003 cites W2964001908 @default.
- W2948807003 cites W2964263543 @default.
- W2948807003 cites W2964342357 @default.
- W2948807003 cites W567721252 @default.
- W2948807003 hasPublicationYear "2019" @default.
- W2948807003 type Work @default.
- W2948807003 sameAs 2948807003 @default.
- W2948807003 citedByCount "0" @default.
- W2948807003 crossrefType "posted-content" @default.
- W2948807003 hasAuthorship W2948807003A5005997281 @default.
- W2948807003 hasAuthorship W2948807003A5026740203 @default.
- W2948807003 hasAuthorship W2948807003A5045163381 @default.
- W2948807003 hasAuthorship W2948807003A5076757561 @default.
- W2948807003 hasConcept C107457646 @default.
- W2948807003 hasConcept C111472728 @default.
- W2948807003 hasConcept C111919701 @default.
- W2948807003 hasConcept C113843644 @default.
- W2948807003 hasConcept C120314980 @default.
- W2948807003 hasConcept C129307140 @default.
- W2948807003 hasConcept C134306372 @default.
- W2948807003 hasConcept C138885662 @default.
- W2948807003 hasConcept C14036430 @default.
- W2948807003 hasConcept C154945302 @default.
- W2948807003 hasConcept C157915830 @default.
- W2948807003 hasConcept C177264268 @default.
- W2948807003 hasConcept C199360897 @default.
- W2948807003 hasConcept C2780586882 @default.
- W2948807003 hasConcept C32833848 @default.
- W2948807003 hasConcept C33923547 @default.
- W2948807003 hasConcept C36503486 @default.
- W2948807003 hasConcept C41008148 @default.
- W2948807003 hasConcept C78458016 @default.
- W2948807003 hasConcept C86803240 @default.
- W2948807003 hasConcept C97541855 @default.
- W2948807003 hasConcept C98045186 @default.
- W2948807003 hasConceptScore W2948807003C107457646 @default.
- W2948807003 hasConceptScore W2948807003C111472728 @default.
- W2948807003 hasConceptScore W2948807003C111919701 @default.
- W2948807003 hasConceptScore W2948807003C113843644 @default.
- W2948807003 hasConceptScore W2948807003C120314980 @default.
- W2948807003 hasConceptScore W2948807003C129307140 @default.
- W2948807003 hasConceptScore W2948807003C134306372 @default.
- W2948807003 hasConceptScore W2948807003C138885662 @default.
- W2948807003 hasConceptScore W2948807003C14036430 @default.
- W2948807003 hasConceptScore W2948807003C154945302 @default.
- W2948807003 hasConceptScore W2948807003C157915830 @default.
- W2948807003 hasConceptScore W2948807003C177264268 @default.
- W2948807003 hasConceptScore W2948807003C199360897 @default.
- W2948807003 hasConceptScore W2948807003C2780586882 @default.
- W2948807003 hasConceptScore W2948807003C32833848 @default.
- W2948807003 hasConceptScore W2948807003C33923547 @default.
- W2948807003 hasConceptScore W2948807003C36503486 @default.
- W2948807003 hasConceptScore W2948807003C41008148 @default.
- W2948807003 hasConceptScore W2948807003C78458016 @default.
- W2948807003 hasConceptScore W2948807003C86803240 @default.
- W2948807003 hasConceptScore W2948807003C97541855 @default.
- W2948807003 hasConceptScore W2948807003C98045186 @default.
- W2948807003 hasOpenAccess W2948807003 @default.
- W2948807003 hasRelatedWork W1622012679 @default.
- W2948807003 hasRelatedWork W1995204256 @default.
- W2948807003 hasRelatedWork W2042587402 @default.
- W2948807003 hasRelatedWork W2066697502 @default.
- W2948807003 hasRelatedWork W2074733377 @default.
- W2948807003 hasRelatedWork W2141715368 @default.
- W2948807003 hasRelatedWork W2184752416 @default.
- W2948807003 hasRelatedWork W2467723036 @default.
- W2948807003 hasRelatedWork W2740086139 @default.
- W2948807003 hasRelatedWork W2772685522 @default.
- W2948807003 hasRelatedWork W2799139773 @default.
- W2948807003 hasRelatedWork W2801475186 @default.
- W2948807003 hasRelatedWork W2922989098 @default.
- W2948807003 hasRelatedWork W2932233781 @default.
- W2948807003 hasRelatedWork W2965332040 @default.
- W2948807003 hasRelatedWork W3038016830 @default.
- W2948807003 hasRelatedWork W3100269046 @default.
- W2948807003 hasRelatedWork W3175267858 @default.
- W2948807003 hasRelatedWork W3205571996 @default.
- W2948807003 hasRelatedWork W3207249034 @default.