Matches in SemOpenAlex for { <https://semopenalex.org/work/W4384111935> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W4384111935 abstract "We consider the problem of Imitation Learning (IL) by actively querying a noisy expert for feedback. While imitation learning has been empirically successful, much of prior work assumes access to noiseless expert feedback which is not practical in many applications. In fact, when one only has access to noisy expert feedback, algorithms that rely on purely offline data (non-interactive IL) can be shown to need a prohibitively large number of samples to be successful. In contrast, in this work, we provide an interactive algorithm for IL that uses selective sampling to actively query the noisy expert for feedback. Our contributions are twofold: First, we provide a new selective sampling algorithm that works with general function classes and multiple actions, and obtains the best-known bounds for the regret and the number of queries. Next, we extend this analysis to the problem of IL with noisy expert feedback and provide a new IL algorithm that makes limited queries. Our algorithm for selective sampling leverages function approximation, and relies on an online regression oracle w.r.t. the given model class to predict actions, and to decide whether to query the expert for its label. On the theoretical side, the regret bound of our algorithm is upper bounded by the regret of the online regression oracle, while the query complexity additionally depends on the eluder dimension of the model class. We complement this with a lower bound that demonstrates that our results are tight. We extend our selective sampling algorithm for IL with general function approximation and provide bounds on both the regret and the number of queries made to the noisy expert. A key novelty here is that our regret and query complexity bounds only depend on the number of times the optimal policy (and not the noisy expert, or the learner) goes to states that have a small margin." @default.
- W4384111935 created "2023-07-13" @default.
- W4384111935 creator A5051488979 @default.
- W4384111935 creator A5062806503 @default.
- W4384111935 creator A5084265089 @default.
- W4384111935 creator A5084629039 @default.
- W4384111935 date "2023-07-10" @default.
- W4384111935 modified "2023-09-26" @default.
- W4384111935 title "Selective Sampling and Imitation Learning via Online Regression" @default.
- W4384111935 doi "https://doi.org/10.48550/arxiv.2307.04998" @default.
- W4384111935 hasPublicationYear "2023" @default.
- W4384111935 type Work @default.
- W4384111935 citedByCount "0" @default.
- W4384111935 crossrefType "posted-content" @default.
- W4384111935 hasAuthorship W4384111935A5051488979 @default.
- W4384111935 hasAuthorship W4384111935A5062806503 @default.
- W4384111935 hasAuthorship W4384111935A5084265089 @default.
- W4384111935 hasAuthorship W4384111935A5084629039 @default.
- W4384111935 hasBestOaLocation W43841119351 @default.
- W4384111935 hasConcept C104317684 @default.
- W4384111935 hasConcept C106131492 @default.
- W4384111935 hasConcept C112313634 @default.
- W4384111935 hasConcept C115903868 @default.
- W4384111935 hasConcept C119857082 @default.
- W4384111935 hasConcept C124101348 @default.
- W4384111935 hasConcept C127716648 @default.
- W4384111935 hasConcept C134306372 @default.
- W4384111935 hasConcept C14036430 @default.
- W4384111935 hasConcept C140779682 @default.
- W4384111935 hasConcept C154945302 @default.
- W4384111935 hasConcept C185592680 @default.
- W4384111935 hasConcept C188082640 @default.
- W4384111935 hasConcept C31972630 @default.
- W4384111935 hasConcept C33923547 @default.
- W4384111935 hasConcept C34388435 @default.
- W4384111935 hasConcept C41008148 @default.
- W4384111935 hasConcept C50817715 @default.
- W4384111935 hasConcept C55166926 @default.
- W4384111935 hasConcept C55493867 @default.
- W4384111935 hasConcept C73602740 @default.
- W4384111935 hasConcept C78458016 @default.
- W4384111935 hasConcept C80444323 @default.
- W4384111935 hasConcept C86803240 @default.
- W4384111935 hasConceptScore W4384111935C104317684 @default.
- W4384111935 hasConceptScore W4384111935C106131492 @default.
- W4384111935 hasConceptScore W4384111935C112313634 @default.
- W4384111935 hasConceptScore W4384111935C115903868 @default.
- W4384111935 hasConceptScore W4384111935C119857082 @default.
- W4384111935 hasConceptScore W4384111935C124101348 @default.
- W4384111935 hasConceptScore W4384111935C127716648 @default.
- W4384111935 hasConceptScore W4384111935C134306372 @default.
- W4384111935 hasConceptScore W4384111935C14036430 @default.
- W4384111935 hasConceptScore W4384111935C140779682 @default.
- W4384111935 hasConceptScore W4384111935C154945302 @default.
- W4384111935 hasConceptScore W4384111935C185592680 @default.
- W4384111935 hasConceptScore W4384111935C188082640 @default.
- W4384111935 hasConceptScore W4384111935C31972630 @default.
- W4384111935 hasConceptScore W4384111935C33923547 @default.
- W4384111935 hasConceptScore W4384111935C34388435 @default.
- W4384111935 hasConceptScore W4384111935C41008148 @default.
- W4384111935 hasConceptScore W4384111935C50817715 @default.
- W4384111935 hasConceptScore W4384111935C55166926 @default.
- W4384111935 hasConceptScore W4384111935C55493867 @default.
- W4384111935 hasConceptScore W4384111935C73602740 @default.
- W4384111935 hasConceptScore W4384111935C78458016 @default.
- W4384111935 hasConceptScore W4384111935C80444323 @default.
- W4384111935 hasConceptScore W4384111935C86803240 @default.
- W4384111935 hasLocation W43841119351 @default.
- W4384111935 hasOpenAccess W4384111935 @default.
- W4384111935 hasPrimaryLocation W43841119351 @default.
- W4384111935 hasRelatedWork W1600255059 @default.
- W4384111935 hasRelatedWork W2738218455 @default.
- W4384111935 hasRelatedWork W2951802169 @default.
- W4384111935 hasRelatedWork W2964278219 @default.
- W4384111935 hasRelatedWork W2988965244 @default.
- W4384111935 hasRelatedWork W3002095816 @default.
- W4384111935 hasRelatedWork W3108185025 @default.
- W4384111935 hasRelatedWork W3196245787 @default.
- W4384111935 hasRelatedWork W4318620749 @default.
- W4384111935 hasRelatedWork W4367190832 @default.
- W4384111935 isParatext "false" @default.
- W4384111935 isRetracted "false" @default.
- W4384111935 workType "article" @default.
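
The abstract above describes selective sampling driven by an online regression oracle: the learner predicts an action from the oracle's scores and queries the noisy expert only when it is unsure. The following is a minimal sketch of that general idea, not the paper's algorithm. It assumes a linear model class trained by stochastic gradient descent and a fixed margin threshold as the query rule, whereas the paper's rule depends on the model class (via the eluder dimension). All names here (`OnlineLinearRegressor`, `selective_sampling`, `demo`, `margin_threshold`) are hypothetical.

```python
import random


class OnlineLinearRegressor:
    """Hypothetical online regression oracle: one linear scorer per action,
    updated by a gradient step on squared loss (an assumption, not the paper's oracle)."""

    def __init__(self, n_actions, dim, lr=0.1):
        self.w = [[0.0] * dim for _ in range(n_actions)]
        self.lr = lr

    def predict(self, x):
        # One score per action: inner product of that action's weights with x.
        return [sum(wi * xi for wi, xi in zip(w, x)) for w in self.w]

    def update(self, x, label):
        # Regress each action's score toward the one-hot encoding of the expert label.
        scores = self.predict(x)
        for a, w in enumerate(self.w):
            err = scores[a] - (1.0 if a == label else 0.0)
            for i, xi in enumerate(x):
                w[i] -= self.lr * err * xi


def selective_sampling(stream, expert, oracle, margin_threshold):
    """Play the top-scoring action each round; query the expert only when the
    gap between the two highest scores is below margin_threshold (simplified rule)."""
    actions, queries = [], 0
    for x in stream:
        scores = oracle.predict(x)
        ranked = sorted(range(len(scores)), key=lambda a: scores[a], reverse=True)
        best, runner_up = ranked[0], ranked[1]
        actions.append(best)
        if scores[best] - scores[runner_up] < margin_threshold:
            oracle.update(x, expert(x))  # noisy label, used only on queried rounds
            queries += 1
    return actions, queries


def demo(seed=0, rounds=200):
    rng = random.Random(seed)
    # Contexts with a constant bias feature appended.
    stream = [[rng.uniform(-1, 1), rng.uniform(-1, 1), 1.0] for _ in range(rounds)]

    def noisy_expert(x):
        # Synthetic expert: correct action is 0 when x[0] > x[1]; flipped 10% of the time.
        true_action = 0 if x[0] > x[1] else 1
        return true_action if rng.random() > 0.1 else 1 - true_action

    oracle = OnlineLinearRegressor(n_actions=2, dim=3)
    return selective_sampling(stream, noisy_expert, oracle, margin_threshold=0.2)
```

Calling `demo()` returns the list of actions played and the number of expert queries made. In this sketch the learner always queries on the first round, because all scores start at zero and the margin is therefore below the threshold.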