Matches in SemOpenAlex for { <https://semopenalex.org/work/W107583932> ?p ?o ?g. }
- W107583932 abstract "This thesis is a detailed investigation into the following question: how much data must an agent collect in order to perform successfully? This question is analogous to the classical issue of the sample complexity in supervised learning, but is harder because of the increased realism of the reinforcement learning setting. This thesis summarizes recent sample complexity results in the reinforcement learning literature and builds on these results to provide novel algorithms with strong performance guarantees. We focus on a variety of reasonable performance criteria and sampling models by which agents may access the environment. For instance, in a policy search setting, we consider the problem of how much simulated experience is required to reliably choose a good policy among a restricted class of policies Π (as in Kearns, Mansour, and Ng [2000]). In a more online setting, we consider the case in which an agent is placed in an environment and must follow one unbroken chain of experience with no access to offline simulation (as in Kearns and Singh [1998]). We build on the sample-based algorithms suggested by Kearns, Mansour, and Ng [2000]. Their sample complexity bounds have no dependence on the size of the state space, an exponential dependence on the planning horizon time, and linear dependence on the complexity of Π. We suggest novel algorithms with more restricted guarantees whose sample complexities are again independent of the size of the state space and depend linearly on the complexity of the policy class Π, but have only a polynomial dependence on the horizon time. We pay particular attention to the tradeoffs made by such algorithms." @default.
- W107583932 created "2016-06-24" @default.
- W107583932 creator A5018792915 @default.
- W107583932 date "2003-01-01" @default.
- W107583932 modified "2023-10-02" @default.
- W107583932 title "On the Sample Complexity of Reinforcement Learning" @default.
- W107583932 cites W119412464 @default.
- W107583932 cites W1485905671 @default.
- W107583932 cites W1504212531 @default.
- W107583932 cites W1506832649 @default.
- W107583932 cites W1515891729 @default.
- W107583932 cites W1520252399 @default.
- W107583932 cites W1530699444 @default.
- W107583932 cites W1541084404 @default.
- W107583932 cites W1541730457 @default.
- W107583932 cites W1542886316 @default.
- W107583932 cites W1545148916 @default.
- W107583932 cites W1547105496 @default.
- W107583932 cites W1554015367 @default.
- W107583932 cites W1575592356 @default.
- W107583932 cites W1576452626 @default.
- W107583932 cites W1585546214 @default.
- W107583932 cites W1596364083 @default.
- W107583932 cites W1600046456 @default.
- W107583932 cites W1601974704 @default.
- W107583932 cites W1606011487 @default.
- W107583932 cites W1646707810 @default.
- W107583932 cites W1657542410 @default.
- W107583932 cites W1745373831 @default.
- W107583932 cites W195848389 @default.
- W107583932 cites W1970602736 @default.
- W107583932 cites W1970789124 @default.
- W107583932 cites W1982381767 @default.
- W107583932 cites W2019363670 @default.
- W107583932 cites W2032100464 @default.
- W107583932 cites W2039439610 @default.
- W107583932 cites W2041367235 @default.
- W107583932 cites W2083143894 @default.
- W107583932 cites W2084544490 @default.
- W107583932 cites W2091565802 @default.
- W107583932 cites W2099001564 @default.
- W107583932 cites W2102406125 @default.
- W107583932 cites W2119567691 @default.
- W107583932 cites W2119717200 @default.
- W107583932 cites W2120465407 @default.
- W107583932 cites W2121863487 @default.
- W107583932 cites W2122701159 @default.
- W107583932 cites W2125074935 @default.
- W107583932 cites W2130105540 @default.
- W107583932 cites W2130801532 @default.
- W107583932 cites W2137466452 @default.
- W107583932 cites W2139418546 @default.
- W107583932 cites W2139709458 @default.
- W107583932 cites W2155027007 @default.
- W107583932 cites W2156737235 @default.
- W107583932 cites W2158479468 @default.
- W107583932 cites W2161521419 @default.
- W107583932 cites W2161966552 @default.
- W107583932 cites W2164056559 @default.
- W107583932 cites W2168024904 @default.
- W107583932 cites W2169022337 @default.
- W107583932 cites W2317700292 @default.
- W107583932 cites W2341171179 @default.
- W107583932 cites W2489939061 @default.
- W107583932 cites W3011120880 @default.
- W107583932 cites W3023151133 @default.
- W107583932 cites W3023407077 @default.
- W107583932 cites W32198084 @default.
- W107583932 hasPublicationYear "2003" @default.
- W107583932 type Work @default.
- W107583932 sameAs 107583932 @default.
- W107583932 citedByCount "338" @default.
- W107583932 countsByYear W1075839322012 @default.
- W107583932 countsByYear W1075839322013 @default.
- W107583932 countsByYear W1075839322014 @default.
- W107583932 countsByYear W1075839322015 @default.
- W107583932 countsByYear W1075839322016 @default.
- W107583932 countsByYear W1075839322017 @default.
- W107583932 countsByYear W1075839322018 @default.
- W107583932 countsByYear W1075839322019 @default.
- W107583932 countsByYear W1075839322020 @default.
- W107583932 countsByYear W1075839322021 @default.
- W107583932 countsByYear W1075839322022 @default.
- W107583932 countsByYear W1075839322023 @default.
- W107583932 crossrefType "dissertation" @default.
- W107583932 hasAuthorship W107583932A5018792915 @default.
- W107583932 hasConcept C105795698 @default.
- W107583932 hasConcept C111919701 @default.
- W107583932 hasConcept C11413529 @default.
- W107583932 hasConcept C119857082 @default.
- W107583932 hasConcept C126255220 @default.
- W107583932 hasConcept C129848803 @default.
- W107583932 hasConcept C136197465 @default.
- W107583932 hasConcept C154945302 @default.
- W107583932 hasConcept C179799912 @default.
- W107583932 hasConcept C185592680 @default.
- W107583932 hasConcept C198531522 @default.
- W107583932 hasConcept C2777212361 @default.
- W107583932 hasConcept C2778445095 @default.
- W107583932 hasConcept C2778572836 @default.