Matches in SemOpenAlex for { <https://semopenalex.org/work/W2136723863> ?p ?o ?g. }
- W2136723863 endingPage "102" @default.
- W2136723863 startingPage "89" @default.
- W2136723863 abstract "Off-policy evaluation is the problem of evaluating a decision-making policy using data collected under a different behaviour policy. While several methods are available for addressing off-policy evaluation, little work has been done on identifying the best methods. In this paper, we conduct an in-depth comparative study of several off-policy evaluation methods in non-bandit, finite-horizon MDPs, using randomly generated MDPs, as well as a Mallard population dynamics model [Anderson, 1975]. We find that un-normalized importance sampling can exhibit prohibitively large variance in problems involving look-ahead longer than a few time steps, and that dynamic programming methods perform better than Monte-Carlo style methods." @default.
- W2136723863 created "2016-06-24" @default.
- W2136723863 creator A5056097740 @default.
- W2136723863 creator A5065836447 @default.
- W2136723863 creator A5080591144 @default.
- W2136723863 creator A5089433655 @default.
- W2136723863 date "2012-01-01" @default.
- W2136723863 modified "2023-09-23" @default.
- W2136723863 title "An Empirical Analysis of Off-policy Learning in Discrete MDPs" @default.
- W2136723863 cites W1514587017 @default.
- W2136723863 cites W1600046456 @default.
- W2136723863 cites W166862392 @default.
- W2136723863 cites W1809653203 @default.
- W2136723863 cites W1999824901 @default.
- W2136723863 cites W2009187570 @default.
- W2136723863 cites W2027795129 @default.
- W2136723863 cites W2075268401 @default.
- W2136723863 cites W2081596682 @default.
- W2136723863 cites W2094005143 @default.
- W2136723863 cites W2108692343 @default.
- W2136723863 cites W2111492941 @default.
- W2136723863 cites W2121863487 @default.
- W2136723863 cites W2125526403 @default.
- W2136723863 cites W2132351269 @default.
- W2136723863 cites W2147632348 @default.
- W2136723863 cites W2186159582 @default.
- W2136723863 cites W2791279190 @default.
- W2136723863 cites W2964297722 @default.
- W2136723863 cites W3020882730 @default.
- W2136723863 hasPublicationYear "2012" @default.
- W2136723863 type Work @default.
- W2136723863 sameAs 2136723863 @default.
- W2136723863 citedByCount "3" @default.
- W2136723863 countsByYear W21367238632017 @default.
- W2136723863 countsByYear W21367238632020 @default.
- W2136723863 crossrefType "proceedings-article" @default.
- W2136723863 hasAuthorship W2136723863A5056097740 @default.
- W2136723863 hasAuthorship W2136723863A5065836447 @default.
- W2136723863 hasAuthorship W2136723863A5080591144 @default.
- W2136723863 hasAuthorship W2136723863A5089433655 @default.
- W2136723863 hasConcept C105795698 @default.
- W2136723863 hasConcept C106131492 @default.
- W2136723863 hasConcept C11413529 @default.
- W2136723863 hasConcept C121955636 @default.
- W2136723863 hasConcept C123587114 @default.
- W2136723863 hasConcept C126255220 @default.
- W2136723863 hasConcept C140779682 @default.
- W2136723863 hasConcept C144024400 @default.
- W2136723863 hasConcept C149923435 @default.
- W2136723863 hasConcept C154945302 @default.
- W2136723863 hasConcept C162324750 @default.
- W2136723863 hasConcept C17744445 @default.
- W2136723863 hasConcept C19499675 @default.
- W2136723863 hasConcept C196083921 @default.
- W2136723863 hasConcept C199539241 @default.
- W2136723863 hasConcept C2908647359 @default.
- W2136723863 hasConcept C31972630 @default.
- W2136723863 hasConcept C33923547 @default.
- W2136723863 hasConcept C37404715 @default.
- W2136723863 hasConcept C41008148 @default.
- W2136723863 hasConcept C97541855 @default.
- W2136723863 hasConceptScore W2136723863C105795698 @default.
- W2136723863 hasConceptScore W2136723863C106131492 @default.
- W2136723863 hasConceptScore W2136723863C11413529 @default.
- W2136723863 hasConceptScore W2136723863C121955636 @default.
- W2136723863 hasConceptScore W2136723863C123587114 @default.
- W2136723863 hasConceptScore W2136723863C126255220 @default.
- W2136723863 hasConceptScore W2136723863C140779682 @default.
- W2136723863 hasConceptScore W2136723863C144024400 @default.
- W2136723863 hasConceptScore W2136723863C149923435 @default.
- W2136723863 hasConceptScore W2136723863C154945302 @default.
- W2136723863 hasConceptScore W2136723863C162324750 @default.
- W2136723863 hasConceptScore W2136723863C17744445 @default.
- W2136723863 hasConceptScore W2136723863C19499675 @default.
- W2136723863 hasConceptScore W2136723863C196083921 @default.
- W2136723863 hasConceptScore W2136723863C199539241 @default.
- W2136723863 hasConceptScore W2136723863C2908647359 @default.
- W2136723863 hasConceptScore W2136723863C31972630 @default.
- W2136723863 hasConceptScore W2136723863C33923547 @default.
- W2136723863 hasConceptScore W2136723863C37404715 @default.
- W2136723863 hasConceptScore W2136723863C41008148 @default.
- W2136723863 hasConceptScore W2136723863C97541855 @default.
- W2136723863 hasLocation W21367238631 @default.
- W2136723863 hasOpenAccess W2136723863 @default.
- W2136723863 hasPrimaryLocation W21367238631 @default.
- W2136723863 hasRelatedWork W1176136657 @default.
- W2136723863 hasRelatedWork W1593140824 @default.
- W2136723863 hasRelatedWork W1656041229 @default.
- W2136723863 hasRelatedWork W1849095486 @default.
- W2136723863 hasRelatedWork W2100857832 @default.
- W2136723863 hasRelatedWork W2782965505 @default.
- W2136723863 hasRelatedWork W2787471386 @default.
- W2136723863 hasRelatedWork W2912212713 @default.
- W2136723863 hasRelatedWork W2962688548 @default.
- W2136723863 hasRelatedWork W2963240753 @default.
- W2136723863 hasRelatedWork W2963882293 @default.
- W2136723863 hasRelatedWork W3004727279 @default.
- W2136723863 hasRelatedWork W3034261810 @default.