Matches in SemOpenAlex for { <https://semopenalex.org/work/W2951750340> ?p ?o ?g. }
- W2951750340 abstract "We consider a model-based approach to perform batch off-policy evaluation in reinforcement learning. Our method takes a mixture-of-experts approach to combine parametric and non-parametric models of the environment such that the final value estimate has the least expected error. We do so by first estimating the local accuracy of each model and then using a planner to select which model to use at every time step so as to minimize the return error estimate along entire trajectories. Across a variety of domains, our mixture-based approach outperforms the individual models alone as well as state-of-the-art importance sampling-based estimators." @default.
- W2951750340 created "2019-06-27" @default.
- W2951750340 creator A5016623854 @default.
- W2951750340 creator A5038771285 @default.
- W2951750340 creator A5045518550 @default.
- W2951750340 creator A5078769916 @default.
- W2951750340 creator A5084989076 @default.
- W2951750340 date "2019-05-14" @default.
- W2951750340 modified "2023-09-26" @default.
- W2951750340 title "Combining Parametric and Nonparametric Models for Off-Policy Evaluation" @default.
- W2951750340 cites W1514587017 @default.
- W2951750340 cites W1714211023 @default.
- W2951750340 cites W1936364850 @default.
- W2951750340 cites W2052725501 @default.
- W2951750340 cites W2121506959 @default.
- W2951750340 cites W2124175081 @default.
- W2951750340 cites W2126316555 @default.
- W2951750340 cites W2134689794 @default.
- W2951750340 cites W2135721773 @default.
- W2951750340 cites W2138434918 @default.
- W2951750340 cites W2168342951 @default.
- W2951750340 cites W2204520384 @default.
- W2951750340 cites W2770695371 @default.
- W2951750340 cites W2806905826 @default.
- W2951750340 cites W2890022552 @default.
- W2951750340 cites W2900602912 @default.
- W2951750340 cites W2946529464 @default.
- W2951750340 cites W2962695761 @default.
- W2951750340 cites W2962785510 @default.
- W2951750340 cites W2962802563 @default.
- W2951750340 cites W2963395712 @default.
- W2951750340 cites W2963882293 @default.
- W2951750340 cites W2964068481 @default.
- W2951750340 cites W2964271126 @default.
- W2951750340 cites W3011120880 @default.
- W2951750340 cites W2962685150 @default.
- W2951750340 hasPublicationYear "2019" @default.
- W2951750340 type Work @default.
- W2951750340 sameAs 2951750340 @default.
- W2951750340 citedByCount "3" @default.
- W2951750340 countsByYear W29517503402020 @default.
- W2951750340 countsByYear W29517503402021 @default.
- W2951750340 crossrefType "posted-content" @default.
- W2951750340 hasAuthorship W2951750340A5016623854 @default.
- W2951750340 hasAuthorship W2951750340A5038771285 @default.
- W2951750340 hasAuthorship W2951750340A5045518550 @default.
- W2951750340 hasAuthorship W2951750340A5078769916 @default.
- W2951750340 hasAuthorship W2951750340A5084989076 @default.
- W2951750340 hasConcept C102366305 @default.
- W2951750340 hasConcept C105795698 @default.
- W2951750340 hasConcept C106131492 @default.
- W2951750340 hasConcept C117251300 @default.
- W2951750340 hasConcept C119857082 @default.
- W2951750340 hasConcept C136197465 @default.
- W2951750340 hasConcept C140779682 @default.
- W2951750340 hasConcept C149782125 @default.
- W2951750340 hasConcept C154945302 @default.
- W2951750340 hasConcept C185429906 @default.
- W2951750340 hasConcept C24574437 @default.
- W2951750340 hasConcept C2776999362 @default.
- W2951750340 hasConcept C31972630 @default.
- W2951750340 hasConcept C33923547 @default.
- W2951750340 hasConcept C41008148 @default.
- W2951750340 hasConcept C97541855 @default.
- W2951750340 hasConceptScore W2951750340C102366305 @default.
- W2951750340 hasConceptScore W2951750340C105795698 @default.
- W2951750340 hasConceptScore W2951750340C106131492 @default.
- W2951750340 hasConceptScore W2951750340C117251300 @default.
- W2951750340 hasConceptScore W2951750340C119857082 @default.
- W2951750340 hasConceptScore W2951750340C136197465 @default.
- W2951750340 hasConceptScore W2951750340C140779682 @default.
- W2951750340 hasConceptScore W2951750340C149782125 @default.
- W2951750340 hasConceptScore W2951750340C154945302 @default.
- W2951750340 hasConceptScore W2951750340C185429906 @default.
- W2951750340 hasConceptScore W2951750340C24574437 @default.
- W2951750340 hasConceptScore W2951750340C2776999362 @default.
- W2951750340 hasConceptScore W2951750340C31972630 @default.
- W2951750340 hasConceptScore W2951750340C33923547 @default.
- W2951750340 hasConceptScore W2951750340C41008148 @default.
- W2951750340 hasConceptScore W2951750340C97541855 @default.
- W2951750340 hasOpenAccess W2951750340 @default.
- W2951750340 hasRelatedWork W1821791616 @default.
- W2951750340 hasRelatedWork W1974138090 @default.
- W2951750340 hasRelatedWork W2065680606 @default.
- W2951750340 hasRelatedWork W2396377511 @default.
- W2951750340 hasRelatedWork W2474632701 @default.
- W2951750340 hasRelatedWork W2883596676 @default.
- W2951750340 hasRelatedWork W2884285181 @default.
- W2951750340 hasRelatedWork W2908151956 @default.
- W2951750340 hasRelatedWork W2946529464 @default.
- W2951750340 hasRelatedWork W2963677392 @default.
- W2951750340 hasRelatedWork W2967013971 @default.
- W2951750340 hasRelatedWork W2997897824 @default.
- W2951750340 hasRelatedWork W3047577029 @default.
- W2951750340 hasRelatedWork W3104250134 @default.
- W2951750340 hasRelatedWork W3104806396 @default.
- W2951750340 hasRelatedWork W3123056638 @default.
- W2951750340 hasRelatedWork W3127918066 @default.
- W2951750340 hasRelatedWork W3166203810 @default.
- W2951750340 hasRelatedWork W3197167740 @default.
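
Each match line above appears to follow the shape `- SUBJECT PREDICATE OBJECT @GRAPH.`, where the object is either a double-quoted literal or a bare identifier. The Python sketch below shows one way such lines could be parsed and grouped by predicate. It is a minimal illustration under that assumed layout, not a SemOpenAlex tool or API: the names `Quad`, `parse_line`, and `group_by_predicate` are hypothetical, and the sample lines are copied from the listing.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class Quad(NamedTuple):
    """One match line: subject, predicate, object, and named graph (hypothetical type)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed line shape: "- SUBJECT PREDICATE OBJECT @GRAPH."
# OBJECT is either a double-quoted literal or a single bare token.
_LINE = re.compile(
    r'^-\s+(?P<s>\S+)\s+(?P<p>\S+)\s+'
    r'(?P<o>"(?:[^"\\]|\\.)*"|\S+)\s+'
    r'@(?P<g>\S+?)\.?\s*$'
)


def parse_line(line: str) -> Optional[Quad]:
    """Parse one listing line; return None for lines that do not fit the assumed shape."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    obj = match.group("o")
    # Strip the surrounding quotes from literal values.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return Quad(match.group("s"), match.group("p"), obj, match.group("g"))


def group_by_predicate(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect object values per predicate, skipping lines that cannot be parsed."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        quad = parse_line(line)
        if quad is not None:
            grouped[quad.predicate].append(quad.obj)
    return dict(grouped)


# Sample lines copied from the listing; the header line is skipped by the parser.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W2951750340> ?p ?o ?g. }',
    '- W2951750340 created "2019-06-27" @default.',
    '- W2951750340 cites W1514587017 @default.',
    '- W2951750340 cites W1714211023 @default.',
]
records = group_by_predicate(sample)
```

Grouping by predicate keeps repeated properties such as `cites` or `hasRelatedWork` together as lists, while single-valued properties such as `created` become one-element lists.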