Matches in SemOpenAlex for { <https://semopenalex.org/work/W2625967765> ?p ?o ?g. }
Showing items 1 to 84 of 84 with 100 items per page.
- W2625967765 abstract "We consider the task of evaluating a policy for a Markov decision process (MDP). The standard unbiased technique for evaluating a policy is to deploy the policy and observe its performance. We show that the data collected from deploying a different policy, commonly called the behavior policy, can be used to produce unbiased estimates with lower mean squared error than this standard technique. We derive an analytic expression for the optimal behavior policy --- the behavior policy that minimizes the mean squared error of the resulting estimates. Because this expression depends on terms that are unknown in practice, we propose a novel policy evaluation sub-problem, behavior policy search: searching for a behavior policy that reduces mean squared error. We present a behavior policy search algorithm and empirically demonstrate its effectiveness in lowering the mean squared error of policy performance estimates." @default.
- W2625967765 created "2017-06-23" @default.
- W2625967765 creator A5001594330 @default.
- W2625967765 creator A5008014974 @default.
- W2625967765 creator A5043572737 @default.
- W2625967765 creator A5066332280 @default.
- W2625967765 date "2017-06-12" @default.
- W2625967765 modified "2023-10-12" @default.
- W2625967765 title "Data-Efficient Policy Evaluation Through Behavior Policy Search" @default.
- W2625967765 cites W1514587017 @default.
- W2625967765 cites W1557189666 @default.
- W2625967765 cites W1971942712 @default.
- W2625967765 cites W2098152875 @default.
- W2625967765 cites W2139515132 @default.
- W2625967765 cites W2145901173 @default.
- W2625967765 cites W2155027007 @default.
- W2625967765 cites W2173248099 @default.
- W2625967765 cites W2188892596 @default.
- W2625967765 cites W2205490832 @default.
- W2625967765 cites W2275802500 @default.
- W2625967765 cites W2312609093 @default.
- W2625967765 cites W2342662072 @default.
- W2625967765 cites W2489939061 @default.
- W2625967765 cites W2799825833 @default.
- W2625967765 cites W2949608212 @default.
- W2625967765 cites W2962802563 @default.
- W2625967765 hasPublicationYear "2017" @default.
- W2625967765 type Work @default.
- W2625967765 sameAs 2625967765 @default.
- W2625967765 citedByCount "0" @default.
- W2625967765 crossrefType "posted-content" @default.
- W2625967765 hasAuthorship W2625967765A5001594330 @default.
- W2625967765 hasAuthorship W2625967765A5008014974 @default.
- W2625967765 hasAuthorship W2625967765A5043572737 @default.
- W2625967765 hasAuthorship W2625967765A5066332280 @default.
- W2625967765 hasConcept C105795698 @default.
- W2625967765 hasConcept C106189395 @default.
- W2625967765 hasConcept C126255220 @default.
- W2625967765 hasConcept C139945424 @default.
- W2625967765 hasConcept C149782125 @default.
- W2625967765 hasConcept C159886148 @default.
- W2625967765 hasConcept C162324750 @default.
- W2625967765 hasConcept C187736073 @default.
- W2625967765 hasConcept C2780451532 @default.
- W2625967765 hasConcept C33923547 @default.
- W2625967765 hasConcept C41008148 @default.
- W2625967765 hasConceptScore W2625967765C105795698 @default.
- W2625967765 hasConceptScore W2625967765C106189395 @default.
- W2625967765 hasConceptScore W2625967765C126255220 @default.
- W2625967765 hasConceptScore W2625967765C139945424 @default.
- W2625967765 hasConceptScore W2625967765C149782125 @default.
- W2625967765 hasConceptScore W2625967765C159886148 @default.
- W2625967765 hasConceptScore W2625967765C162324750 @default.
- W2625967765 hasConceptScore W2625967765C187736073 @default.
- W2625967765 hasConceptScore W2625967765C2780451532 @default.
- W2625967765 hasConceptScore W2625967765C33923547 @default.
- W2625967765 hasConceptScore W2625967765C41008148 @default.
- W2625967765 hasLocation W26259677651 @default.
- W2625967765 hasOpenAccess W2625967765 @default.
- W2625967765 hasPrimaryLocation W26259677651 @default.
- W2625967765 hasRelatedWork W1987434446 @default.
- W2625967765 hasRelatedWork W1991148469 @default.
- W2625967765 hasRelatedWork W199434865 @default.
- W2625967765 hasRelatedWork W2159223630 @default.
- W2625967765 hasRelatedWork W2216766766 @default.
- W2625967765 hasRelatedWork W2770510790 @default.
- W2625967765 hasRelatedWork W2955236875 @default.
- W2625967765 hasRelatedWork W2963424548 @default.
- W2625967765 hasRelatedWork W3035211893 @default.
- W2625967765 hasRelatedWork W3037286161 @default.
- W2625967765 hasRelatedWork W3047623911 @default.
- W2625967765 hasRelatedWork W3085768637 @default.
- W2625967765 hasRelatedWork W3090052039 @default.
- W2625967765 hasRelatedWork W3122285184 @default.
- W2625967765 hasRelatedWork W3124288898 @default.
- W2625967765 hasRelatedWork W3140013529 @default.
- W2625967765 hasRelatedWork W3144971854 @default.
- W2625967765 hasRelatedWork W3156999455 @default.
- W2625967765 hasRelatedWork W3197946684 @default.
- W2625967765 hasRelatedWork W3210251473 @default.
- W2625967765 isParatext "false" @default.
- W2625967765 isRetracted "false" @default.
- W2625967765 magId "2625967765" @default.
- W2625967765 workType "article" @default.
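
The abstract above states that data collected under a different behavior policy can give unbiased estimates with lower mean squared error than deploying the evaluated policy itself. A minimal sketch of that idea follows, using ordinary importance sampling on a hypothetical one-step MDP with two actions. The rewards, the policy values, and the function name `is_estimator_moments` are all assumptions made for illustration, and the behavior policy proportional to pi_e(a) * |r(a)| is the textbook minimum-variance importance sampling proposal, not necessarily the expression derived in the paper. Exact fractions are used so that mean and variance are computed analytically rather than sampled.

```python
from fractions import Fraction

# Hypothetical one-step MDP: two actions with fixed, known rewards (assumed values).
REWARDS = {"left": Fraction(1), "right": Fraction(10)}


def is_estimator_moments(pi_e, pi_b, rewards=REWARDS):
    """Exact mean and variance of one importance-sampled return.

    The action is drawn from the behavior policy pi_b and the reward is
    reweighted by pi_e(a) / pi_b(a), which is ordinary importance sampling.
    """
    weighted = {a: (pi_e[a] / pi_b[a]) * rewards[a] for a in rewards}
    mean = sum(pi_b[a] * weighted[a] for a in rewards)
    second_moment = sum(pi_b[a] * weighted[a] ** 2 for a in rewards)
    return mean, second_moment - mean ** 2


# Evaluation policy: the policy whose expected reward we want to estimate.
pi_e = {"left": Fraction(1, 2), "right": Fraction(1, 2)}

# Behavior policy proportional to pi_e(a) * |r(a)| (textbook choice, see lead-in).
norm = sum(pi_e[a] * abs(REWARDS[a]) for a in REWARDS)
pi_b = {a: pi_e[a] * abs(REWARDS[a]) / norm for a in REWARDS}

# On-policy sampling is the special case pi_b == pi_e (all weights equal 1).
on_policy_mean, on_policy_var = is_estimator_moments(pi_e, pi_e)
off_policy_mean, off_policy_var = is_estimator_moments(pi_e, pi_b)

print("on-policy :", on_policy_mean, on_policy_var)
print("off-policy:", off_policy_mean, off_policy_var)
```

Both estimators have the same mean, so both are unbiased, but in this toy case the reweighted estimator has zero variance. Building that behavior policy required knowing the rewards in advance, which mirrors the abstract's remark that the optimal behavior policy depends on terms unknown in practice and therefore has to be searched for.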