Matches in SemOpenAlex for { <https://semopenalex.org/work/W2994798986> ?p ?o ?g. }
- W2994798986 abstract "An important problem that arises in reinforcement learning and Monte Carlo methods is estimating quantities defined by the stationary distribution of a Markov chain. In many real-world applications, access to the underlying transition operator is limited to a fixed set of data that has already been collected, without additional interaction with the environment being available. We show that consistent estimation remains possible in this challenging scenario, and that effective estimation can still be achieved in important applications. Our approach is based on estimating a ratio that corrects for the discrepancy between the stationary and empirical distributions, derived from fundamental properties of the stationary distribution, and exploiting constraint reformulations based on variational divergence minimization. The resulting algorithm, GenDICE, is straightforward and effective. We prove its consistency under general conditions, provide an error analysis, and demonstrate strong empirical performance on benchmark problems, including off-line PageRank and off-policy policy evaluation." @default.
- W2994798986 created "2019-12-26" @default.
- W2994798986 creator A5004051530 @default.
- W2994798986 creator A5010575626 @default.
- W2994798986 creator A5054850777 @default.
- W2994798986 creator A5086484914 @default.
- W2994798986 date "2020-02-20" @default.
- W2994798986 modified "2023-09-23" @default.
- W2994798986 title "GenDICE: Generalized Offline Estimation of Stationary Values" @default.
- W2994798986 cites W1514587017 @default.
- W2994798986 cites W1524012148 @default.
- W2994798986 cites W1646707810 @default.
- W2994798986 cites W1757796397 @default.
- W2994798986 cites W1809653203 @default.
- W2994798986 cites W1854214752 @default.
- W2994798986 cites W1966026565 @default.
- W2994798986 cites W1992208280 @default.
- W2994798986 cites W1995713768 @default.
- W2994798986 cites W2010029425 @default.
- W2994798986 cites W2014478203 @default.
- W2994798986 cites W2045656233 @default.
- W2994798986 cites W2102689555 @default.
- W2994798986 cites W2103851188 @default.
- W2994798986 cites W2113065326 @default.
- W2994798986 cites W2119567691 @default.
- W2994798986 cites W2121863487 @default.
- W2994798986 cites W2122124659 @default.
- W2994798986 cites W2135194391 @default.
- W2994798986 cites W2139418546 @default.
- W2994798986 cites W2169269897 @default.
- W2994798986 cites W2173248099 @default.
- W2994798986 cites W2212660284 @default.
- W2994798986 cites W2266946488 @default.
- W2994798986 cites W2328111639 @default.
- W2994798986 cites W2339794156 @default.
- W2994798986 cites W2557046619 @default.
- W2994798986 cites W2583400562 @default.
- W2994798986 cites W2751325639 @default.
- W2994798986 cites W2806985155 @default.
- W2994798986 cites W2900976719 @default.
- W2994798986 cites W2904789544 @default.
- W2994798986 cites W2911319680 @default.
- W2994798986 cites W2947996939 @default.
- W2994798986 cites W2949930907 @default.
- W2994798986 cites W2962802563 @default.
- W2994798986 cites W2962879692 @default.
- W2994798986 cites W2962892300 @default.
- W2994798986 cites W2963323139 @default.
- W2994798986 cites W2963534251 @default.
- W2994798986 cites W2963565380 @default.
- W2994798986 cites W2963800509 @default.
- W2994798986 cites W2963873275 @default.
- W2994798986 cites W2964068481 @default.
- W2994798986 cites W2971026276 @default.
- W2994798986 cites W3150304496 @default.
- W2994798986 cites W607505555 @default.
- W2994798986 cites W621546036 @default.
- W2994798986 doi "https://doi.org/10.48550/arxiv.2002.09072" @default.
- W2994798986 hasPublicationYear "2020" @default.
- W2994798986 type Work @default.
- W2994798986 sameAs 2994798986 @default.
- W2994798986 citedByCount "36" @default.
- W2994798986 countsByYear W29947989862019 @default.
- W2994798986 countsByYear W29947989862020 @default.
- W2994798986 countsByYear W29947989862021 @default.
- W2994798986 crossrefType "posted-content" @default.
- W2994798986 hasAuthorship W2994798986A5004051530 @default.
- W2994798986 hasAuthorship W2994798986A5010575626 @default.
- W2994798986 hasAuthorship W2994798986A5054850777 @default.
- W2994798986 hasAuthorship W2994798986A5086484914 @default.
- W2994798986 hasBestOaLocation W29947989861 @default.
- W2994798986 hasConcept C110121322 @default.
- W2994798986 hasConcept C119857082 @default.
- W2994798986 hasConcept C126255220 @default.
- W2994798986 hasConcept C13280743 @default.
- W2994798986 hasConcept C134306372 @default.
- W2994798986 hasConcept C138885662 @default.
- W2994798986 hasConcept C154945302 @default.
- W2994798986 hasConcept C177264268 @default.
- W2994798986 hasConcept C185798385 @default.
- W2994798986 hasConcept C199360897 @default.
- W2994798986 hasConcept C205649164 @default.
- W2994798986 hasConcept C207390915 @default.
- W2994798986 hasConcept C2524010 @default.
- W2994798986 hasConcept C2776036281 @default.
- W2994798986 hasConcept C2776436953 @default.
- W2994798986 hasConcept C28826006 @default.
- W2994798986 hasConcept C33923547 @default.
- W2994798986 hasConcept C41008148 @default.
- W2994798986 hasConcept C41895202 @default.
- W2994798986 hasConcept C97541855 @default.
- W2994798986 hasConcept C98763669 @default.
- W2994798986 hasConcept C98951983 @default.
- W2994798986 hasConceptScore W2994798986C110121322 @default.
- W2994798986 hasConceptScore W2994798986C119857082 @default.
- W2994798986 hasConceptScore W2994798986C126255220 @default.
- W2994798986 hasConceptScore W2994798986C13280743 @default.
- W2994798986 hasConceptScore W2994798986C134306372 @default.
- W2994798986 hasConceptScore W2994798986C138885662 @default.
- W2994798986 hasConceptScore W2994798986C154945302 @default.