Matches in SemOpenAlex for { <https://semopenalex.org/work/W2963080043> ?p ?o ?g. }
- W2963080043 endingPage "3023" @default.
- W2963080043 startingPage "3016" @default.
- W2963080043 abstract "This paper describes a new information-theoretic policy evaluation technique for reinforcement learning. This technique converts any compression or density model into a corresponding estimate of value. Under appropriate stationarity and ergodicity conditions, we show that the use of a sufficiently powerful model gives rise to a consistent value function estimator. We also study the behavior of this technique when applied to various Atari 2600 video games, where the use of suboptimal modeling techniques is unavoidable. We consider three fundamentally different models, all too limited to perfectly model the dynamics of the system. Remarkably, we find that our technique provides sufficiently accurate value estimates for effective on-policy control. We conclude with a suggestive study highlighting the potential of our technique to scale to large problems." @default.
- W2963080043 created "2019-07-30" @default.
- W2963080043 creator A5001087292 @default.
- W2963080043 creator A5028585856 @default.
- W2963080043 creator A5060709021 @default.
- W2963080043 creator A5073944062 @default.
- W2963080043 creator A5089648583 @default.
- W2963080043 date "2015-01-25" @default.
- W2963080043 modified "2023-10-18" @default.
- W2963080043 title "Compress and control" @default.
- W2963080043 cites W107054272 @default.
- W2963080043 cites W136985510 @default.
- W2963080043 cites W1506806321 @default.
- W2963080043 cites W1515308897 @default.
- W2963080043 cites W1515851193 @default.
- W2963080043 cites W1528086278 @default.
- W2963080043 cites W1576452626 @default.
- W2963080043 cites W1601081659 @default.
- W2963080043 cites W1638203394 @default.
- W2963080043 cites W1757796397 @default.
- W2963080043 cites W1922608396 @default.
- W2963080043 cites W2013391942 @default.
- W2963080043 cites W2028145673 @default.
- W2963080043 cites W2064175235 @default.
- W2963080043 cites W2073384958 @default.
- W2963080043 cites W2094342172 @default.
- W2963080043 cites W2097998348 @default.
- W2963080043 cites W2099111195 @default.
- W2963080043 cites W2104290684 @default.
- W2963080043 cites W2105236698 @default.
- W2963080043 cites W2105474305 @default.
- W2963080043 cites W2107745473 @default.
- W2963080043 cites W2108147145 @default.
- W2963080043 cites W2114202040 @default.
- W2963080043 cites W2117667273 @default.
- W2963080043 cites W2119984390 @default.
- W2963080043 cites W2121863487 @default.
- W2963080043 cites W2123372395 @default.
- W2963080043 cites W2128859735 @default.
- W2963080043 cites W2129652681 @default.
- W2963080043 cites W2136065708 @default.
- W2963080043 cites W2137509429 @default.
- W2963080043 cites W2146502635 @default.
- W2963080043 cites W2148029210 @default.
- W2963080043 cites W2149418961 @default.
- W2963080043 cites W2157477959 @default.
- W2963080043 cites W2158191646 @default.
- W2963080043 cites W2160484851 @default.
- W2963080043 cites W2163294786 @default.
- W2963080043 cites W2171886309 @default.
- W2963080043 cites W2404689820 @default.
- W2963080043 cites W2408670836 @default.
- W2963080043 cites W2621280964 @default.
- W2963080043 cites W2964050205 @default.
- W2963080043 hasPublicationYear "2015" @default.
- W2963080043 type Work @default.
- W2963080043 sameAs 2963080043 @default.
- W2963080043 citedByCount "18" @default.
- W2963080043 countsByYear W29630800432015 @default.
- W2963080043 countsByYear W29630800432016 @default.
- W2963080043 countsByYear W29630800432017 @default.
- W2963080043 countsByYear W29630800432018 @default.
- W2963080043 countsByYear W29630800432019 @default.
- W2963080043 countsByYear W29630800432020 @default.
- W2963080043 countsByYear W29630800432021 @default.
- W2963080043 crossrefType "proceedings-article" @default.
- W2963080043 hasAuthorship W2963080043A5001087292 @default.
- W2963080043 hasAuthorship W2963080043A5028585856 @default.
- W2963080043 hasAuthorship W2963080043A5060709021 @default.
- W2963080043 hasAuthorship W2963080043A5073944062 @default.
- W2963080043 hasAuthorship W2963080043A5089648583 @default.
- W2963080043 hasConcept C105795698 @default.
- W2963080043 hasConcept C119857082 @default.
- W2963080043 hasConcept C121332964 @default.
- W2963080043 hasConcept C126255220 @default.
- W2963080043 hasConcept C14036430 @default.
- W2963080043 hasConcept C14646407 @default.
- W2963080043 hasConcept C154945302 @default.
- W2963080043 hasConcept C185429906 @default.
- W2963080043 hasConcept C201779956 @default.
- W2963080043 hasConcept C2775924081 @default.
- W2963080043 hasConcept C2776291640 @default.
- W2963080043 hasConcept C2778755073 @default.
- W2963080043 hasConcept C33923547 @default.
- W2963080043 hasConcept C41008148 @default.
- W2963080043 hasConcept C62520636 @default.
- W2963080043 hasConcept C78458016 @default.
- W2963080043 hasConcept C86803240 @default.
- W2963080043 hasConcept C97541855 @default.
- W2963080043 hasConceptScore W2963080043C105795698 @default.
- W2963080043 hasConceptScore W2963080043C119857082 @default.
- W2963080043 hasConceptScore W2963080043C121332964 @default.
- W2963080043 hasConceptScore W2963080043C126255220 @default.
- W2963080043 hasConceptScore W2963080043C14036430 @default.
- W2963080043 hasConceptScore W2963080043C14646407 @default.
- W2963080043 hasConceptScore W2963080043C154945302 @default.
- W2963080043 hasConceptScore W2963080043C185429906 @default.
- W2963080043 hasConceptScore W2963080043C201779956 @default.