Matches in SemOpenAlex for { <https://semopenalex.org/work/W3204373176> ?p ?o ?g. }
- W3204373176 abstract "When the sizes of the state and action spaces are large, solving MDPs can be computationally prohibitive even if the probability transition matrix is known. So in practice, a number of techniques are used to approximately solve the dynamic programming problem, including lookahead, approximate policy evaluation using an m-step return, and function approximation. In a recent paper, Efroni et al. (2019) studied the impact of lookahead on the convergence rate of approximate dynamic programming. In this paper, we show that these convergence results change dramatically when function approximation is used in conjunction with lookahead and approximate policy evaluation using an m-step return. Specifically, we show that when linear function approximation is used to represent the value function, a certain minimum amount of lookahead and multi-step return is needed for the algorithm to even converge. And when this condition is met, we characterize the finite-time performance of policies obtained using such approximate policy iteration. Our results are presented for two different procedures to compute the function approximation: linear least-squares regression and gradient descent." @default.
- W3204373176 created "2021-10-11" @default.
- W3204373176 creator A5049891134 @default.
- W3204373176 creator A5052203465 @default.
- W3204373176 creator A5058857221 @default.
- W3204373176 creator A5078518595 @default.
- W3204373176 date "2021-09-28" @default.
- W3204373176 modified "2023-09-27" @default.
- W3204373176 title "The Role of Lookahead and Approximate Policy Evaluation in Policy Iteration with Linear Value Function Approximation." @default.
- W3204373176 cites W1576452626 @default.
- W3204373176 cites W1625390266 @default.
- W3204373176 cites W1837062119 @default.
- W3204373176 cites W1857660445 @default.
- W3204373176 cites W2020609518 @default.
- W3204373176 cites W2029572069 @default.
- W3204373176 cites W2114735315 @default.
- W3204373176 cites W2121863487 @default.
- W3204373176 cites W2124477018 @default.
- W3204373176 cites W2126316555 @default.
- W3204373176 cites W2131420237 @default.
- W3204373176 cites W2169982856 @default.
- W3204373176 cites W2766447205 @default.
- W3204373176 cites W2772709170 @default.
- W3204373176 cites W2803533594 @default.
- W3204373176 cites W2809090039 @default.
- W3204373176 cites W2962806059 @default.
- W3204373176 cites W2963763772 @default.
- W3204373176 cites W2964043796 @default.
- W3204373176 cites W2970332347 @default.
- W3204373176 cites W3034377861 @default.
- W3204373176 cites W3037396296 @default.
- W3204373176 cites W3040891685 @default.
- W3204373176 cites W3092183849 @default.
- W3204373176 cites W3103253567 @default.
- W3204373176 cites W3162026808 @default.
- W3204373176 cites W3193388933 @default.
- W3204373176 hasPublicationYear "2021" @default.
- W3204373176 type Work @default.
- W3204373176 sameAs 3204373176 @default.
- W3204373176 citedByCount "0" @default.
- W3204373176 crossrefType "posted-content" @default.
- W3204373176 hasAuthorship W3204373176A5049891134 @default.
- W3204373176 hasAuthorship W3204373176A5052203465 @default.
- W3204373176 hasAuthorship W3204373176A5058857221 @default.
- W3204373176 hasAuthorship W3204373176A5078518595 @default.
- W3204373176 hasConcept C105795698 @default.
- W3204373176 hasConcept C106189395 @default.
- W3204373176 hasConcept C119857082 @default.
- W3204373176 hasConcept C121332964 @default.
- W3204373176 hasConcept C126255220 @default.
- W3204373176 hasConcept C127162648 @default.
- W3204373176 hasConcept C14036430 @default.
- W3204373176 hasConcept C14646407 @default.
- W3204373176 hasConcept C148764684 @default.
- W3204373176 hasConcept C158622935 @default.
- W3204373176 hasConcept C159886148 @default.
- W3204373176 hasConcept C160824197 @default.
- W3204373176 hasConcept C162324750 @default.
- W3204373176 hasConcept C2777303404 @default.
- W3204373176 hasConcept C28826006 @default.
- W3204373176 hasConcept C31258907 @default.
- W3204373176 hasConcept C33923547 @default.
- W3204373176 hasConcept C37404715 @default.
- W3204373176 hasConcept C41008148 @default.
- W3204373176 hasConcept C41045048 @default.
- W3204373176 hasConcept C50522688 @default.
- W3204373176 hasConcept C50644808 @default.
- W3204373176 hasConcept C57869625 @default.
- W3204373176 hasConcept C62520636 @default.
- W3204373176 hasConcept C78458016 @default.
- W3204373176 hasConcept C86803240 @default.
- W3204373176 hasConcept C91873725 @default.
- W3204373176 hasConceptScore W3204373176C105795698 @default.
- W3204373176 hasConceptScore W3204373176C106189395 @default.
- W3204373176 hasConceptScore W3204373176C119857082 @default.
- W3204373176 hasConceptScore W3204373176C121332964 @default.
- W3204373176 hasConceptScore W3204373176C126255220 @default.
- W3204373176 hasConceptScore W3204373176C127162648 @default.
- W3204373176 hasConceptScore W3204373176C14036430 @default.
- W3204373176 hasConceptScore W3204373176C14646407 @default.
- W3204373176 hasConceptScore W3204373176C148764684 @default.
- W3204373176 hasConceptScore W3204373176C158622935 @default.
- W3204373176 hasConceptScore W3204373176C159886148 @default.
- W3204373176 hasConceptScore W3204373176C160824197 @default.
- W3204373176 hasConceptScore W3204373176C162324750 @default.
- W3204373176 hasConceptScore W3204373176C2777303404 @default.
- W3204373176 hasConceptScore W3204373176C28826006 @default.
- W3204373176 hasConceptScore W3204373176C31258907 @default.
- W3204373176 hasConceptScore W3204373176C33923547 @default.
- W3204373176 hasConceptScore W3204373176C37404715 @default.
- W3204373176 hasConceptScore W3204373176C41008148 @default.
- W3204373176 hasConceptScore W3204373176C41045048 @default.
- W3204373176 hasConceptScore W3204373176C50522688 @default.
- W3204373176 hasConceptScore W3204373176C50644808 @default.
- W3204373176 hasConceptScore W3204373176C57869625 @default.
- W3204373176 hasConceptScore W3204373176C62520636 @default.
- W3204373176 hasConceptScore W3204373176C78458016 @default.
- W3204373176 hasConceptScore W3204373176C86803240 @default.
- W3204373176 hasConceptScore W3204373176C91873725 @default.
- W3204373176 hasLocation W32043731761 @default.