Matches in SemOpenAlex for { <https://semopenalex.org/work/W2904188377> ?p ?o ?g. }
- W2904188377 abstract "The effectiveness of model-based versus model-free methods is a long-standing question in reinforcement learning (RL). Motivated by recent empirical success of RL on continuous control tasks, we study the sample complexity of popular model-based and model-free algorithms on the Linear Quadratic Regulator (LQR). We show that for policy evaluation, a simple model-based plugin method requires asymptotically less samples than the classical least-squares temporal difference (LSTD) estimator to reach the same quality of solution; the sample complexity gap between the two methods can be at least a factor of state dimension. For policy optimization, we study a simple family of problem instances and show that nominal (certainty equivalence principle) control also requires several factors of state and input dimension fewer samples than the policy gradient method to reach the same level of control performance on these instances. Furthermore, the gap persists even when employing commonly used baselines. To the best of our knowledge, this is the first theoretical result which demonstrates a separation in the sample complexity between model-based and model-free methods on a continuous control task." @default.
- W2904188377 created "2018-12-22" @default.
- W2904188377 creator A5012870568 @default.
- W2904188377 creator A5047090527 @default.
- W2904188377 date "2018-12-09" @default.
- W2904188377 modified "2023-09-27" @default.
- W2904188377 title "The Gap Between Model-Based and Model-Free Methods on the Linear Quadratic Regulator: An Asymptotic Viewpoint" @default.
- W2904188377 cites W1492933304 @default.
- W2904188377 cites W1499021337 @default.
- W2904188377 cites W1597303641 @default.
- W2904188377 cites W1850488217 @default.
- W2904188377 cites W1969276875 @default.
- W2904188377 cites W1996484339 @default.
- W2904188377 cites W2003686272 @default.
- W2904188377 cites W2035702014 @default.
- W2904188377 cites W2056099894 @default.
- W2904188377 cites W2072931156 @default.
- W2904188377 cites W2082210980 @default.
- W2904188377 cites W2106606323 @default.
- W2904188377 cites W2112269233 @default.
- W2904188377 cites W2119717200 @default.
- W2904188377 cites W2123447947 @default.
- W2904188377 cites W2125612430 @default.
- W2904188377 cites W2129670787 @default.
- W2904188377 cites W2130005627 @default.
- W2904188377 cites W2149479912 @default.
- W2904188377 cites W2275844880 @default.
- W2904188377 cites W2321541861 @default.
- W2904188377 cites W2525518963 @default.
- W2904188377 cites W2596367596 @default.
- W2904188377 cites W2604209058 @default.
- W2904188377 cites W2761923184 @default.
- W2904188377 cites W2768425162 @default.
- W2904188377 cites W2769648743 @default.
- W2904188377 cites W2783297289 @default.
- W2904188377 cites W2789525339 @default.
- W2904188377 cites W2804569092 @default.
- W2904188377 cites W2886474253 @default.
- W2904188377 cites W2892230114 @default.
- W2904188377 cites W2901345551 @default.
- W2904188377 cites W2949754295 @default.
- W2904188377 cites W2951222758 @default.
- W2904188377 cites W2962861921 @default.
- W2904188377 cites W2962872206 @default.
- W2904188377 cites W2963030777 @default.
- W2904188377 cites W2963049774 @default.
- W2904188377 cites W2963184621 @default.
- W2904188377 cites W2963412706 @default.
- W2904188377 cites W2963681938 @default.
- W2904188377 cites W2964036701 @default.
- W2904188377 cites W2964054583 @default.
- W2904188377 cites W2964084913 @default.
- W2904188377 cites W3008744877 @default.
- W2904188377 cites W3202414037 @default.
- W2904188377 cites W53582479 @default.
- W2904188377 hasPublicationYear "2018" @default.
- W2904188377 type Work @default.
- W2904188377 sameAs 2904188377 @default.
- W2904188377 citedByCount "5" @default.
- W2904188377 countsByYear W29041883772018 @default.
- W2904188377 countsByYear W29041883772019 @default.
- W2904188377 countsByYear W29041883772020 @default.
- W2904188377 crossrefType "posted-content" @default.
- W2904188377 hasAuthorship W2904188377A5012870568 @default.
- W2904188377 hasAuthorship W2904188377A5047090527 @default.
- W2904188377 hasConcept C105795698 @default.
- W2904188377 hasConcept C118615104 @default.
- W2904188377 hasConcept C119857082 @default.
- W2904188377 hasConcept C126255220 @default.
- W2904188377 hasConcept C129844170 @default.
- W2904188377 hasConcept C154945302 @default.
- W2904188377 hasConcept C163175372 @default.
- W2904188377 hasConcept C185429906 @default.
- W2904188377 hasConcept C185592680 @default.
- W2904188377 hasConcept C198531522 @default.
- W2904188377 hasConcept C202444582 @default.
- W2904188377 hasConcept C2524010 @default.
- W2904188377 hasConcept C2780069185 @default.
- W2904188377 hasConcept C33676613 @default.
- W2904188377 hasConcept C33923547 @default.
- W2904188377 hasConcept C41008148 @default.
- W2904188377 hasConcept C43617362 @default.
- W2904188377 hasConcept C91575142 @default.
- W2904188377 hasConcept C97541855 @default.
- W2904188377 hasConcept C98779006 @default.
- W2904188377 hasConceptScore W2904188377C105795698 @default.
- W2904188377 hasConceptScore W2904188377C118615104 @default.
- W2904188377 hasConceptScore W2904188377C119857082 @default.
- W2904188377 hasConceptScore W2904188377C126255220 @default.
- W2904188377 hasConceptScore W2904188377C129844170 @default.
- W2904188377 hasConceptScore W2904188377C154945302 @default.
- W2904188377 hasConceptScore W2904188377C163175372 @default.
- W2904188377 hasConceptScore W2904188377C185429906 @default.
- W2904188377 hasConceptScore W2904188377C185592680 @default.
- W2904188377 hasConceptScore W2904188377C198531522 @default.
- W2904188377 hasConceptScore W2904188377C202444582 @default.
- W2904188377 hasConceptScore W2904188377C2524010 @default.
- W2904188377 hasConceptScore W2904188377C2780069185 @default.
- W2904188377 hasConceptScore W2904188377C33676613 @default.
- W2904188377 hasConceptScore W2904188377C33923547 @default.