Matches in SemOpenAlex for { <https://semopenalex.org/work/W3213048145> ?p ?o ?g. }
- W3213048145 abstract "Model-based reinforcement learning (RL) refers to an approximate optimal control design for infinite-horizon (IH) problems that aims at approximating the optimal IH controller and associated cost parametrically. In online RL, the training process of the respective approximators is performed along the de facto system trajectory (potentially in addition to offline data). While there exist stability results for online RL, the IH controller performance has been addressed only in a fragmentary way, rarely considering the parametric and error-prone nature of the approximation explicitly, even in the model-based case. To assess the performance for such a case, this work utilizes a model predictive control framework to mimic an online RL controller. More precisely, the optimization-based controller is associated with an online-adapted approximate cost which serves as a terminal cost function. The results include a stability and performance estimate statement for the control and training scheme and demonstrate the dependence of the controller's performance bound on the error resulting from parameterized cost approximation." @default.
- W3213048145 created "2021-11-22" @default.
- W3213048145 creator A5050202669 @default.
- W3213048145 creator A5067098516 @default.
- W3213048145 date "2021-11-16" @default.
- W3213048145 modified "2023-09-27" @default.
- W3213048145 title "A Performance Bound for Model Based Online Reinforcement Learning." @default.
- W3213048145 cites W1631623342 @default.
- W3213048145 cites W173093232 @default.
- W3213048145 cites W2005675298 @default.
- W3213048145 cites W2024122443 @default.
- W3213048145 cites W2038341035 @default.
- W3213048145 cites W2039439610 @default.
- W3213048145 cites W2048687352 @default.
- W3213048145 cites W2062457326 @default.
- W3213048145 cites W2062915697 @default.
- W3213048145 cites W2073787051 @default.
- W3213048145 cites W2081522185 @default.
- W3213048145 cites W2082900542 @default.
- W3213048145 cites W2093831009 @default.
- W3213048145 cites W2096164155 @default.
- W3213048145 cites W2106768170 @default.
- W3213048145 cites W2130599357 @default.
- W3213048145 cites W2151088727 @default.
- W3213048145 cites W2165545652 @default.
- W3213048145 cites W2165726932 @default.
- W3213048145 cites W2396444450 @default.
- W3213048145 cites W2476930474 @default.
- W3213048145 cites W2492516341 @default.
- W3213048145 cites W2516720609 @default.
- W3213048145 cites W2526693657 @default.
- W3213048145 cites W2594780386 @default.
- W3213048145 cites W2616226010 @default.
- W3213048145 cites W2734541791 @default.
- W3213048145 cites W2900806034 @default.
- W3213048145 cites W2901415045 @default.
- W3213048145 cites W2926270496 @default.
- W3213048145 cites W2930426397 @default.
- W3213048145 cites W2942517366 @default.
- W3213048145 cites W2963683522 @default.
- W3213048145 cites W2974060888 @default.
- W3213048145 cites W3092727668 @default.
- W3213048145 cites W3101479673 @default.
- W3213048145 cites W3107776766 @default.
- W3213048145 cites W3129896193 @default.
- W3213048145 cites W3147053055 @default.
- W3213048145 cites W3200157983 @default.
- W3213048145 cites W3200750397 @default.
- W3213048145 hasPublicationYear "2021" @default.
- W3213048145 type Work @default.
- W3213048145 sameAs 3213048145 @default.
- W3213048145 citedByCount "0" @default.
- W3213048145 crossrefType "posted-content" @default.
- W3213048145 hasAuthorship W3213048145A5050202669 @default.
- W3213048145 hasAuthorship W3213048145A5067098516 @default.
- W3213048145 hasConcept C105795698 @default.
- W3213048145 hasConcept C111919701 @default.
- W3213048145 hasConcept C112972136 @default.
- W3213048145 hasConcept C11413529 @default.
- W3213048145 hasConcept C117251300 @default.
- W3213048145 hasConcept C119857082 @default.
- W3213048145 hasConcept C121332964 @default.
- W3213048145 hasConcept C122383733 @default.
- W3213048145 hasConcept C126255220 @default.
- W3213048145 hasConcept C1276947 @default.
- W3213048145 hasConcept C134306372 @default.
- W3213048145 hasConcept C13662910 @default.
- W3213048145 hasConcept C14036430 @default.
- W3213048145 hasConcept C154945302 @default.
- W3213048145 hasConcept C165464430 @default.
- W3213048145 hasConcept C203479927 @default.
- W3213048145 hasConcept C2775924081 @default.
- W3213048145 hasConcept C33923547 @default.
- W3213048145 hasConcept C41008148 @default.
- W3213048145 hasConcept C47446073 @default.
- W3213048145 hasConcept C6557445 @default.
- W3213048145 hasConcept C77553402 @default.
- W3213048145 hasConcept C78458016 @default.
- W3213048145 hasConcept C86803240 @default.
- W3213048145 hasConcept C91575142 @default.
- W3213048145 hasConcept C97541855 @default.
- W3213048145 hasConcept C98045186 @default.
- W3213048145 hasConceptScore W3213048145C105795698 @default.
- W3213048145 hasConceptScore W3213048145C111919701 @default.
- W3213048145 hasConceptScore W3213048145C112972136 @default.
- W3213048145 hasConceptScore W3213048145C11413529 @default.
- W3213048145 hasConceptScore W3213048145C117251300 @default.
- W3213048145 hasConceptScore W3213048145C119857082 @default.
- W3213048145 hasConceptScore W3213048145C121332964 @default.
- W3213048145 hasConceptScore W3213048145C122383733 @default.
- W3213048145 hasConceptScore W3213048145C126255220 @default.
- W3213048145 hasConceptScore W3213048145C1276947 @default.
- W3213048145 hasConceptScore W3213048145C134306372 @default.
- W3213048145 hasConceptScore W3213048145C13662910 @default.
- W3213048145 hasConceptScore W3213048145C14036430 @default.
- W3213048145 hasConceptScore W3213048145C154945302 @default.
- W3213048145 hasConceptScore W3213048145C165464430 @default.
- W3213048145 hasConceptScore W3213048145C203479927 @default.
- W3213048145 hasConceptScore W3213048145C2775924081 @default.
- W3213048145 hasConceptScore W3213048145C33923547 @default.