Matches in SemOpenAlex for { <https://semopenalex.org/work/W1863485266> ?p ?o ?g. }
- W1863485266 endingPage "158" @default.
- W1863485266 startingPage "150" @default.
- W1863485266 abstract "The constrained optimal control problem depends on the solution of the complicated Hamilton-Jacobi-Bellman equation (HJBE). In this paper, a data-based off-policy reinforcement learning (RL) method is proposed, which learns the solution of the HJBE and the optimal control policy from real system data. One important feature of the off-policy RL is that its policy evaluation can be realized with data generated by other behavior policies, not necessarily the target policy, which solves the insufficient exploration problem. The convergence of the off-policy RL is proved by demonstrating its equivalence to the successive approximation approach. Its implementation procedure is based on the actor-critic neural networks structure, where the function approximation is conducted with linearly independent basis functions. Subsequently, the convergence of the implementation procedure with function approximation is also proved. Finally, its effectiveness is verified through computer simulations." @default.
- W1863485266 created "2016-06-24" @default.
- W1863485266 creator A5000266144 @default.
- W1863485266 creator A5004057129 @default.
- W1863485266 creator A5012004938 @default.
- W1863485266 creator A5074290686 @default.
- W1863485266 date "2015-11-01" @default.
- W1863485266 modified "2023-10-16" @default.
- W1863485266 title "Reinforcement learning solution for HJB equation arising in constrained optimal control problem" @default.
- W1863485266 cites W1531602929 @default.
- W1863485266 cites W1972243698 @default.
- W1863485266 cites W1972809999 @default.
- W1863485266 cites W1977237536 @default.
- W1863485266 cites W1978010349 @default.
- W1863485266 cites W1978732706 @default.
- W1863485266 cites W1983523797 @default.
- W1863485266 cites W1987657756 @default.
- W1863485266 cites W1988523476 @default.
- W1863485266 cites W1999678919 @default.
- W1863485266 cites W2010152647 @default.
- W1863485266 cites W2012451615 @default.
- W1863485266 cites W2024303516 @default.
- W1863485266 cites W2046386580 @default.
- W1863485266 cites W2047090868 @default.
- W1863485266 cites W2048687352 @default.
- W1863485266 cites W2077195478 @default.
- W1863485266 cites W2081514674 @default.
- W1863485266 cites W2083283377 @default.
- W1863485266 cites W2085194340 @default.
- W1863485266 cites W2087063454 @default.
- W1863485266 cites W2107674817 @default.
- W1863485266 cites W2108286682 @default.
- W1863485266 cites W2113501460 @default.
- W1863485266 cites W2132468772 @default.
- W1863485266 cites W2137092694 @default.
- W1863485266 cites W2138131694 @default.
- W1863485266 cites W2148439597 @default.
- W1863485266 cites W2150841368 @default.
- W1863485266 cites W2160561608 @default.
- W1863485266 cites W2161130209 @default.
- W1863485266 cites W2171754584 @default.
- W1863485266 cites W2201570790 @default.
- W1863485266 cites W2767784613 @default.
- W1863485266 doi "https://doi.org/10.1016/j.neunet.2015.08.007" @default.
- W1863485266 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/26356598" @default.
- W1863485266 hasPublicationYear "2015" @default.
- W1863485266 type Work @default.
- W1863485266 sameAs 1863485266 @default.
- W1863485266 citedByCount "93" @default.
- W1863485266 countsByYear W18634852662015 @default.
- W1863485266 countsByYear W18634852662016 @default.
- W1863485266 countsByYear W18634852662017 @default.
- W1863485266 countsByYear W18634852662018 @default.
- W1863485266 countsByYear W18634852662019 @default.
- W1863485266 countsByYear W18634852662020 @default.
- W1863485266 countsByYear W18634852662021 @default.
- W1863485266 countsByYear W18634852662022 @default.
- W1863485266 countsByYear W18634852662023 @default.
- W1863485266 crossrefType "journal-article" @default.
- W1863485266 hasAuthorship W1863485266A5000266144 @default.
- W1863485266 hasAuthorship W1863485266A5004057129 @default.
- W1863485266 hasAuthorship W1863485266A5012004938 @default.
- W1863485266 hasAuthorship W1863485266A5074290686 @default.
- W1863485266 hasConcept C118615104 @default.
- W1863485266 hasConcept C126255220 @default.
- W1863485266 hasConcept C14036430 @default.
- W1863485266 hasConcept C14646407 @default.
- W1863485266 hasConcept C154945302 @default.
- W1863485266 hasConcept C162324750 @default.
- W1863485266 hasConcept C196978813 @default.
- W1863485266 hasConcept C2775924081 @default.
- W1863485266 hasConcept C2777303404 @default.
- W1863485266 hasConcept C2780069185 @default.
- W1863485266 hasConcept C33923547 @default.
- W1863485266 hasConcept C41008148 @default.
- W1863485266 hasConcept C50522688 @default.
- W1863485266 hasConcept C50644808 @default.
- W1863485266 hasConcept C78458016 @default.
- W1863485266 hasConcept C86803240 @default.
- W1863485266 hasConcept C91575142 @default.
- W1863485266 hasConcept C91873725 @default.
- W1863485266 hasConcept C97541855 @default.
- W1863485266 hasConceptScore W1863485266C118615104 @default.
- W1863485266 hasConceptScore W1863485266C126255220 @default.
- W1863485266 hasConceptScore W1863485266C14036430 @default.
- W1863485266 hasConceptScore W1863485266C14646407 @default.
- W1863485266 hasConceptScore W1863485266C154945302 @default.
- W1863485266 hasConceptScore W1863485266C162324750 @default.
- W1863485266 hasConceptScore W1863485266C196978813 @default.
- W1863485266 hasConceptScore W1863485266C2775924081 @default.
- W1863485266 hasConceptScore W1863485266C2777303404 @default.
- W1863485266 hasConceptScore W1863485266C2780069185 @default.
- W1863485266 hasConceptScore W1863485266C33923547 @default.
- W1863485266 hasConceptScore W1863485266C41008148 @default.
- W1863485266 hasConceptScore W1863485266C50522688 @default.
- W1863485266 hasConceptScore W1863485266C50644808 @default.
- W1863485266 hasConceptScore W1863485266C78458016 @default.
- W1863485266 hasConceptScore W1863485266C86803240 @default.