Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226145367> ?p ?o ?g. }
- W4226145367 endingPage "2033" @default.
- W4226145367 startingPage "2018" @default.
- W4226145367 abstract "Maximum entropy reinforcement learning methods have been successfully applied to a range of challenging sequential decision-making and control tasks. However, most of the existing techniques are designed for discrete-time systems, although there has been growing interest in handling physical processes evolving in continuous time. As a first step toward their extension to continuous-time systems, this article aims to study the theory of maximum entropy optimal control in continuous time. Applying the dynamic programming principle, we derive a novel class of Hamilton–Jacobi–Bellman (HJB) equations and prove that the optimal value function of the maximum entropy control problem corresponds to the unique viscosity solution of the HJB equation. We further show that the optimal control is uniquely characterized as Gaussian in the case of control-affine systems and that, for linear-quadratic problems, the HJB equation reduces to a Riccati equation, which can be used to obtain an explicit expression of the optimal control. The results of our numerical experiments demonstrate the performance of our maximum entropy method in continuous-time optimal control and reinforcement learning problems." @default.
- W4226145367 created "2022-05-05" @default.
- W4226145367 creator A5049129395 @default.
- W4226145367 creator A5063507594 @default.
- W4226145367 date "2023-04-01" @default.
- W4226145367 modified "2023-10-15" @default.
- W4226145367 title "Maximum Entropy Optimal Control of Continuous-Time Dynamical Systems" @default.
- W4226145367 cites W1487586009 @default.
- W4226145367 cites W153353184 @default.
- W4226145367 cites W1966514629 @default.
- W4226145367 cites W1967575591 @default.
- W4226145367 cites W1967663836 @default.
- W4226145367 cites W1969853731 @default.
- W4226145367 cites W1982931088 @default.
- W4226145367 cites W2008674670 @default.
- W4226145367 cites W2027438381 @default.
- W4226145367 cites W2043962331 @default.
- W4226145367 cites W2061474199 @default.
- W4226145367 cites W2063204493 @default.
- W4226145367 cites W2074451231 @default.
- W4226145367 cites W2080823861 @default.
- W4226145367 cites W2093524643 @default.
- W4226145367 cites W2155772159 @default.
- W4226145367 cites W2159998880 @default.
- W4226145367 cites W2484646121 @default.
- W4226145367 cites W2605995510 @default.
- W4226145367 cites W2767915842 @default.
- W4226145367 cites W2951021288 @default.
- W4226145367 cites W2963056268 @default.
- W4226145367 cites W2963395620 @default.
- W4226145367 cites W2963946658 @default.
- W4226145367 cites W4210869902 @default.
- W4226145367 cites W4226300562 @default.
- W4226145367 cites W4239369248 @default.
- W4226145367 cites W4241127195 @default.
- W4226145367 cites W4242695360 @default.
- W4226145367 cites W4252400669 @default.
- W4226145367 cites W4301886962 @default.
- W4226145367 doi "https://doi.org/10.1109/tac.2022.3168168" @default.
- W4226145367 hasPublicationYear "2023" @default.
- W4226145367 type Work @default.
- W4226145367 citedByCount "2" @default.
- W4226145367 countsByYear W42261453672022 @default.
- W4226145367 countsByYear W42261453672023 @default.
- W4226145367 crossrefType "journal-article" @default.
- W4226145367 hasAuthorship W4226145367A5049129395 @default.
- W4226145367 hasAuthorship W4226145367A5063507594 @default.
- W4226145367 hasConcept C105795698 @default.
- W4226145367 hasConcept C106301342 @default.
- W4226145367 hasConcept C121332964 @default.
- W4226145367 hasConcept C126255220 @default.
- W4226145367 hasConcept C134306372 @default.
- W4226145367 hasConcept C13847129 @default.
- W4226145367 hasConcept C14646407 @default.
- W4226145367 hasConcept C154945302 @default.
- W4226145367 hasConcept C170131372 @default.
- W4226145367 hasConcept C196978813 @default.
- W4226145367 hasConcept C204495892 @default.
- W4226145367 hasConcept C2775924081 @default.
- W4226145367 hasConcept C28826006 @default.
- W4226145367 hasConcept C33923547 @default.
- W4226145367 hasConcept C37404715 @default.
- W4226145367 hasConcept C41008148 @default.
- W4226145367 hasConcept C44415725 @default.
- W4226145367 hasConcept C45473103 @default.
- W4226145367 hasConcept C47446073 @default.
- W4226145367 hasConcept C62520636 @default.
- W4226145367 hasConcept C78045399 @default.
- W4226145367 hasConcept C91575142 @default.
- W4226145367 hasConcept C9679016 @default.
- W4226145367 hasConcept C97541855 @default.
- W4226145367 hasConcept C98779006 @default.
- W4226145367 hasConceptScore W4226145367C105795698 @default.
- W4226145367 hasConceptScore W4226145367C106301342 @default.
- W4226145367 hasConceptScore W4226145367C121332964 @default.
- W4226145367 hasConceptScore W4226145367C126255220 @default.
- W4226145367 hasConceptScore W4226145367C134306372 @default.
- W4226145367 hasConceptScore W4226145367C13847129 @default.
- W4226145367 hasConceptScore W4226145367C14646407 @default.
- W4226145367 hasConceptScore W4226145367C154945302 @default.
- W4226145367 hasConceptScore W4226145367C170131372 @default.
- W4226145367 hasConceptScore W4226145367C196978813 @default.
- W4226145367 hasConceptScore W4226145367C204495892 @default.
- W4226145367 hasConceptScore W4226145367C2775924081 @default.
- W4226145367 hasConceptScore W4226145367C28826006 @default.
- W4226145367 hasConceptScore W4226145367C33923547 @default.
- W4226145367 hasConceptScore W4226145367C37404715 @default.
- W4226145367 hasConceptScore W4226145367C41008148 @default.
- W4226145367 hasConceptScore W4226145367C44415725 @default.
- W4226145367 hasConceptScore W4226145367C45473103 @default.
- W4226145367 hasConceptScore W4226145367C47446073 @default.
- W4226145367 hasConceptScore W4226145367C62520636 @default.
- W4226145367 hasConceptScore W4226145367C78045399 @default.
- W4226145367 hasConceptScore W4226145367C91575142 @default.
- W4226145367 hasConceptScore W4226145367C9679016 @default.
- W4226145367 hasConceptScore W4226145367C97541855 @default.
- W4226145367 hasConceptScore W4226145367C98779006 @default.
- W4226145367 hasFunder F4320322120 @default.