Matches in SemOpenAlex for { <https://semopenalex.org/work/W2783396564> ?p ?o ?g. }
- W2783396564 endingPage "187" @default.
- W2783396564 startingPage "178" @default.
- W2783396564 abstract "This paper addresses the problem of deriving a policy from the value function in the context of critic-only reinforcement learning (RL) in continuous state and action spaces. With continuous-valued states, RL algorithms have to rely on a numerical approximator to represent the value function. Numerical approximation due to its nature virtually always exhibits artifacts which damage the overall performance of the controlled system. In addition, when continuous-valued action is used, the most common approach is to discretize the action space and exhaustively search for the action that maximizes the right-hand side of the Bellman equation. Such a policy derivation procedure is computationally involved and results in steady-state error due to the lack of continuity. In this work, we propose policy derivation methods which alleviate the above problems by means of action space refinement, continuous approximation, and post-processing of the V-function by using symbolic regression. The proposed methods are tested on nonlinear control problems: 1-DOF and 2-DOF pendulum swing-up problems, and on magnetic manipulation. The results show significantly improved performance in terms of cumulative return and computational complexity." @default.
- W2783396564 created "2018-01-26" @default.
- W2783396564 creator A5004892921 @default.
- W2783396564 creator A5017789700 @default.
- W2783396564 creator A5084264842 @default.
- W2783396564 date "2018-03-01" @default.
- W2783396564 modified "2023-09-27" @default.
- W2783396564 title "Policy derivation methods for critic-only reinforcement learning in continuous spaces" @default.
- W2783396564 cites W1988505764 @default.
- W2783396564 cites W2005660297 @default.
- W2783396564 cites W2006141568 @default.
- W2783396564 cites W2025485823 @default.
- W2783396564 cites W2080588896 @default.
- W2783396564 cites W2101068209 @default.
- W2783396564 cites W2120968583 @default.
- W2783396564 cites W2139055047 @default.
- W2783396564 cites W2146199577 @default.
- W2783396564 cites W2152161277 @default.
- W2783396564 cites W2154549708 @default.
- W2783396564 cites W2483158547 @default.
- W2783396564 cites W2580909119 @default.
- W2783396564 cites W2760223394 @default.
- W2783396564 doi "https://doi.org/10.1016/j.engappai.2017.12.004" @default.
- W2783396564 hasPublicationYear "2018" @default.
- W2783396564 type Work @default.
- W2783396564 sameAs 2783396564 @default.
- W2783396564 citedByCount "14" @default.
- W2783396564 countsByYear W27833965642018 @default.
- W2783396564 countsByYear W27833965642019 @default.
- W2783396564 countsByYear W27833965642020 @default.
- W2783396564 countsByYear W27833965642021 @default.
- W2783396564 countsByYear W27833965642022 @default.
- W2783396564 crossrefType "journal-article" @default.
- W2783396564 hasAuthorship W2783396564A5004892921 @default.
- W2783396564 hasAuthorship W2783396564A5017789700 @default.
- W2783396564 hasAuthorship W2783396564A5084264842 @default.
- W2783396564 hasBestOaLocation W27833965642 @default.
- W2783396564 hasConcept C105795698 @default.
- W2783396564 hasConcept C121332964 @default.
- W2783396564 hasConcept C126255220 @default.
- W2783396564 hasConcept C134306372 @default.
- W2783396564 hasConcept C14036430 @default.
- W2783396564 hasConcept C14646407 @default.
- W2783396564 hasConcept C151730666 @default.
- W2783396564 hasConcept C154945302 @default.
- W2783396564 hasConcept C158622935 @default.
- W2783396564 hasConcept C188116033 @default.
- W2783396564 hasConcept C192921069 @default.
- W2783396564 hasConcept C2779343474 @default.
- W2783396564 hasConcept C2780791683 @default.
- W2783396564 hasConcept C28826006 @default.
- W2783396564 hasConcept C33923547 @default.
- W2783396564 hasConcept C41008148 @default.
- W2783396564 hasConcept C50644808 @default.
- W2783396564 hasConcept C62520636 @default.
- W2783396564 hasConcept C72434380 @default.
- W2783396564 hasConcept C73000952 @default.
- W2783396564 hasConcept C78458016 @default.
- W2783396564 hasConcept C86803240 @default.
- W2783396564 hasConcept C91575142 @default.
- W2783396564 hasConcept C91873725 @default.
- W2783396564 hasConcept C97541855 @default.
- W2783396564 hasConceptScore W2783396564C105795698 @default.
- W2783396564 hasConceptScore W2783396564C121332964 @default.
- W2783396564 hasConceptScore W2783396564C126255220 @default.
- W2783396564 hasConceptScore W2783396564C134306372 @default.
- W2783396564 hasConceptScore W2783396564C14036430 @default.
- W2783396564 hasConceptScore W2783396564C14646407 @default.
- W2783396564 hasConceptScore W2783396564C151730666 @default.
- W2783396564 hasConceptScore W2783396564C154945302 @default.
- W2783396564 hasConceptScore W2783396564C158622935 @default.
- W2783396564 hasConceptScore W2783396564C188116033 @default.
- W2783396564 hasConceptScore W2783396564C192921069 @default.
- W2783396564 hasConceptScore W2783396564C2779343474 @default.
- W2783396564 hasConceptScore W2783396564C2780791683 @default.
- W2783396564 hasConceptScore W2783396564C28826006 @default.
- W2783396564 hasConceptScore W2783396564C33923547 @default.
- W2783396564 hasConceptScore W2783396564C41008148 @default.
- W2783396564 hasConceptScore W2783396564C50644808 @default.
- W2783396564 hasConceptScore W2783396564C62520636 @default.
- W2783396564 hasConceptScore W2783396564C72434380 @default.
- W2783396564 hasConceptScore W2783396564C73000952 @default.
- W2783396564 hasConceptScore W2783396564C78458016 @default.
- W2783396564 hasConceptScore W2783396564C86803240 @default.
- W2783396564 hasConceptScore W2783396564C91575142 @default.
- W2783396564 hasConceptScore W2783396564C91873725 @default.
- W2783396564 hasConceptScore W2783396564C97541855 @default.
- W2783396564 hasFunder F4320321006 @default.
- W2783396564 hasLocation W27833965641 @default.
- W2783396564 hasLocation W27833965642 @default.
- W2783396564 hasOpenAccess W2783396564 @default.
- W2783396564 hasPrimaryLocation W27833965641 @default.
- W2783396564 hasRelatedWork W1592209052 @default.
- W2783396564 hasRelatedWork W1600352493 @default.
- W2783396564 hasRelatedWork W2120968583 @default.
- W2783396564 hasRelatedWork W2169029225 @default.
- W2783396564 hasRelatedWork W2183243664 @default.
- W2783396564 hasRelatedWork W2359199643 @default.
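
The abstract above describes the baseline policy derivation that the paper sets out to improve: discretize the action space and exhaustively search for the action that maximizes the right-hand side of the Bellman equation. The following is a minimal sketch of that baseline only, not the authors' implementation. The toy dynamics `step`, the `reward`, the stand-in V-function `value`, the discount `gamma`, and the action grid are all hypothetical and chosen purely for illustration.

```python
def greedy_action(state, actions, step, reward, value, gamma=0.95):
    """Return the discrete action maximizing r(x, u, x') + gamma * V(x')."""
    best_action, best_score = None, float("-inf")
    for u in actions:  # exhaustive search over the discretized action set
        next_state = step(state, u)
        score = reward(state, u, next_state) + gamma * value(next_state)
        if score > best_score:
            best_action, best_score = u, score
    return best_action


# Hypothetical toy problem (assumption): drive a scalar state toward zero.
def step(x, u):
    return x + 0.1 * u


def reward(x, u, x_next):
    return -x_next ** 2


def value(x):
    # Stand-in for a numerically approximated V-function (assumption).
    return -abs(x)


actions = [-1.0, -0.5, 0.0, 0.5, 1.0]  # coarse discretization of the action space
far = greedy_action(0.3, actions, step, reward, value)
near = greedy_action(0.04, actions, step, reward, value)
```

In this toy setup, the action that would bring the state from 0.04 exactly to zero is -0.4, which is not on the grid, so the search settles for a neighbouring grid value and overshoots. This is one simple way to picture the discretization effect that the abstract links to steady-state error.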