Matches in SemOpenAlex for { <https://semopenalex.org/work/W4295277065> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4295277065 abstract "In this paper, we propose enhancing actor-critic reinforcement learning agents by parameterising the final actor layer which produces the actions in order to accommodate the behaviour discrepancy of different actuators, under different load conditions during interaction with the environment. We propose branching the action producing layer in the actor to learn the tuning parameter controlling the activation layer (e.g. Tanh and Sigmoid). The learned parameters are then used to create tailored activation functions for each actuator. We ran experiments on three OpenAI Gym environments, i.e. Pendulum-v0, LunarLanderContinuous-v2 and BipedalWalker-v2. Results have shown an average of 23.15% and 33.80% increase in total episode reward of the LunarLanderContinuous-v2 and BipedalWalker-v2 environments, respectively. There was no significant improvement in Pendulum-v0 environment but the proposed method produces a more stable actuation signal compared to the state-of-the-art method. The proposed method allows the reinforcement learning actor to produce more robust actions that accommodate the discrepancy in the actuators' response functions. This is particularly useful for real life scenarios where actuators exhibit different response functions depending on the load and the interaction with the environment. This also simplifies the transfer learning problem by fine tuning the parameterised activation layers instead of retraining the entire policy every time an actuator is replaced. Finally, the proposed method would allow better accommodation to biological actuators (e.g. muscles) in biomechanical systems." @default.
- W4295277065 created "2022-09-12" @default.
- W4295277065 creator A5018882728 @default.
- W4295277065 creator A5032697755 @default.
- W4295277065 creator A5080894673 @default.
- W4295277065 creator A5090193033 @default.
- W4295277065 date "2020-06-04" @default.
- W4295277065 modified "2023-09-26" @default.
- W4295277065 title "Refined Continuous Control of DDPG Actors via Parametrised Activation" @default.
- W4295277065 doi "https://doi.org/10.48550/arxiv.2006.02818" @default.
- W4295277065 hasPublicationYear "2020" @default.
- W4295277065 type Work @default.
- W4295277065 citedByCount "0" @default.
- W4295277065 crossrefType "posted-content" @default.
- W4295277065 hasAuthorship W4295277065A5018882728 @default.
- W4295277065 hasAuthorship W4295277065A5032697755 @default.
- W4295277065 hasAuthorship W4295277065A5080894673 @default.
- W4295277065 hasAuthorship W4295277065A5090193033 @default.
- W4295277065 hasBestOaLocation W42952770651 @default.
- W4295277065 hasConcept C121332964 @default.
- W4295277065 hasConcept C127413603 @default.
- W4295277065 hasConcept C133731056 @default.
- W4295277065 hasConcept C154945302 @default.
- W4295277065 hasConcept C158622935 @default.
- W4295277065 hasConcept C172707124 @default.
- W4295277065 hasConcept C192921069 @default.
- W4295277065 hasConcept C2775924081 @default.
- W4295277065 hasConcept C41008148 @default.
- W4295277065 hasConcept C47446073 @default.
- W4295277065 hasConcept C50644808 @default.
- W4295277065 hasConcept C62520636 @default.
- W4295277065 hasConcept C81388566 @default.
- W4295277065 hasConcept C97541855 @default.
- W4295277065 hasConceptScore W4295277065C121332964 @default.
- W4295277065 hasConceptScore W4295277065C127413603 @default.
- W4295277065 hasConceptScore W4295277065C133731056 @default.
- W4295277065 hasConceptScore W4295277065C154945302 @default.
- W4295277065 hasConceptScore W4295277065C158622935 @default.
- W4295277065 hasConceptScore W4295277065C172707124 @default.
- W4295277065 hasConceptScore W4295277065C192921069 @default.
- W4295277065 hasConceptScore W4295277065C2775924081 @default.
- W4295277065 hasConceptScore W4295277065C41008148 @default.
- W4295277065 hasConceptScore W4295277065C47446073 @default.
- W4295277065 hasConceptScore W4295277065C50644808 @default.
- W4295277065 hasConceptScore W4295277065C62520636 @default.
- W4295277065 hasConceptScore W4295277065C81388566 @default.
- W4295277065 hasConceptScore W4295277065C97541855 @default.
- W4295277065 hasLocation W42952770651 @default.
- W4295277065 hasOpenAccess W4295277065 @default.
- W4295277065 hasPrimaryLocation W42952770651 @default.
- W4295277065 hasRelatedWork W1583158799 @default.
- W4295277065 hasRelatedWork W2008072799 @default.
- W4295277065 hasRelatedWork W2056867023 @default.
- W4295277065 hasRelatedWork W2090301054 @default.
- W4295277065 hasRelatedWork W2158120962 @default.
- W4295277065 hasRelatedWork W2163605827 @default.
- W4295277065 hasRelatedWork W2333000837 @default.
- W4295277065 hasRelatedWork W2612501290 @default.
- W4295277065 hasRelatedWork W2617901630 @default.
- W4295277065 hasRelatedWork W3202123297 @default.
- W4295277065 isParatext "false" @default.
- W4295277065 isRetracted "false" @default.
- W4295277065 workType "article" @default.
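
The abstract in this record describes branching the actor's final layer so that one branch learns a tuning parameter for the activation function (e.g. Tanh) of each actuator. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the parameterisation is a per-actuator slope `k` applied as `tanh(k * z)`, and that `k` is kept positive with a softplus; the record does not state either detail. All function and variable names here are hypothetical.

```python
import math


def softplus(x):
    # Numerically stable softplus; keeps the learned slope strictly positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear(weights, bias, features):
    # Plain dense layer: one output per weight row.
    return [sum(w * f for w, f in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def parameterised_tanh_head(features, w_act, b_act, w_par, b_par):
    """Hypothetical branched action head.

    Branch 1 produces the pre-activation z for each actuator.
    Branch 2 produces a slope k for each actuator (assumed form).
    The action for actuator i is tanh(k_i * z_i), so each actuator
    gets its own tailored squashing function.
    """
    pre_activations = linear(w_act, b_act, features)
    slopes = [softplus(p) for p in linear(w_par, b_par, features)]
    actions = [math.tanh(k * z) for k, z in zip(slopes, pre_activations)]
    return actions, slopes


# Usage: two actuators driven by three shared features (made-up numbers).
features = [0.5, -1.0, 2.0]
w_act = [[0.2, 0.1, 0.3], [-0.4, 0.5, 0.1]]
b_act = [0.0, 0.1]
# Zero weights in the parameter branch make the slope depend on the bias only.
w_par = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
b_par = [math.log(math.e - 1.0), 2.0]  # first slope is 1, second is steeper

actions, slopes = parameterised_tanh_head(features, w_act, b_act, w_par, b_par)
```

With a slope of 1 the head reduces to an ordinary Tanh output, which is why the first actuator in the usage example behaves like a standard DDPG actor output. Under this assumed form, fine-tuning only the parameter branch (`w_par`, `b_par`) would correspond to the abstract's point about adapting to a replaced actuator without retraining the whole policy.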