Matches in SemOpenAlex for { <https://semopenalex.org/work/W2951799422> ?p ?o ?g. }
- W2951799422 abstract "Deep Reinforcement Learning (DRL) algorithms have been successfully applied to a range of challenging control tasks. However, these methods typically suffer from three core difficulties: temporal credit assignment with sparse rewards, lack of effective exploration, and brittle convergence properties that are extremely sensitive to hyperparameters. Collectively, these challenges severely limit the applicability of these approaches to real-world problems. Evolutionary Algorithms (EAs), a class of black box optimization techniques inspired by natural evolution, are well suited to address each of these three challenges. However, EAs typically suffer from high sample complexity and struggle to solve problems that require optimization of a large number of parameters. In this paper, we introduce Evolutionary Reinforcement Learning (ERL), a hybrid algorithm that leverages the population of an EA to provide diversified data to train an RL agent, and reinserts the RL agent into the EA population periodically to inject gradient information into the EA. ERL inherits EA's ability of temporal credit assignment with a fitness metric, effective exploration with a diverse set of policies, and stability of a population-based approach and complements it with off-policy DRL's ability to leverage gradients for higher sample efficiency and faster learning. Experiments in a range of challenging continuous control benchmarks demonstrate that ERL significantly outperforms prior DRL and EA methods." @default.
- W2951799422 created "2019-06-27" @default.
- W2951799422 creator A5008832526 @default.
- W2951799422 creator A5084748531 @default.
- W2951799422 date "2018-05-21" @default.
- W2951799422 modified "2023-09-27" @default.
- W2951799422 title "Evolution-Guided Policy Gradient in Reinforcement Learning" @default.
- W2951799422 cites W1191599655 @default.
- W2951799422 cites W1522301498 @default.
- W2951799422 cites W1674110665 @default.
- W2951799422 cites W1738827650 @default.
- W2951799422 cites W1966456942 @default.
- W2951799422 cites W1978970913 @default.
- W2951799422 cites W2017957151 @default.
- W2951799422 cites W2111935653 @default.
- W2951799422 cites W2115967884 @default.
- W2951799422 cites W2116339921 @default.
- W2951799422 cites W2126811106 @default.
- W2951799422 cites W2134042548 @default.
- W2951799422 cites W2145339207 @default.
- W2951799422 cites W2154022540 @default.
- W2951799422 cites W2158782408 @default.
- W2951799422 cites W2171658832 @default.
- W2951799422 cites W2173248099 @default.
- W2951799422 cites W2176412452 @default.
- W2951799422 cites W2257979135 @default.
- W2951799422 cites W2342662072 @default.
- W2951799422 cites W2414371988 @default.
- W2951799422 cites W2417786368 @default.
- W2951799422 cites W2419612459 @default.
- W2951799422 cites W2462548332 @default.
- W2951799422 cites W2556958149 @default.
- W2951799422 cites W2561776174 @default.
- W2951799422 cites W2583761661 @default.
- W2951799422 cites W2592901955 @default.
- W2951799422 cites W2593237273 @default.
- W2951799422 cites W2596367596 @default.
- W2951799422 cites W2596982695 @default.
- W2951799422 cites W2598247389 @default.
- W2951799422 cites W2614839826 @default.
- W2951799422 cites W2620671107 @default.
- W2951799422 cites W2623491082 @default.
- W2951799422 cites W2724169821 @default.
- W2951799422 cites W2733961795 @default.
- W2951799422 cites W2736601468 @default.
- W2951799422 cites W2747402019 @default.
- W2951799422 cites W2754517384 @default.
- W2951799422 cites W2767002384 @default.
- W2951799422 cites W2778749116 @default.
- W2951799422 cites W2781726626 @default.
- W2951799422 cites W2784464072 @default.
- W2951799422 cites W2785542505 @default.
- W2951799422 cites W2786036274 @default.
- W2951799422 cites W2794711922 @default.
- W2951799422 cites W2899771611 @default.
- W2951799422 cites W2949608212 @default.
- W2951799422 cites W2950854694 @default.
- W2951799422 cites W2963790038 @default.
- W2951799422 cites W2964043796 @default.
- W2951799422 cites W2998349125 @default.
- W2951799422 cites W2770298516 @default.
- W2951799422 hasPublicationYear "2018" @default.
- W2951799422 type Work @default.
- W2951799422 sameAs 2951799422 @default.
- W2951799422 citedByCount "17" @default.
- W2951799422 countsByYear W29517994222018 @default.
- W2951799422 countsByYear W29517994222019 @default.
- W2951799422 countsByYear W29517994222020 @default.
- W2951799422 countsByYear W29517994222021 @default.
- W2951799422 crossrefType "posted-content" @default.
- W2951799422 hasAuthorship W2951799422A5008832526 @default.
- W2951799422 hasAuthorship W2951799422A5084748531 @default.
- W2951799422 hasConcept C119857082 @default.
- W2951799422 hasConcept C126255220 @default.
- W2951799422 hasConcept C127413603 @default.
- W2951799422 hasConcept C144024400 @default.
- W2951799422 hasConcept C146978453 @default.
- W2951799422 hasConcept C149923435 @default.
- W2951799422 hasConcept C153083717 @default.
- W2951799422 hasConcept C154945302 @default.
- W2951799422 hasConcept C159149176 @default.
- W2951799422 hasConcept C204323151 @default.
- W2951799422 hasConcept C2908647359 @default.
- W2951799422 hasConcept C33923547 @default.
- W2951799422 hasConcept C41008148 @default.
- W2951799422 hasConcept C8642999 @default.
- W2951799422 hasConcept C97541855 @default.
- W2951799422 hasConceptScore W2951799422C119857082 @default.
- W2951799422 hasConceptScore W2951799422C126255220 @default.
- W2951799422 hasConceptScore W2951799422C127413603 @default.
- W2951799422 hasConceptScore W2951799422C144024400 @default.
- W2951799422 hasConceptScore W2951799422C146978453 @default.
- W2951799422 hasConceptScore W2951799422C149923435 @default.
- W2951799422 hasConceptScore W2951799422C153083717 @default.
- W2951799422 hasConceptScore W2951799422C154945302 @default.
- W2951799422 hasConceptScore W2951799422C159149176 @default.
- W2951799422 hasConceptScore W2951799422C204323151 @default.
- W2951799422 hasConceptScore W2951799422C2908647359 @default.
- W2951799422 hasConceptScore W2951799422C33923547 @default.
- W2951799422 hasConceptScore W2951799422C41008148 @default.