Matches in SemOpenAlex for { <https://semopenalex.org/work/W2807908072> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W2807908072 abstract "Deep Deterministic Policy Gradient (DDPG) algorithm has been successful for state-of-the-art performance in high-dimensional continuous control tasks. However, due to the complexity and randomness of the environment, DDPG tends to suffer from inefficient exploration and unstable training. In this work, we propose Self-Adaptive Double Bootstrapped DDPG (SOUP), an algorithm that extends DDPG to bootstrapped actor-critic architecture. SOUP improves the efficiency of exploration by multiple actor heads capturing more potential actions and multiple critic heads evaluating more reasonable Q-values collaboratively. The crux of double bootstrapped architecture is to tackle the fluctuations in performance, caused by multiple heads of spotty capacity varying throughout training. To alleviate the instability, a self-adaptive confidence mechanism is introduced to dynamically adjust the weights of bootstrapped heads and enhance the ensemble performance effectively and efficiently. We demonstrate that SOUP achieves faster learning by at least 45% while improving cumulative reward and stability substantially in comparison to vanilla DDPG on OpenAI Gym's MuJoCo environments." @default.
- W2807908072 created "2018-06-21" @default.
- W2807908072 creator A5008769328 @default.
- W2807908072 creator A5021760829 @default.
- W2807908072 creator A5049760745 @default.
- W2807908072 creator A5054238848 @default.
- W2807908072 creator A5061771496 @default.
- W2807908072 date "2018-07-01" @default.
- W2807908072 modified "2023-09-27" @default.
- W2807908072 title "Self-Adaptive Double Bootstrapped DDPG" @default.
- W2807908072 cites W2145339207 @default.
- W2807908072 cites W2614839826 @default.
- W2807908072 cites W2739678353 @default.
- W2807908072 cites W2774354230 @default.
- W2807908072 cites W2950872548 @default.
- W2807908072 cites W2963938771 @default.
- W2807908072 cites W2964043796 @default.
- W2807908072 cites W2964121744 @default.
- W2807908072 cites W2964161785 @default.
- W2807908072 doi "https://doi.org/10.24963/ijcai.2018/444" @default.
- W2807908072 hasPublicationYear "2018" @default.
- W2807908072 type Work @default.
- W2807908072 sameAs 2807908072 @default.
- W2807908072 citedByCount "13" @default.
- W2807908072 countsByYear W28079080722019 @default.
- W2807908072 countsByYear W28079080722020 @default.
- W2807908072 countsByYear W28079080722021 @default.
- W2807908072 countsByYear W28079080722022 @default.
- W2807908072 countsByYear W28079080722023 @default.
- W2807908072 crossrefType "proceedings-article" @default.
- W2807908072 hasAuthorship W2807908072A5008769328 @default.
- W2807908072 hasAuthorship W2807908072A5021760829 @default.
- W2807908072 hasAuthorship W2807908072A5049760745 @default.
- W2807908072 hasAuthorship W2807908072A5054238848 @default.
- W2807908072 hasAuthorship W2807908072A5061771496 @default.
- W2807908072 hasBestOaLocation W28079080721 @default.
- W2807908072 hasConcept C105795698 @default.
- W2807908072 hasConcept C112972136 @default.
- W2807908072 hasConcept C119857082 @default.
- W2807908072 hasConcept C123657996 @default.
- W2807908072 hasConcept C125112378 @default.
- W2807908072 hasConcept C142362112 @default.
- W2807908072 hasConcept C153349607 @default.
- W2807908072 hasConcept C154945302 @default.
- W2807908072 hasConcept C33923547 @default.
- W2807908072 hasConcept C41008148 @default.
- W2807908072 hasConcept C97541855 @default.
- W2807908072 hasConceptScore W2807908072C105795698 @default.
- W2807908072 hasConceptScore W2807908072C112972136 @default.
- W2807908072 hasConceptScore W2807908072C119857082 @default.
- W2807908072 hasConceptScore W2807908072C123657996 @default.
- W2807908072 hasConceptScore W2807908072C125112378 @default.
- W2807908072 hasConceptScore W2807908072C142362112 @default.
- W2807908072 hasConceptScore W2807908072C153349607 @default.
- W2807908072 hasConceptScore W2807908072C154945302 @default.
- W2807908072 hasConceptScore W2807908072C33923547 @default.
- W2807908072 hasConceptScore W2807908072C41008148 @default.
- W2807908072 hasConceptScore W2807908072C97541855 @default.
- W2807908072 hasLocation W28079080721 @default.
- W2807908072 hasOpenAccess W2807908072 @default.
- W2807908072 hasPrimaryLocation W28079080721 @default.
- W2807908072 hasRelatedWork W260766989 @default.
- W2807908072 hasRelatedWork W2765302304 @default.
- W2807908072 hasRelatedWork W2959276766 @default.
- W2807908072 hasRelatedWork W2963757175 @default.
- W2807908072 hasRelatedWork W3074294383 @default.
- W2807908072 hasRelatedWork W3111983280 @default.
- W2807908072 hasRelatedWork W3139193008 @default.
- W2807908072 hasRelatedWork W3164468573 @default.
- W2807908072 hasRelatedWork W4206669594 @default.
- W2807908072 hasRelatedWork W4295941380 @default.
- W2807908072 isParatext "false" @default.
- W2807908072 isRetracted "false" @default.
- W2807908072 magId "2807908072" @default.
- W2807908072 workType "article" @default.
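
The listing above is the result of matching the pattern `<https://semopenalex.org/work/W2807908072> ?p ?o` against SemOpenAlex. As a minimal sketch, a similar lookup could be phrased as a SPARQL `SELECT` query and sent to a SPARQL endpoint. The endpoint URL `https://semopenalex.org/sparql` and the helper names `build_work_query` and `build_request_url` are assumptions made for illustration, not taken from the listing, and the graph variable `?g` is dropped for simplicity.

```python
from urllib.parse import urlencode

# Assumed endpoint URL; check the SemOpenAlex documentation before relying on it.
SPARQL_ENDPOINT = "https://semopenalex.org/sparql"


def build_work_query(work_id: str, limit: int = 100) -> str:
    """Build a SPARQL query listing every predicate/object pair of one work.

    `work_id` is the identifier shown in the listing, e.g. "W2807908072".
    `limit` mirrors the "100 items per page" of the listing.
    """
    iri = f"https://semopenalex.org/work/{work_id}"
    return f"SELECT ?p ?o WHERE {{ <{iri}> ?p ?o . }} LIMIT {limit}"


def build_request_url(work_id: str, limit: int = 100) -> str:
    """Return a GET URL carrying the query in the standard `query` parameter."""
    params = urlencode({"query": build_work_query(work_id, limit)})
    return f"{SPARQL_ENDPOINT}?{params}"


if __name__ == "__main__":
    print(build_work_query("W2807908072"))
    print(build_request_url("W2807908072"))
```

The sketch only builds the query string and request URL; actually sending the request (for example with `urllib.request.urlopen` and an `Accept: application/sparql-results+json` header) is left out so that the example runs without network access.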