Matches in SemOpenAlex for { <https://semopenalex.org/work/W3090993757> ?p ?o ?g. }
- W3090993757 abstract "In this work, we propose a novel cross Q-learning algorithm aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods, particularly in deep Q-networks, where the overestimation is exaggerated by function approximation errors. Our algorithm builds on double Q-learning by maintaining a set of parallel models and estimating the Q-value based on a randomly selected network, which reduces both the overestimation bias and the variance. We provide empirical evidence of the advantages of our method by evaluating it on benchmark environments. The experimental results demonstrate significant improvement in reducing the overestimation bias and stabilizing training, which in turn leads to better derived policies." @default.
- W3090993757 created "2020-10-08" @default.
- W3090993757 creator A5006795236 @default.
- W3090993757 creator A5051644960 @default.
- W3090993757 date "2020-09-29" @default.
- W3090993757 modified "2023-09-27" @default.
- W3090993757 title "Cross Learning in Deep Q-Networks." @default.
- W3090993757 cites W1522301498 @default.
- W3090993757 cites W1784384189 @default.
- W3090993757 cites W2091565802 @default.
- W3090993757 cites W2093253120 @default.
- W3090993757 cites W2100677568 @default.
- W3090993757 cites W2107438106 @default.
- W3090993757 cites W2115211925 @default.
- W3090993757 cites W2135482703 @default.
- W3090993757 cites W2141559645 @default.
- W3090993757 cites W2145339207 @default.
- W3090993757 cites W2150339816 @default.
- W3090993757 cites W2155968351 @default.
- W3090993757 cites W2173248099 @default.
- W3090993757 cites W2334782222 @default.
- W3090993757 cites W2469051754 @default.
- W3090993757 cites W2596758708 @default.
- W3090993757 cites W2787938642 @default.
- W3090993757 cites W2795729123 @default.
- W3090993757 cites W2884265051 @default.
- W3090993757 cites W2951799221 @default.
- W3090993757 cites W2963156201 @default.
- W3090993757 cites W2963938771 @default.
- W3090993757 cites W2964043796 @default.
- W3090993757 cites W2983277893 @default.
- W3090993757 hasPublicationYear "2020" @default.
- W3090993757 type Work @default.
- W3090993757 sameAs 3090993757 @default.
- W3090993757 citedByCount "0" @default.
- W3090993757 crossrefType "posted-content" @default.
- W3090993757 hasAuthorship W3090993757A5006795236 @default.
- W3090993757 hasAuthorship W3090993757A5051644960 @default.
- W3090993757 hasConcept C108583219 @default.
- W3090993757 hasConcept C11413529 @default.
- W3090993757 hasConcept C119857082 @default.
- W3090993757 hasConcept C121955636 @default.
- W3090993757 hasConcept C13280743 @default.
- W3090993757 hasConcept C14036430 @default.
- W3090993757 hasConcept C154945302 @default.
- W3090993757 hasConcept C162324750 @default.
- W3090993757 hasConcept C177264268 @default.
- W3090993757 hasConcept C185798385 @default.
- W3090993757 hasConcept C188116033 @default.
- W3090993757 hasConcept C196083921 @default.
- W3090993757 hasConcept C199360897 @default.
- W3090993757 hasConcept C205649164 @default.
- W3090993757 hasConcept C2776291640 @default.
- W3090993757 hasConcept C41008148 @default.
- W3090993757 hasConcept C78458016 @default.
- W3090993757 hasConcept C86803240 @default.
- W3090993757 hasConcept C97541855 @default.
- W3090993757 hasConceptScore W3090993757C108583219 @default.
- W3090993757 hasConceptScore W3090993757C11413529 @default.
- W3090993757 hasConceptScore W3090993757C119857082 @default.
- W3090993757 hasConceptScore W3090993757C121955636 @default.
- W3090993757 hasConceptScore W3090993757C13280743 @default.
- W3090993757 hasConceptScore W3090993757C14036430 @default.
- W3090993757 hasConceptScore W3090993757C154945302 @default.
- W3090993757 hasConceptScore W3090993757C162324750 @default.
- W3090993757 hasConceptScore W3090993757C177264268 @default.
- W3090993757 hasConceptScore W3090993757C185798385 @default.
- W3090993757 hasConceptScore W3090993757C188116033 @default.
- W3090993757 hasConceptScore W3090993757C196083921 @default.
- W3090993757 hasConceptScore W3090993757C199360897 @default.
- W3090993757 hasConceptScore W3090993757C205649164 @default.
- W3090993757 hasConceptScore W3090993757C2776291640 @default.
- W3090993757 hasConceptScore W3090993757C41008148 @default.
- W3090993757 hasConceptScore W3090993757C78458016 @default.
- W3090993757 hasConceptScore W3090993757C86803240 @default.
- W3090993757 hasConceptScore W3090993757C97541855 @default.
- W3090993757 hasLocation W30909937571 @default.
- W3090993757 hasOpenAccess W3090993757 @default.
- W3090993757 hasPrimaryLocation W30909937571 @default.
- W3090993757 hasRelatedWork W2115211925 @default.
- W3090993757 hasRelatedWork W2773456610 @default.
- W3090993757 hasRelatedWork W2807557496 @default.
- W3090993757 hasRelatedWork W2901269509 @default.
- W3090993757 hasRelatedWork W2950726307 @default.
- W3090993757 hasRelatedWork W2952523895 @default.
- W3090993757 hasRelatedWork W2995509794 @default.
- W3090993757 hasRelatedWork W3000642679 @default.
- W3090993757 hasRelatedWork W3012857547 @default.
- W3090993757 hasRelatedWork W3036162308 @default.
- W3090993757 hasRelatedWork W3037812429 @default.
- W3090993757 hasRelatedWork W3037989090 @default.
- W3090993757 hasRelatedWork W3098793220 @default.
- W3090993757 hasRelatedWork W3130972561 @default.
- W3090993757 hasRelatedWork W3135838158 @default.
- W3090993757 hasRelatedWork W3138397191 @default.
- W3090993757 hasRelatedWork W3174772619 @default.
- W3090993757 hasRelatedWork W3199711351 @default.
- W3090993757 hasRelatedWork W3203189308 @default.
- W3090993757 hasRelatedWork W3213562542 @default.
- W3090993757 isParatext "false" @default.
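
The abstract above describes the method only in outline: a set of parallel models, with the Q-value estimated from a randomly selected one. A minimal tabular sketch of one possible reading follows. It is not the authors' implementation. The function names (`cross_q_target`, `cross_q_update`), the tabular setting, and the choice to select the action with the table being updated and evaluate it with a randomly chosen other table (by analogy with double Q-learning) are all assumptions made for illustration.

```python
import random


def cross_q_target(q_tables, i, reward, next_state, gamma, done, rng=random):
    """TD target for table i (hypothetical helper, not from the paper).

    q_tables[k][s][a] is the k-th model's estimate of Q(s, a).
    """
    if done:
        return reward
    # Assumption: the greedy action comes from the table being updated,
    # as in double Q-learning.
    q_i = q_tables[i][next_state]
    best_action = max(range(len(q_i)), key=q_i.__getitem__)
    # Assumption: that action is evaluated by a randomly selected *other* table.
    others = [k for k in range(len(q_tables)) if k != i]
    j = rng.choice(others)
    return reward + gamma * q_tables[j][next_state][best_action]


def cross_q_update(q_tables, state, action, reward, next_state, done,
                   alpha=0.1, gamma=0.99, rng=random):
    """Update one randomly chosen table in place and return its index."""
    i = rng.randrange(len(q_tables))
    target = cross_q_target(q_tables, i, reward, next_state, gamma, done, rng)
    q_tables[i][state][action] += alpha * (target - q_tables[i][state][action])
    return i
```

With exactly two tables this reduces to the usual double Q-learning update; with more tables the evaluating table varies from step to step, which is the property the abstract credits with reducing overestimation bias and variance.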