Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226156728> ?p ?o ?g. }
Showing items 1 to 73 of 73 with 100 items per page.
- W4226156728 abstract "Deep reinforcement learning algorithms often use two networks for value function optimization: an online network, and a target network that tracks the online network with some delay. Using two separate networks enables the agent to hedge against issues that arise when performing bootstrapping. In this paper we endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network. This improves the robustness of deep reinforcement learning in presence of noisy updates. The resultant agents, called DQN Pro and Rainbow Pro, exhibit significant performance improvements over their original counterparts on the Atari benchmark demonstrating the effectiveness of this simple idea in deep reinforcement learning. The code for our paper is available here: Github.com/amazon-research/fast-rl-with-slow-updates." @default.
- W4226156728 created "2022-05-05" @default.
- W4226156728 creator A5000245150 @default.
- W4226156728 creator A5009722403 @default.
- W4226156728 creator A5010753342 @default.
- W4226156728 creator A5037667167 @default.
- W4226156728 creator A5045518550 @default.
- W4226156728 creator A5065728469 @default.
- W4226156728 date "2021-12-10" @default.
- W4226156728 modified "2023-09-24" @default.
- W4226156728 title "Faster Deep Reinforcement Learning with Slower Online Network" @default.
- W4226156728 doi "https://doi.org/10.48550/arxiv.2112.05848" @default.
- W4226156728 hasPublicationYear "2021" @default.
- W4226156728 type Work @default.
- W4226156728 citedByCount "0" @default.
- W4226156728 crossrefType "posted-content" @default.
- W4226156728 hasAuthorship W4226156728A5000245150 @default.
- W4226156728 hasAuthorship W4226156728A5009722403 @default.
- W4226156728 hasAuthorship W4226156728A5010753342 @default.
- W4226156728 hasAuthorship W4226156728A5037667167 @default.
- W4226156728 hasAuthorship W4226156728A5045518550 @default.
- W4226156728 hasAuthorship W4226156728A5065728469 @default.
- W4226156728 hasBestOaLocation W42261567281 @default.
- W4226156728 hasConcept C104317684 @default.
- W4226156728 hasConcept C106159729 @default.
- W4226156728 hasConcept C108583219 @default.
- W4226156728 hasConcept C119857082 @default.
- W4226156728 hasConcept C13280743 @default.
- W4226156728 hasConcept C136764020 @default.
- W4226156728 hasConcept C154945302 @default.
- W4226156728 hasConcept C162324750 @default.
- W4226156728 hasConcept C185592680 @default.
- W4226156728 hasConcept C185798385 @default.
- W4226156728 hasConcept C205649164 @default.
- W4226156728 hasConcept C207609745 @default.
- W4226156728 hasConcept C2986087404 @default.
- W4226156728 hasConcept C41008148 @default.
- W4226156728 hasConcept C55493867 @default.
- W4226156728 hasConcept C63479239 @default.
- W4226156728 hasConcept C97541855 @default.
- W4226156728 hasConceptScore W4226156728C104317684 @default.
- W4226156728 hasConceptScore W4226156728C106159729 @default.
- W4226156728 hasConceptScore W4226156728C108583219 @default.
- W4226156728 hasConceptScore W4226156728C119857082 @default.
- W4226156728 hasConceptScore W4226156728C13280743 @default.
- W4226156728 hasConceptScore W4226156728C136764020 @default.
- W4226156728 hasConceptScore W4226156728C154945302 @default.
- W4226156728 hasConceptScore W4226156728C162324750 @default.
- W4226156728 hasConceptScore W4226156728C185592680 @default.
- W4226156728 hasConceptScore W4226156728C185798385 @default.
- W4226156728 hasConceptScore W4226156728C205649164 @default.
- W4226156728 hasConceptScore W4226156728C207609745 @default.
- W4226156728 hasConceptScore W4226156728C2986087404 @default.
- W4226156728 hasConceptScore W4226156728C41008148 @default.
- W4226156728 hasConceptScore W4226156728C55493867 @default.
- W4226156728 hasConceptScore W4226156728C63479239 @default.
- W4226156728 hasConceptScore W4226156728C97541855 @default.
- W4226156728 hasLocation W42261567281 @default.
- W4226156728 hasOpenAccess W4226156728 @default.
- W4226156728 hasPrimaryLocation W42261567281 @default.
- W4226156728 hasRelatedWork W2946567716 @default.
- W4226156728 hasRelatedWork W3014300295 @default.
- W4226156728 hasRelatedWork W3044383684 @default.
- W4226156728 hasRelatedWork W3164822677 @default.
- W4226156728 hasRelatedWork W4223943233 @default.
- W4226156728 hasRelatedWork W4225161397 @default.
- W4226156728 hasRelatedWork W4309045103 @default.
- W4226156728 hasRelatedWork W4312200629 @default.
- W4226156728 hasRelatedWork W4360585206 @default.
- W4226156728 hasRelatedWork W4364306694 @default.
- W4226156728 isParatext "false" @default.
- W4226156728 isRetracted "false" @default.
- W4226156728 workType "article" @default.