Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287693790> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W4287693790 abstract "This paper introduces two simple techniques to improve off-policy Reinforcement Learning (RL) algorithms. First, we formulate off-policy RL as a stochastic proximal point iteration. The target network plays the role of the variable of optimization and the value network computes the proximal operator. Second, we exploit the two value functions commonly employed in state-of-the-art off-policy algorithms to provide an improved action value estimate through bootstrapping with limited increase of computational resources. Further, we demonstrate significant performance improvement over state-of-the-art algorithms on standard continuous-control RL benchmarks." @default.
- W4287693790 created "2022-07-26" @default.
- W4287693790 creator A5006010436 @default.
- W4287693790 creator A5006038667 @default.
- W4287693790 creator A5026617079 @default.
- W4287693790 date "2020-08-03" @default.
- W4287693790 modified "2023-09-24" @default.
- W4287693790 title "Proximal Deterministic Policy Gradient" @default.
- W4287693790 doi "https://doi.org/10.48550/arxiv.2008.00759" @default.
- W4287693790 hasPublicationYear "2020" @default.
- W4287693790 type Work @default.
- W4287693790 citedByCount "0" @default.
- W4287693790 crossrefType "posted-content" @default.
- W4287693790 hasAuthorship W4287693790A5006010436 @default.
- W4287693790 hasAuthorship W4287693790A5006038667 @default.
- W4287693790 hasAuthorship W4287693790A5026617079 @default.
- W4287693790 hasBestOaLocation W42876937901 @default.
- W4287693790 hasConcept C104317684 @default.
- W4287693790 hasConcept C105795698 @default.
- W4287693790 hasConcept C106189395 @default.
- W4287693790 hasConcept C111472728 @default.
- W4287693790 hasConcept C11413529 @default.
- W4287693790 hasConcept C119857082 @default.
- W4287693790 hasConcept C126255220 @default.
- W4287693790 hasConcept C138885662 @default.
- W4287693790 hasConcept C14646407 @default.
- W4287693790 hasConcept C149782125 @default.
- W4287693790 hasConcept C154945302 @default.
- W4287693790 hasConcept C158448853 @default.
- W4287693790 hasConcept C159886148 @default.
- W4287693790 hasConcept C165696696 @default.
- W4287693790 hasConcept C17020691 @default.
- W4287693790 hasConcept C185592680 @default.
- W4287693790 hasConcept C207609745 @default.
- W4287693790 hasConcept C2776291640 @default.
- W4287693790 hasConcept C2780586882 @default.
- W4287693790 hasConcept C33923547 @default.
- W4287693790 hasConcept C38652104 @default.
- W4287693790 hasConcept C41008148 @default.
- W4287693790 hasConcept C55493867 @default.
- W4287693790 hasConcept C86339819 @default.
- W4287693790 hasConcept C97541855 @default.
- W4287693790 hasConceptScore W4287693790C104317684 @default.
- W4287693790 hasConceptScore W4287693790C105795698 @default.
- W4287693790 hasConceptScore W4287693790C106189395 @default.
- W4287693790 hasConceptScore W4287693790C111472728 @default.
- W4287693790 hasConceptScore W4287693790C11413529 @default.
- W4287693790 hasConceptScore W4287693790C119857082 @default.
- W4287693790 hasConceptScore W4287693790C126255220 @default.
- W4287693790 hasConceptScore W4287693790C138885662 @default.
- W4287693790 hasConceptScore W4287693790C14646407 @default.
- W4287693790 hasConceptScore W4287693790C149782125 @default.
- W4287693790 hasConceptScore W4287693790C154945302 @default.
- W4287693790 hasConceptScore W4287693790C158448853 @default.
- W4287693790 hasConceptScore W4287693790C159886148 @default.
- W4287693790 hasConceptScore W4287693790C165696696 @default.
- W4287693790 hasConceptScore W4287693790C17020691 @default.
- W4287693790 hasConceptScore W4287693790C185592680 @default.
- W4287693790 hasConceptScore W4287693790C207609745 @default.
- W4287693790 hasConceptScore W4287693790C2776291640 @default.
- W4287693790 hasConceptScore W4287693790C2780586882 @default.
- W4287693790 hasConceptScore W4287693790C33923547 @default.
- W4287693790 hasConceptScore W4287693790C38652104 @default.
- W4287693790 hasConceptScore W4287693790C41008148 @default.
- W4287693790 hasConceptScore W4287693790C55493867 @default.
- W4287693790 hasConceptScore W4287693790C86339819 @default.
- W4287693790 hasConceptScore W4287693790C97541855 @default.
- W4287693790 hasLocation W42876937901 @default.
- W4287693790 hasOpenAccess W4287693790 @default.
- W4287693790 hasPrimaryLocation W42876937901 @default.
- W4287693790 hasRelatedWork W10913952 @default.
- W4287693790 hasRelatedWork W11104910 @default.
- W4287693790 hasRelatedWork W11960889 @default.
- W4287693790 hasRelatedWork W1279312 @default.
- W4287693790 hasRelatedWork W13469974 @default.
- W4287693790 hasRelatedWork W13717812 @default.
- W4287693790 hasRelatedWork W3422034 @default.
- W4287693790 hasRelatedWork W6242441 @default.
- W4287693790 hasRelatedWork W7318248 @default.
- W4287693790 hasRelatedWork W8536059 @default.
- W4287693790 isParatext "false" @default.
- W4287693790 isRetracted "false" @default.
- W4287693790 workType "article" @default.