Matches in SemOpenAlex for { <https://semopenalex.org/work/W3111464135> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W3111464135 abstract "Actor-critic methods, a type of model-free reinforcement learning (RL), have achieved state-of-the-art performances in many real-world domains in continuous control. Despite their success, the wide-scale deployment of these models is still a far cry. The main problems in these actor-critic methods are inefficient exploration and sub-optimal policies. Soft Actor-Critic (SAC) and Twin Delayed Deep Deterministic Policy Gradient (TD3), two cutting edge such algorithms, suffer from these issues. SAC effectively addressed the problems of sample complexity and convergence brittleness to hyper-parameters and thus outperformed all state-of-the-art algorithms including TD3 in harder tasks, whereas TD3 produced moderate results in all environments. SAC suffers from inefficient exploration owing to the Gaussian nature of its policy which causes borderline performance in simpler tasks. In this paper, we introduce Opportunistic Actor-Critic (OPAC), a novel model-free deep RL algorithm that employs better exploration policy and lesser variance. OPAC combines some of the most powerful features of TD3 and SAC and aims to optimize a stochastic policy in an off-policy way. For calculating the target Q-values, instead of two critics, OPAC uses three critics and based on the environment complexity, opportunistically chooses how the target Q-value is computed from the critics' evaluation. We have systematically evaluated the algorithm on MuJoCo environments where it achieves state-of-the-art performance and outperforms or at least equals the performance of TD3 and SAC." @default.
- W3111464135 created "2020-12-21" @default.
- W3111464135 creator A5074678535 @default.
- W3111464135 creator A5075656144 @default.
- W3111464135 creator A5088981615 @default.
- W3111464135 date "2020-12-11" @default.
- W3111464135 modified "2023-09-27" @default.
- W3111464135 title "OPAC: Opportunistic Actor-Critic." @default.
- W3111464135 cites W1757796397 @default.
- W3111464135 cites W2098432798 @default.
- W3111464135 cites W2098774185 @default.
- W3111464135 cites W2107464055 @default.
- W3111464135 cites W2145339207 @default.
- W3111464135 cites W2150339816 @default.
- W3111464135 cites W2155772159 @default.
- W3111464135 cites W2158782408 @default.
- W3111464135 cites W2257979135 @default.
- W3111464135 cites W2294241375 @default.
- W3111464135 cites W2593044849 @default.
- W3111464135 cites W2609650878 @default.
- W3111464135 cites W2736601468 @default.
- W3111464135 cites W2787938642 @default.
- W3111464135 cites W2904246096 @default.
- W3111464135 cites W2949608212 @default.
- W3111464135 cites W2963403593 @default.
- W3111464135 cites W2963864421 @default.
- W3111464135 cites W2970961171 @default.
- W3111464135 cites W3037207827 @default.
- W3111464135 cites W64088143 @default.
- W3111464135 cites W3089091950 @default.
- W3111464135 hasPublicationYear "2020" @default.
- W3111464135 type Work @default.
- W3111464135 sameAs 3111464135 @default.
- W3111464135 citedByCount "2" @default.
- W3111464135 countsByYear W31114641352021 @default.
- W3111464135 countsByYear W31114641352022 @default.
- W3111464135 crossrefType "posted-content" @default.
- W3111464135 hasAuthorship W3111464135A5074678535 @default.
- W3111464135 hasAuthorship W3111464135A5075656144 @default.
- W3111464135 hasAuthorship W3111464135A5088981615 @default.
- W3111464135 hasConcept C11413529 @default.
- W3111464135 hasConcept C121332964 @default.
- W3111464135 hasConcept C121955636 @default.
- W3111464135 hasConcept C126255220 @default.
- W3111464135 hasConcept C144133560 @default.
- W3111464135 hasConcept C154945302 @default.
- W3111464135 hasConcept C163716315 @default.
- W3111464135 hasConcept C196083921 @default.
- W3111464135 hasConcept C33923547 @default.
- W3111464135 hasConcept C41008148 @default.
- W3111464135 hasConcept C48103436 @default.
- W3111464135 hasConcept C62520636 @default.
- W3111464135 hasConcept C97541855 @default.
- W3111464135 hasConceptScore W3111464135C11413529 @default.
- W3111464135 hasConceptScore W3111464135C121332964 @default.
- W3111464135 hasConceptScore W3111464135C121955636 @default.
- W3111464135 hasConceptScore W3111464135C126255220 @default.
- W3111464135 hasConceptScore W3111464135C144133560 @default.
- W3111464135 hasConceptScore W3111464135C154945302 @default.
- W3111464135 hasConceptScore W3111464135C163716315 @default.
- W3111464135 hasConceptScore W3111464135C196083921 @default.
- W3111464135 hasConceptScore W3111464135C33923547 @default.
- W3111464135 hasConceptScore W3111464135C41008148 @default.
- W3111464135 hasConceptScore W3111464135C48103436 @default.
- W3111464135 hasConceptScore W3111464135C62520636 @default.
- W3111464135 hasConceptScore W3111464135C97541855 @default.
- W3111464135 hasLocation W31114641351 @default.
- W3111464135 hasOpenAccess W3111464135 @default.
- W3111464135 hasPrimaryLocation W31114641351 @default.
- W3111464135 hasRelatedWork W2156347136 @default.
- W3111464135 hasRelatedWork W2607992281 @default.
- W3111464135 hasRelatedWork W2612795874 @default.
- W3111464135 hasRelatedWork W2911305319 @default.
- W3111464135 hasRelatedWork W2921778527 @default.
- W3111464135 hasRelatedWork W2948275977 @default.
- W3111464135 hasRelatedWork W2948708918 @default.
- W3111464135 hasRelatedWork W2950435926 @default.
- W3111464135 hasRelatedWork W2962902376 @default.
- W3111464135 hasRelatedWork W2970163238 @default.
- W3111464135 hasRelatedWork W2986409676 @default.
- W3111464135 hasRelatedWork W2995031981 @default.
- W3111464135 hasRelatedWork W3092578608 @default.
- W3111464135 hasRelatedWork W3104595455 @default.
- W3111464135 hasRelatedWork W3127139973 @default.
- W3111464135 hasRelatedWork W3180179613 @default.
- W3111464135 hasRelatedWork W3199207920 @default.
- W3111464135 hasRelatedWork W3204075372 @default.
- W3111464135 hasRelatedWork W3206637819 @default.
- W3111464135 hasRelatedWork W3207294509 @default.
- W3111464135 isParatext "false" @default.
- W3111464135 isRetracted "false" @default.
- W3111464135 magId "3111464135" @default.
- W3111464135 workType "article" @default.