Matches in SemOpenAlex for { <https://semopenalex.org/work/W3123169626> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W3123169626 abstract "Despite definite success in deep reinforcement learning problems, actor-critic algorithms are still confronted with sample inefficiency in complex environments, particularly in tasks where efficient exploration is a bottleneck. These methods consider a policy (the actor) and a value function (the critic) whose respective losses are built using different motivations and approaches. This paper introduces a third protagonist: the adversary. While the adversary mimics the actor by minimizing the KL-divergence between their respective action distributions, the actor, in addition to learning to solve the task, tries to differentiate itself from the adversary predictions. This novel objective stimulates the actor to follow strategies that could not have been correctly predicted from previous trajectories, making its behavior innovative in tasks where the reward is extremely rare. Our experimental analysis shows that the resulting Adversarially Guided Actor-Critic (AGAC) algorithm leads to more exhaustive exploration. Notably, AGAC outperforms current state-of-the-art methods on a set of various hard-exploration and procedurally-generated tasks." @default.
- W3123169626 created "2021-02-01" @default.
- W3123169626 creator A5004267040 @default.
- W3123169626 creator A5045192384 @default.
- W3123169626 creator A5065100569 @default.
- W3123169626 creator A5087706654 @default.
- W3123169626 creator A5087891858 @default.
- W3123169626 date "2021-05-04" @default.
- W3123169626 modified "2023-09-23" @default.
- W3123169626 title "Adversarially Guided Actor-Critic" @default.
- W3123169626 cites W1569296262 @default.
- W3123169626 cites W1771410628 @default.
- W3123169626 cites W2091565802 @default.
- W3123169626 cites W2099471712 @default.
- W3123169626 cites W2145339207 @default.
- W3123169626 cites W2165150801 @default.
- W3123169626 cites W2527819024 @default.
- W3123169626 cites W2736601468 @default.
- W3123169626 cites W2751973545 @default.
- W3123169626 cites W2787613197 @default.
- W3123169626 cites W2797527950 @default.
- W3123169626 cites W2809668646 @default.
- W3123169626 cites W2891790128 @default.
- W3123169626 cites W2893662673 @default.
- W3123169626 cites W2895560838 @default.
- W3123169626 cites W2914261249 @default.
- W3123169626 cites W2914920107 @default.
- W3123169626 cites W2962821147 @default.
- W3123169626 cites W2962938178 @default.
- W3123169626 cites W2963095800 @default.
- W3123169626 cites W2963184621 @default.
- W3123169626 cites W2963207607 @default.
- W3123169626 cites W2963276097 @default.
- W3123169626 cites W2963277051 @default.
- W3123169626 cites W2963285578 @default.
- W3123169626 cites W2963293881 @default.
- W3123169626 cites W2963313316 @default.
- W3123169626 cites W2963389226 @default.
- W3123169626 cites W2963438456 @default.
- W3123169626 cites W2963674921 @default.
- W3123169626 cites W2963864421 @default.
- W3123169626 cites W2963871073 @default.
- W3123169626 cites W2963985863 @default.
- W3123169626 cites W2964040467 @default.
- W3123169626 cites W2964043796 @default.
- W3123169626 cites W2964062135 @default.
- W3123169626 cites W2964067469 @default.
- W3123169626 cites W2964121744 @default.
- W3123169626 cites W2964174623 @default.
- W3123169626 cites W2964201867 @default.
- W3123169626 cites W2970214542 @default.
- W3123169626 cites W2976021111 @default.
- W3123169626 cites W2982316857 @default.
- W3123169626 cites W2990376820 @default.
- W3123169626 cites W2996283175 @default.
- W3123169626 cites W2999617596 @default.
- W3123169626 cites W3014137283 @default.
- W3123169626 cites W3032956793 @default.
- W3123169626 cites W3034946435 @default.
- W3123169626 cites W3099050578 @default.
- W3123169626 cites W3118674454 @default.
- W3123169626 cites W3132970947 @default.
- W3123169626 cites W3176926605 @default.
- W3123169626 hasPublicationYear "2021" @default.
- W3123169626 type Work @default.
- W3123169626 sameAs 3123169626 @default.
- W3123169626 citedByCount "4" @default.
- W3123169626 countsByYear W31231696262021 @default.
- W3123169626 crossrefType "proceedings-article" @default.
- W3123169626 hasAuthorship W3123169626A5004267040 @default.
- W3123169626 hasAuthorship W3123169626A5045192384 @default.
- W3123169626 hasAuthorship W3123169626A5065100569 @default.
- W3123169626 hasAuthorship W3123169626A5087706654 @default.
- W3123169626 hasAuthorship W3123169626A5087891858 @default.
- W3123169626 hasBestOaLocation W31231696261 @default.
- W3123169626 hasConcept C107457646 @default.
- W3123169626 hasConcept C41008148 @default.
- W3123169626 hasConceptScore W3123169626C107457646 @default.
- W3123169626 hasConceptScore W3123169626C41008148 @default.
- W3123169626 hasLocation W31231696261 @default.
- W3123169626 hasLocation W31231696262 @default.
- W3123169626 hasLocation W31231696263 @default.
- W3123169626 hasOpenAccess W3123169626 @default.
- W3123169626 hasPrimaryLocation W31231696261 @default.
- W3123169626 hasRelatedWork W110311947 @default.
- W3123169626 hasRelatedWork W1993084869 @default.
- W3123169626 hasRelatedWork W2051154247 @default.
- W3123169626 hasRelatedWork W2076610045 @default.
- W3123169626 hasRelatedWork W2278205256 @default.
- W3123169626 hasRelatedWork W2883555950 @default.
- W3123169626 hasRelatedWork W3005892291 @default.
- W3123169626 hasRelatedWork W4244299974 @default.
- W3123169626 hasRelatedWork W4246426965 @default.
- W3123169626 hasRelatedWork W3106945349 @default.
- W3123169626 isParatext "false" @default.
- W3123169626 isRetracted "false" @default.
- W3123169626 magId "3123169626" @default.
- W3123169626 workType "article" @default.
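
The abstract in the record above says the adversary minimizes the KL-divergence between its action distribution and the actor's, while the actor tries to differentiate itself from the adversary's predictions. The sketch below only illustrates that quantity for two discrete action distributions. The function name `kl_divergence`, the direction KL(actor || adversary), and the probability values are assumptions made for illustration; none of them come from the record or from the paper's implementation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions of equal length.

    Assumption: p and q are lists of probabilities that each sum to 1.
    q is floored at eps to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of actions")
    # Terms with p_i == 0 contribute nothing to the sum by convention.
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


# Hypothetical action probabilities in a single state (illustrative only).
actor = [0.7, 0.2, 0.1]
adversary = [0.4, 0.4, 0.2]

# Per the abstract, the adversary is trained to make this value small,
# while the actor is additionally encouraged to keep it large.
divergence = kl_divergence(actor, adversary)
print(f"KL(actor || adversary) = {divergence:.4f}")
```

When the adversary predicts the actor perfectly, the divergence is zero, so an actor that is easy to predict from past trajectories gains nothing from this term.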