Matches in SemOpenAlex for { <https://semopenalex.org/work/W2950872548> ?p ?o ?g. }
- W2950872548 abstract "Deep reinforcement learning agents have achieved state-of-the-art results by directly maximising cumulative reward. However, environments contain a much wider variety of possible training signals. In this paper, we introduce an agent that also maximises many other pseudo-reward functions simultaneously by reinforcement learning. All of these tasks share a common representation that, like unsupervised learning, continues to develop in the absence of extrinsic rewards. We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task. Our agent significantly outperforms the previous state-of-the-art on Atari, averaging 880% expert human performance, and a challenging suite of first-person, three-dimensional Labyrinth tasks leading to a mean speedup in learning of 10× and averaging 87% expert human performance on Labyrinth." @default.
- W2950872548 created "2019-06-27" @default.
- W2950872548 creator A5004153109 @default.
- W2950872548 creator A5008904142 @default.
- W2950872548 creator A5054808675 @default.
- W2950872548 creator A5064372346 @default.
- W2950872548 creator A5081322018 @default.
- W2950872548 creator A5090341705 @default.
- W2950872548 creator A5091771290 @default.
- W2950872548 date "2016-11-16" @default.
- W2950872548 modified "2023-10-01" @default.
- W2950872548 title "Reinforcement Learning with Unsupervised Auxiliary Tasks" @default.
- W2950872548 cites W1480527676 @default.
- W2950872548 cites W1594297126 @default.
- W2950872548 cites W2034806191 @default.
- W2950872548 cites W2056354534 @default.
- W2950872548 cites W2108535023 @default.
- W2950872548 cites W2118688707 @default.
- W2950872548 cites W2136848157 @default.
- W2950872548 cites W2155027007 @default.
- W2950872548 cites W2166620359 @default.
- W2950872548 cites W2173564293 @default.
- W2950872548 cites W2201581102 @default.
- W2950872548 cites W2257979135 @default.
- W2950872548 cites W2344023930 @default.
- W2950872548 cites W2362143032 @default.
- W2950872548 cites W2417089653 @default.
- W2950872548 cites W2440926996 @default.
- W2950872548 cites W2610686804 @default.
- W2950872548 cites W2950708852 @default.
- W2950872548 cites W2951982866 @default.
- W2950872548 cites W2963781688 @default.
- W2950872548 cites W2964043796 @default.
- W2950872548 cites W567721252 @default.
- W2950872548 hasPublicationYear "2016" @default.
- W2950872548 type Work @default.
- W2950872548 sameAs 2950872548 @default.
- W2950872548 citedByCount "388" @default.
- W2950872548 countsByYear W29508725482016 @default.
- W2950872548 countsByYear W29508725482017 @default.
- W2950872548 countsByYear W29508725482018 @default.
- W2950872548 countsByYear W29508725482019 @default.
- W2950872548 countsByYear W29508725482020 @default.
- W2950872548 countsByYear W29508725482021 @default.
- W2950872548 countsByYear W29508725482022 @default.
- W2950872548 countsByYear W29508725482023 @default.
- W2950872548 crossrefType "posted-content" @default.
- W2950872548 hasAuthorship W2950872548A5004153109 @default.
- W2950872548 hasAuthorship W2950872548A5008904142 @default.
- W2950872548 hasAuthorship W2950872548A5054808675 @default.
- W2950872548 hasAuthorship W2950872548A5064372346 @default.
- W2950872548 hasAuthorship W2950872548A5081322018 @default.
- W2950872548 hasAuthorship W2950872548A5090341705 @default.
- W2950872548 hasAuthorship W2950872548A5091771290 @default.
- W2950872548 hasConcept C111472728 @default.
- W2950872548 hasConcept C111919701 @default.
- W2950872548 hasConcept C119857082 @default.
- W2950872548 hasConcept C127413603 @default.
- W2950872548 hasConcept C136197465 @default.
- W2950872548 hasConcept C138885662 @default.
- W2950872548 hasConcept C154945302 @default.
- W2950872548 hasConcept C15744967 @default.
- W2950872548 hasConcept C166957645 @default.
- W2950872548 hasConcept C17744445 @default.
- W2950872548 hasConcept C199539241 @default.
- W2950872548 hasConcept C201995342 @default.
- W2950872548 hasConcept C2776359362 @default.
- W2950872548 hasConcept C2780451532 @default.
- W2950872548 hasConcept C41008148 @default.
- W2950872548 hasConcept C67203356 @default.
- W2950872548 hasConcept C68339613 @default.
- W2950872548 hasConcept C77805123 @default.
- W2950872548 hasConcept C79581498 @default.
- W2950872548 hasConcept C8038995 @default.
- W2950872548 hasConcept C89611455 @default.
- W2950872548 hasConcept C94625758 @default.
- W2950872548 hasConcept C95457728 @default.
- W2950872548 hasConcept C97541855 @default.
- W2950872548 hasConceptScore W2950872548C111472728 @default.
- W2950872548 hasConceptScore W2950872548C111919701 @default.
- W2950872548 hasConceptScore W2950872548C119857082 @default.
- W2950872548 hasConceptScore W2950872548C127413603 @default.
- W2950872548 hasConceptScore W2950872548C136197465 @default.
- W2950872548 hasConceptScore W2950872548C138885662 @default.
- W2950872548 hasConceptScore W2950872548C154945302 @default.
- W2950872548 hasConceptScore W2950872548C15744967 @default.
- W2950872548 hasConceptScore W2950872548C166957645 @default.
- W2950872548 hasConceptScore W2950872548C17744445 @default.
- W2950872548 hasConceptScore W2950872548C199539241 @default.
- W2950872548 hasConceptScore W2950872548C201995342 @default.
- W2950872548 hasConceptScore W2950872548C2776359362 @default.
- W2950872548 hasConceptScore W2950872548C2780451532 @default.
- W2950872548 hasConceptScore W2950872548C41008148 @default.
- W2950872548 hasConceptScore W2950872548C67203356 @default.
- W2950872548 hasConceptScore W2950872548C68339613 @default.
- W2950872548 hasConceptScore W2950872548C77805123 @default.
- W2950872548 hasConceptScore W2950872548C79581498 @default.
- W2950872548 hasConceptScore W2950872548C8038995 @default.
- W2950872548 hasConceptScore W2950872548C89611455 @default.
- W2950872548 hasConceptScore W2950872548C94625758 @default.