Matches in SemOpenAlex for { <https://semopenalex.org/work/W2951557330> ?p ?o ?g. }
- W2951557330 abstract "Learning to control an environment without hand-crafted rewards or expert data remains challenging and is at the frontier of reinforcement learning research. We present an unsupervised learning algorithm to train agents to achieve perceptually-specified goals using only a stream of observations and actions. Our agent simultaneously learns a goal-conditioned policy and a goal achievement reward function that measures how similar a state is to the goal state. This dual optimization leads to a co-operative game, giving rise to a learned reward function that reflects similarity in controllable aspects of the environment instead of distance in the space of observations. We demonstrate the efficacy of our agent to learn, in an unsupervised manner, to reach a diverse set of goals on three domains -- Atari, the DeepMind Control Suite and DeepMind Lab." @default.
- W2951557330 created "2019-06-27" @default.
- W2951557330 creator A5008904142 @default.
- W2951557330 creator A5014069192 @default.
- W2951557330 creator A5046449484 @default.
- W2951557330 creator A5056424874 @default.
- W2951557330 creator A5085746798 @default.
- W2951557330 creator A5091197128 @default.
- W2951557330 date "2018-11-28" @default.
- W2951557330 modified "2023-10-01" @default.
- W2951557330 title "Unsupervised Control Through Non-Parametric Discriminative Rewards" @default.
- W2951557330 cites W1515851193 @default.
- W2951557330 cites W1595483645 @default.
- W2951557330 cites W2118688707 @default.
- W2951557330 cites W2119985666 @default.
- W2951557330 cites W2138779671 @default.
- W2951557330 cites W2145339207 @default.
- W2951557330 cites W2147880316 @default.
- W2951557330 cites W2148989240 @default.
- W2951557330 cites W2432717477 @default.
- W2951557330 cites W2434014514 @default.
- W2951557330 cites W2606433045 @default.
- W2951557330 cites W2616430965 @default.
- W2951557330 cites W2781585732 @default.
- W2951557330 cites W2786036274 @default.
- W2951557330 cites W2786917922 @default.
- W2951557330 cites W2786928559 @default.
- W2951557330 cites W2795756076 @default.
- W2951557330 cites W2804672169 @default.
- W2951557330 cites W2810132790 @default.
- W2951557330 cites W2823112946 @default.
- W2951557330 cites W2886380293 @default.
- W2951557330 cites W2949267040 @default.
- W2951557330 cites W2949536664 @default.
- W2951557330 cites W2950040888 @default.
- W2951557330 cites W2950471160 @default.
- W2951557330 cites W2963305465 @default.
- W2951557330 cites W567721252 @default.
- W2951557330 hasPublicationYear "2018" @default.
- W2951557330 type Work @default.
- W2951557330 sameAs 2951557330 @default.
- W2951557330 citedByCount "41" @default.
- W2951557330 countsByYear W29515573302018 @default.
- W2951557330 countsByYear W29515573302019 @default.
- W2951557330 countsByYear W29515573302020 @default.
- W2951557330 countsByYear W29515573302021 @default.
- W2951557330 countsByYear W29515573302022 @default.
- W2951557330 crossrefType "posted-content" @default.
- W2951557330 hasAuthorship W2951557330A5008904142 @default.
- W2951557330 hasAuthorship W2951557330A5014069192 @default.
- W2951557330 hasAuthorship W2951557330A5046449484 @default.
- W2951557330 hasAuthorship W2951557330A5056424874 @default.
- W2951557330 hasAuthorship W2951557330A5085746798 @default.
- W2951557330 hasAuthorship W2951557330A5091197128 @default.
- W2951557330 hasConcept C103278499 @default.
- W2951557330 hasConcept C105795698 @default.
- W2951557330 hasConcept C111919701 @default.
- W2951557330 hasConcept C11413529 @default.
- W2951557330 hasConcept C115961682 @default.
- W2951557330 hasConcept C119857082 @default.
- W2951557330 hasConcept C14036430 @default.
- W2951557330 hasConcept C154945302 @default.
- W2951557330 hasConcept C166957645 @default.
- W2951557330 hasConcept C177264268 @default.
- W2951557330 hasConcept C199360897 @default.
- W2951557330 hasConcept C2775924081 @default.
- W2951557330 hasConcept C2778572836 @default.
- W2951557330 hasConcept C33923547 @default.
- W2951557330 hasConcept C41008148 @default.
- W2951557330 hasConcept C48103436 @default.
- W2951557330 hasConcept C72434380 @default.
- W2951557330 hasConcept C78458016 @default.
- W2951557330 hasConcept C79581498 @default.
- W2951557330 hasConcept C8038995 @default.
- W2951557330 hasConcept C86803240 @default.
- W2951557330 hasConcept C95457728 @default.
- W2951557330 hasConcept C97541855 @default.
- W2951557330 hasConcept C97931131 @default.
- W2951557330 hasConceptScore W2951557330C103278499 @default.
- W2951557330 hasConceptScore W2951557330C105795698 @default.
- W2951557330 hasConceptScore W2951557330C111919701 @default.
- W2951557330 hasConceptScore W2951557330C11413529 @default.
- W2951557330 hasConceptScore W2951557330C115961682 @default.
- W2951557330 hasConceptScore W2951557330C119857082 @default.
- W2951557330 hasConceptScore W2951557330C14036430 @default.
- W2951557330 hasConceptScore W2951557330C154945302 @default.
- W2951557330 hasConceptScore W2951557330C166957645 @default.
- W2951557330 hasConceptScore W2951557330C177264268 @default.
- W2951557330 hasConceptScore W2951557330C199360897 @default.
- W2951557330 hasConceptScore W2951557330C2775924081 @default.
- W2951557330 hasConceptScore W2951557330C2778572836 @default.
- W2951557330 hasConceptScore W2951557330C33923547 @default.
- W2951557330 hasConceptScore W2951557330C41008148 @default.
- W2951557330 hasConceptScore W2951557330C48103436 @default.
- W2951557330 hasConceptScore W2951557330C72434380 @default.
- W2951557330 hasConceptScore W2951557330C78458016 @default.
- W2951557330 hasConceptScore W2951557330C79581498 @default.
- W2951557330 hasConceptScore W2951557330C8038995 @default.
- W2951557330 hasConceptScore W2951557330C86803240 @default.
- W2951557330 hasConceptScore W2951557330C95457728 @default.