Matches in SemOpenAlex for { <https://semopenalex.org/work/W2806029110> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W2806029110 abstract "Deep reinforcement learning methods traditionally struggle with tasks where environment rewards are particularly sparse. One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator. However, these demonstrations are typically collected under artificial conditions, i.e. with access to the agent's exact environment setup and the demonstrator's action and reward trajectories. Here we propose a two-stage method that overcomes these limitations by relying on noisy, unaligned footage without access to such data. First, we learn to map unaligned videos from multiple sources to a common representation using self-supervised objectives constructed over both time and modality (i.e. vision and sound). Second, we embed a single YouTube video in this representation to construct a reward function that encourages an agent to imitate human gameplay. This method of one-shot imitation allows our agent to convincingly exceed human-level performance on the infamously hard exploration games Montezuma's Revenge, Pitfall! and Private Eye for the first time, even if the agent is not presented with any environment rewards." @default.
- W2806029110 created "2018-06-13" @default.
- W2806029110 creator A5003238807 @default.
- W2806029110 creator A5056410851 @default.
- W2806029110 creator A5061277347 @default.
- W2806029110 creator A5064627383 @default.
- W2806029110 creator A5072199759 @default.
- W2806029110 creator A5082304130 @default.
- W2806029110 date "2018-05-29" @default.
- W2806029110 modified "2023-09-27" @default.
- W2806029110 title "Playing hard exploration games by watching YouTube" @default.
- W2806029110 hasPublicationYear "2018" @default.
- W2806029110 type Work @default.
- W2806029110 sameAs 2806029110 @default.
- W2806029110 citedByCount "0" @default.
- W2806029110 crossrefType "posted-content" @default.
- W2806029110 hasAuthorship W2806029110A5003238807 @default.
- W2806029110 hasAuthorship W2806029110A5056410851 @default.
- W2806029110 hasAuthorship W2806029110A5061277347 @default.
- W2806029110 hasAuthorship W2806029110A5064627383 @default.
- W2806029110 hasAuthorship W2806029110A5072199759 @default.
- W2806029110 hasAuthorship W2806029110A5082304130 @default.
- W2806029110 hasConcept C107457646 @default.
- W2806029110 hasConcept C121332964 @default.
- W2806029110 hasConcept C126388530 @default.
- W2806029110 hasConcept C154945302 @default.
- W2806029110 hasConcept C15744967 @default.
- W2806029110 hasConcept C17744445 @default.
- W2806029110 hasConcept C199360897 @default.
- W2806029110 hasConcept C199539241 @default.
- W2806029110 hasConcept C2776359362 @default.
- W2806029110 hasConcept C2780791683 @default.
- W2806029110 hasConcept C2780801425 @default.
- W2806029110 hasConcept C41008148 @default.
- W2806029110 hasConcept C62520636 @default.
- W2806029110 hasConcept C77805123 @default.
- W2806029110 hasConcept C94625758 @default.
- W2806029110 hasConcept C97541855 @default.
- W2806029110 hasConceptScore W2806029110C107457646 @default.
- W2806029110 hasConceptScore W2806029110C121332964 @default.
- W2806029110 hasConceptScore W2806029110C126388530 @default.
- W2806029110 hasConceptScore W2806029110C154945302 @default.
- W2806029110 hasConceptScore W2806029110C15744967 @default.
- W2806029110 hasConceptScore W2806029110C17744445 @default.
- W2806029110 hasConceptScore W2806029110C199360897 @default.
- W2806029110 hasConceptScore W2806029110C199539241 @default.
- W2806029110 hasConceptScore W2806029110C2776359362 @default.
- W2806029110 hasConceptScore W2806029110C2780791683 @default.
- W2806029110 hasConceptScore W2806029110C2780801425 @default.
- W2806029110 hasConceptScore W2806029110C41008148 @default.
- W2806029110 hasConceptScore W2806029110C62520636 @default.
- W2806029110 hasConceptScore W2806029110C77805123 @default.
- W2806029110 hasConceptScore W2806029110C94625758 @default.
- W2806029110 hasConceptScore W2806029110C97541855 @default.
- W2806029110 hasLocation W28060291101 @default.
- W2806029110 hasOpenAccess W2806029110 @default.
- W2806029110 hasPrimaryLocation W28060291101 @default.
- W2806029110 hasRelatedWork W11497584 @default.
- W2806029110 hasRelatedWork W1543473597 @default.
- W2806029110 hasRelatedWork W174404835 @default.
- W2806029110 hasRelatedWork W2090854271 @default.
- W2806029110 hasRelatedWork W2100026102 @default.
- W2806029110 hasRelatedWork W2296531872 @default.
- W2806029110 hasRelatedWork W2753316839 @default.
- W2806029110 hasRelatedWork W2888144465 @default.
- W2806029110 hasRelatedWork W2947458343 @default.
- W2806029110 hasRelatedWork W2948779987 @default.
- W2806029110 hasRelatedWork W2952451489 @default.
- W2806029110 hasRelatedWork W2962715211 @default.
- W2806029110 hasRelatedWork W2963687684 @default.
- W2806029110 hasRelatedWork W2963948533 @default.
- W2806029110 hasRelatedWork W2971905168 @default.
- W2806029110 hasRelatedWork W2996235414 @default.
- W2806029110 hasRelatedWork W3089546934 @default.
- W2806029110 hasRelatedWork W3094454393 @default.
- W2806029110 hasRelatedWork W3176150112 @default.
- W2806029110 hasRelatedWork W2218287383 @default.
- W2806029110 isParatext "false" @default.
- W2806029110 isRetracted "false" @default.
- W2806029110 magId "2806029110" @default.
- W2806029110 workType "article" @default.