Matches in SemOpenAlex for { <https://semopenalex.org/work/W3164527857> ?p ?o ?g. }
- W3164527857 abstract "In imitation learning, it is common to learn a behavior policy to match an unknown target policy via max-likelihood training on a collected set of target demonstrations. In this work, we consider using offline experience datasets - potentially far from the target distribution - to learn low-dimensional state representations that provably accelerate the sample-efficiency of downstream imitation learning. A central challenge in this setting is that the unknown target policy itself may not exhibit low-dimensional behavior, and so there is a potential for the representation learning objective to alias states in which the target policy acts differently. Circumventing this challenge, we derive a representation learning objective that provides an upper bound on the performance difference between the target policy and a low-dimensional policy trained with max-likelihood, and this bound is tight regardless of whether the target policy itself exhibits low-dimensional structure. Moving to the practicality of our method, we show that our objective can be implemented as contrastive learning, in which the transition dynamics are approximated by either an implicit energy-based model or, in some special cases, an implicit linear model with representations given by random Fourier features. Experiments on both tabular environments and high-dimensional Atari games provide quantitative evidence for the practical benefits of our proposed objective." @default.
- W3164527857 created "2021-06-07" @default.
- W3164527857 creator A5057773393 @default.
- W3164527857 creator A5085157310 @default.
- W3164527857 date "2021-05-25" @default.
- W3164527857 modified "2023-09-26" @default.
- W3164527857 title "Provable Representation Learning for Imitation with Contrastive Fourier Features" @default.
- W3164527857 cites W1481933447 @default.
- W3164527857 cites W2051228319 @default.
- W3164527857 cites W2071841410 @default.
- W3164527857 cites W2119567691 @default.
- W3164527857 cites W2121863487 @default.
- W3164527857 cites W2134689794 @default.
- W3164527857 cites W2142641780 @default.
- W3164527857 cites W2144902422 @default.
- W3164527857 cites W2468354762 @default.
- W3164527857 cites W2568646110 @default.
- W3164527857 cites W2787666871 @default.
- W3164527857 cites W2894605519 @default.
- W3164527857 cites W2905342215 @default.
- W3164527857 cites W2950577311 @default.
- W3164527857 cites W2952853356 @default.
- W3164527857 cites W2962715211 @default.
- W3164527857 cites W2962803570 @default.
- W3164527857 cites W2962957031 @default.
- W3164527857 cites W2963521487 @default.
- W3164527857 cites W2964329252 @default.
- W3164527857 cites W2992977009 @default.
- W3164527857 cites W2995146921 @default.
- W3164527857 cites W2997101648 @default.
- W3164527857 cites W3016525976 @default.
- W3164527857 cites W3022566517 @default.
- W3164527857 cites W3023640063 @default.
- W3164527857 cites W3034607397 @default.
- W3164527857 cites W3034978746 @default.
- W3164527857 cites W3035180992 @default.
- W3164527857 cites W3035273634 @default.
- W3164527857 cites W3035810656 @default.
- W3164527857 cites W3036185205 @default.
- W3164527857 cites W3037024314 @default.
- W3164527857 cites W3039845099 @default.
- W3164527857 cites W3096964654 @default.
- W3164527857 cites W3103780890 @default.
- W3164527857 cites W3115293622 @default.
- W3164527857 cites W3119486431 @default.
- W3164527857 cites W3126188503 @default.
- W3164527857 cites W3131920644 @default.
- W3164527857 cites W3158726102 @default.
- W3164527857 doi "https://doi.org/10.48550/arxiv.2105.12272" @default.
- W3164527857 hasPublicationYear "2021" @default.
- W3164527857 type Work @default.
- W3164527857 sameAs 3164527857 @default.
- W3164527857 citedByCount "4" @default.
- W3164527857 countsByYear W31645278572021 @default.
- W3164527857 crossrefType "posted-content" @default.
- W3164527857 hasAuthorship W3164527857A5057773393 @default.
- W3164527857 hasAuthorship W3164527857A5085157310 @default.
- W3164527857 hasBestOaLocation W31645278571 @default.
- W3164527857 hasConcept C102519508 @default.
- W3164527857 hasConcept C119857082 @default.
- W3164527857 hasConcept C124101348 @default.
- W3164527857 hasConcept C126388530 @default.
- W3164527857 hasConcept C134306372 @default.
- W3164527857 hasConcept C154945302 @default.
- W3164527857 hasConcept C15744967 @default.
- W3164527857 hasConcept C177264268 @default.
- W3164527857 hasConcept C17744445 @default.
- W3164527857 hasConcept C199360897 @default.
- W3164527857 hasConcept C199539241 @default.
- W3164527857 hasConcept C2776359362 @default.
- W3164527857 hasConcept C33923547 @default.
- W3164527857 hasConcept C41008148 @default.
- W3164527857 hasConcept C46681722 @default.
- W3164527857 hasConcept C59404180 @default.
- W3164527857 hasConcept C77805123 @default.
- W3164527857 hasConcept C94625758 @default.
- W3164527857 hasConceptScore W3164527857C102519508 @default.
- W3164527857 hasConceptScore W3164527857C119857082 @default.
- W3164527857 hasConceptScore W3164527857C124101348 @default.
- W3164527857 hasConceptScore W3164527857C126388530 @default.
- W3164527857 hasConceptScore W3164527857C134306372 @default.
- W3164527857 hasConceptScore W3164527857C154945302 @default.
- W3164527857 hasConceptScore W3164527857C15744967 @default.
- W3164527857 hasConceptScore W3164527857C177264268 @default.
- W3164527857 hasConceptScore W3164527857C17744445 @default.
- W3164527857 hasConceptScore W3164527857C199360897 @default.
- W3164527857 hasConceptScore W3164527857C199539241 @default.
- W3164527857 hasConceptScore W3164527857C2776359362 @default.
- W3164527857 hasConceptScore W3164527857C33923547 @default.
- W3164527857 hasConceptScore W3164527857C41008148 @default.
- W3164527857 hasConceptScore W3164527857C46681722 @default.
- W3164527857 hasConceptScore W3164527857C59404180 @default.
- W3164527857 hasConceptScore W3164527857C77805123 @default.
- W3164527857 hasConceptScore W3164527857C94625758 @default.
- W3164527857 hasLocation W31645278571 @default.
- W3164527857 hasOpenAccess W3164527857 @default.
- W3164527857 hasPrimaryLocation W31645278571 @default.
- W3164527857 hasRelatedWork W2891961174 @default.
- W3164527857 hasRelatedWork W2908875379 @default.
- W3164527857 hasRelatedWork W2972984751 @default.
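
The abstract above mentions representations given by random Fourier features. As background only, the following is a minimal sketch of the standard random Fourier feature construction for a Gaussian kernel, not the objective or implementation of the work itself. The names `make_rff`, `gaussian_kernel`, and `dot`, the `bandwidth` parameter, and the example inputs are all hypothetical and chosen for illustration.

```python
import math
import random


def make_rff(input_dim, num_features, bandwidth=1.0, seed=0):
    """Return a feature map z(x) with z(x).z(y) approximating a Gaussian kernel.

    Illustrative assumption: this is the textbook construction, not code
    taken from the work described in the record above.
    """
    rng = random.Random(seed)
    # Frequencies are drawn from N(0, 1/bandwidth^2) per coordinate.
    weights = [
        [rng.gauss(0.0, 1.0 / bandwidth) for _ in range(input_dim)]
        for _ in range(num_features)
    ]
    # Phases are drawn uniformly from [0, 2*pi].
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        # z_j(x) = sqrt(2/D) * cos(w_j . x + b_j)
        return [
            scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
            for w, b in zip(weights, phases)
        ]

    return features


def gaussian_kernel(x, y, bandwidth=1.0):
    """Exact kernel exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


if __name__ == "__main__":
    features = make_rff(input_dim=2, num_features=5000, seed=0)
    x, y = [0.3, -0.2], [0.1, 0.4]
    print("approximate:", dot(features(x), features(y)))
    print("exact:      ", gaussian_kernel(x, y))
```

With more features the inner product of the two feature vectors concentrates around the exact kernel value; the approximation error shrinks roughly as one over the square root of `num_features`.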