Matches in SemOpenAlex for { <https://semopenalex.org/work/W3129170303> ?p ?o ?g. }
- W3129170303 abstract "Reinforcement learning methods trained on few environments rarely learn policies that generalize to unseen environments. To improve generalization, we incorporate the inherent sequential structure in reinforcement learning into the representation learning process. This approach is orthogonal to recent approaches, which rarely exploit this structure explicitly. Specifically, we introduce a theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states. PSM assigns high similarity to states for which the optimal policies in those states as well as in future states are similar. We also present a contrastive representation learning procedure to embed any state similarity metric, which we instantiate with PSM to obtain policy similarity embeddings (PSEs). We demonstrate that PSEs improve generalization on diverse benchmarks, including LQR with spurious correlations, a jumping task from pixels, and Distracting DM Control Suite." @default.
- W3129170303 created "2021-02-15" @default.
- W3129170303 creator A5001087292 @default.
- W3129170303 creator A5068291173 @default.
- W3129170303 creator A5070953294 @default.
- W3129170303 creator A5085413987 @default.
- W3129170303 date "2021-05-03" @default.
- W3129170303 modified "2023-09-24" @default.
- W3129170303 title "Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning" @default.
- W3129170303 cites W1504915502 @default.
- W3129170303 cites W1976268573 @default.
- W3129170303 cites W1995688924 @default.
- W3129170303 cites W2058735307 @default.
- W3129170303 cites W2097381042 @default.
- W3129170303 cites W2119567691 @default.
- W3129170303 cites W2138621090 @default.
- W3129170303 cites W2405468764 @default.
- W3129170303 cites W2568646110 @default.
- W3129170303 cites W2604763608 @default.
- W3129170303 cites W2605102758 @default.
- W3129170303 cites W2620290674 @default.
- W3129170303 cites W2732547613 @default.
- W3129170303 cites W2751319288 @default.
- W3129170303 cites W2781726626 @default.
- W3129170303 cites W2786672974 @default.
- W3129170303 cites W2822752092 @default.
- W3129170303 cites W2842511635 @default.
- W3129170303 cites W2889965839 @default.
- W3129170303 cites W2891790128 @default.
- W3129170303 cites W2893662673 @default.
- W3129170303 cites W2898436992 @default.
- W3129170303 cites W2899476926 @default.
- W3129170303 cites W2901796235 @default.
- W3129170303 cites W2904815624 @default.
- W3129170303 cites W2916743882 @default.
- W3129170303 cites W2962754721 @default.
- W3129170303 cites W2963043696 @default.
- W3129170303 cites W2963085895 @default.
- W3129170303 cites W2963680188 @default.
- W3129170303 cites W2964331425 @default.
- W3129170303 cites W2966556569 @default.
- W3129170303 cites W2970214542 @default.
- W3129170303 cites W2970259623 @default.
- W3129170303 cites W2996283175 @default.
- W3129170303 cites W2996795455 @default.
- W3129170303 cites W2997101648 @default.
- W3129170303 cites W2999617596 @default.
- W3129170303 cites W3002447977 @default.
- W3129170303 cites W3005680577 @default.
- W3129170303 cites W3021708257 @default.
- W3129170303 cites W3023640063 @default.
- W3129170303 cites W3029947299 @default.
- W3129170303 cites W3033450512 @default.
- W3129170303 cites W3034932139 @default.
- W3129170303 cites W3036185205 @default.
- W3129170303 cites W3036619998 @default.
- W3129170303 cites W3036670859 @default.
- W3129170303 cites W3037007134 @default.
- W3129170303 cites W3041890730 @default.
- W3129170303 cites W3085605093 @default.
- W3129170303 cites W3088584794 @default.
- W3129170303 cites W3115293622 @default.
- W3129170303 cites W385466589 @default.
- W3129170303 hasPublicationYear "2021" @default.
- W3129170303 type Work @default.
- W3129170303 sameAs 3129170303 @default.
- W3129170303 citedByCount "14" @default.
- W3129170303 countsByYear W31291703032021 @default.
- W3129170303 crossrefType "proceedings-article" @default.
- W3129170303 hasAuthorship W3129170303A5001087292 @default.
- W3129170303 hasAuthorship W3129170303A5068291173 @default.
- W3129170303 hasAuthorship W3129170303A5070953294 @default.
- W3129170303 hasAuthorship W3129170303A5085413987 @default.
- W3129170303 hasConcept C103278499 @default.
- W3129170303 hasConcept C115961682 @default.
- W3129170303 hasConcept C119857082 @default.
- W3129170303 hasConcept C127413603 @default.
- W3129170303 hasConcept C134306372 @default.
- W3129170303 hasConcept C154945302 @default.
- W3129170303 hasConcept C176217482 @default.
- W3129170303 hasConcept C177148314 @default.
- W3129170303 hasConcept C17744445 @default.
- W3129170303 hasConcept C199539241 @default.
- W3129170303 hasConcept C21547014 @default.
- W3129170303 hasConcept C2776359362 @default.
- W3129170303 hasConcept C33923547 @default.
- W3129170303 hasConcept C41008148 @default.
- W3129170303 hasConcept C94625758 @default.
- W3129170303 hasConcept C97256817 @default.
- W3129170303 hasConcept C97541855 @default.
- W3129170303 hasConceptScore W3129170303C103278499 @default.
- W3129170303 hasConceptScore W3129170303C115961682 @default.
- W3129170303 hasConceptScore W3129170303C119857082 @default.
- W3129170303 hasConceptScore W3129170303C127413603 @default.
- W3129170303 hasConceptScore W3129170303C134306372 @default.
- W3129170303 hasConceptScore W3129170303C154945302 @default.
- W3129170303 hasConceptScore W3129170303C176217482 @default.
- W3129170303 hasConceptScore W3129170303C177148314 @default.
- W3129170303 hasConceptScore W3129170303C17744445 @default.
- W3129170303 hasConceptScore W3129170303C199539241 @default.