Matches in SemOpenAlex for { <https://semopenalex.org/work/W4310921838> ?p ?o ?g. }
Showing items 1 to 92 of 92 with 100 items per page.
- W4310921838 abstract "We study the learning dynamics of self-predictive learning for reinforcement learning, a family of algorithms that learn representations by minimizing the prediction error of their own future latent representations. Despite their recent empirical success, such algorithms have an apparent defect: trivial representations (such as constants) minimize the prediction error, yet it is obviously undesirable to converge to such solutions. Our central insight is that careful design of the optimization dynamics is critical to learning meaningful representations. We identify that faster-paced optimization of the predictor and semi-gradient updates on the representation are crucial to preventing representation collapse. Then, in an idealized setup, we show that self-predictive learning dynamics carry out a spectral decomposition of the state transition matrix, effectively capturing information about the transition dynamics. Building on these theoretical insights, we propose bidirectional self-predictive learning, a novel self-predictive algorithm that learns two representations simultaneously. We examine the robustness of our theoretical insights with a number of small-scale experiments and showcase the promise of the novel representation learning algorithm with large-scale experiments." @default.
- W4310921838 created "2022-12-21" @default.
- W4310921838 creator A5002692470 @default.
- W4310921838 creator A5006533777 @default.
- W4310921838 creator A5018745259 @default.
- W4310921838 creator A5019956577 @default.
- W4310921838 creator A5033033832 @default.
- W4310921838 creator A5037981481 @default.
- W4310921838 creator A5038039948 @default.
- W4310921838 creator A5041609088 @default.
- W4310921838 creator A5043355670 @default.
- W4310921838 creator A5047099585 @default.
- W4310921838 creator A5051968502 @default.
- W4310921838 creator A5053855452 @default.
- W4310921838 creator A5070500506 @default.
- W4310921838 creator A5073570605 @default.
- W4310921838 creator A5089486474 @default.
- W4310921838 creator A5089902230 @default.
- W4310921838 date "2022-12-06" @default.
- W4310921838 modified "2023-10-16" @default.
- W4310921838 title "Understanding Self-Predictive Learning for Reinforcement Learning" @default.
- W4310921838 doi "https://doi.org/10.48550/arxiv.2212.03319" @default.
- W4310921838 hasPublicationYear "2022" @default.
- W4310921838 type Work @default.
- W4310921838 citedByCount "0" @default.
- W4310921838 crossrefType "posted-content" @default.
- W4310921838 hasAuthorship W4310921838A5002692470 @default.
- W4310921838 hasAuthorship W4310921838A5006533777 @default.
- W4310921838 hasAuthorship W4310921838A5018745259 @default.
- W4310921838 hasAuthorship W4310921838A5019956577 @default.
- W4310921838 hasAuthorship W4310921838A5033033832 @default.
- W4310921838 hasAuthorship W4310921838A5037981481 @default.
- W4310921838 hasAuthorship W4310921838A5038039948 @default.
- W4310921838 hasAuthorship W4310921838A5041609088 @default.
- W4310921838 hasAuthorship W4310921838A5043355670 @default.
- W4310921838 hasAuthorship W4310921838A5047099585 @default.
- W4310921838 hasAuthorship W4310921838A5051968502 @default.
- W4310921838 hasAuthorship W4310921838A5053855452 @default.
- W4310921838 hasAuthorship W4310921838A5070500506 @default.
- W4310921838 hasAuthorship W4310921838A5073570605 @default.
- W4310921838 hasAuthorship W4310921838A5089486474 @default.
- W4310921838 hasAuthorship W4310921838A5089902230 @default.
- W4310921838 hasBestOaLocation W43109218381 @default.
- W4310921838 hasConcept C104317684 @default.
- W4310921838 hasConcept C111472728 @default.
- W4310921838 hasConcept C119857082 @default.
- W4310921838 hasConcept C138885662 @default.
- W4310921838 hasConcept C154945302 @default.
- W4310921838 hasConcept C17744445 @default.
- W4310921838 hasConcept C185592680 @default.
- W4310921838 hasConcept C196340769 @default.
- W4310921838 hasConcept C199539241 @default.
- W4310921838 hasConcept C2776359362 @default.
- W4310921838 hasConcept C2778136018 @default.
- W4310921838 hasConcept C41008148 @default.
- W4310921838 hasConcept C55493867 @default.
- W4310921838 hasConcept C63479239 @default.
- W4310921838 hasConcept C94625758 @default.
- W4310921838 hasConcept C97541855 @default.
- W4310921838 hasConceptScore W4310921838C104317684 @default.
- W4310921838 hasConceptScore W4310921838C111472728 @default.
- W4310921838 hasConceptScore W4310921838C119857082 @default.
- W4310921838 hasConceptScore W4310921838C138885662 @default.
- W4310921838 hasConceptScore W4310921838C154945302 @default.
- W4310921838 hasConceptScore W4310921838C17744445 @default.
- W4310921838 hasConceptScore W4310921838C185592680 @default.
- W4310921838 hasConceptScore W4310921838C196340769 @default.
- W4310921838 hasConceptScore W4310921838C199539241 @default.
- W4310921838 hasConceptScore W4310921838C2776359362 @default.
- W4310921838 hasConceptScore W4310921838C2778136018 @default.
- W4310921838 hasConceptScore W4310921838C41008148 @default.
- W4310921838 hasConceptScore W4310921838C55493867 @default.
- W4310921838 hasConceptScore W4310921838C63479239 @default.
- W4310921838 hasConceptScore W4310921838C94625758 @default.
- W4310921838 hasConceptScore W4310921838C97541855 @default.
- W4310921838 hasLocation W43109218381 @default.
- W4310921838 hasLocation W43109218382 @default.
- W4310921838 hasOpenAccess W4310921838 @default.
- W4310921838 hasPrimaryLocation W43109218381 @default.
- W4310921838 hasRelatedWork W1504584981 @default.
- W4310921838 hasRelatedWork W1914583973 @default.
- W4310921838 hasRelatedWork W2130711276 @default.
- W4310921838 hasRelatedWork W2145363145 @default.
- W4310921838 hasRelatedWork W2154399718 @default.
- W4310921838 hasRelatedWork W2341346307 @default.
- W4310921838 hasRelatedWork W3038962357 @default.
- W4310921838 hasRelatedWork W3088331655 @default.
- W4310921838 hasRelatedWork W4308828368 @default.
- W4310921838 hasRelatedWork W4321463377 @default.
- W4310921838 isParatext "false" @default.
- W4310921838 isRetracted "false" @default.
- W4310921838 workType "article" @default.
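
The pattern in the first line, `<https://semopenalex.org/work/W4310921838> ?p ?o ?g`, asks for every predicate and object attached to this work. A minimal sketch of how such a lookup could be phrased as a standard SPARQL query follows. It rests on several assumptions that the listing does not confirm: the endpoint URL is a guess at the public service, the `@default` suffix is read as "default graph" (so the `?g` variable is dropped), and the function names are hypothetical. The code only builds the query and the request URL; it does not contact any server.

```python
from urllib.parse import urlencode

# Assumption: address of a public SPARQL endpoint; not stated in the listing.
ASSUMED_ENDPOINT = "https://semopenalex.org/sparql"

# Subject IRI taken from the first line of the listing.
WORK_IRI = "https://semopenalex.org/work/W4310921838"


def build_query(subject_iri: str, limit: int = 100) -> str:
    """Return a SPARQL SELECT for every predicate/object of one subject.

    The default graph is queried, on the assumption that '@default'
    in the listing refers to it.
    """
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(ch in subject_iri for ch in '<>" {}|\\^`'):
        raise ValueError(f"not a safe IRI: {subject_iri!r}")
    return (
        "SELECT ?p ?o WHERE { "
        f"<{subject_iri}> ?p ?o . "
        "} "
        f"LIMIT {int(limit)}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET URL, as the SPARQL 1.1 Protocol allows."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    query = build_query(WORK_IRI)
    print(query)
    print(build_request_url(ASSUMED_ENDPOINT, query))
```

The default limit of 100 mirrors the "100 items per page" shown above; a client would typically send the resulting URL with an `Accept: application/sparql-results+json` header to receive the bindings as JSON.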