Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387595569> ?p ?o ?g. }
Showing items 1 to 69 of 69 with 100 items per page.
- W4387595569 abstract "In model-based reinforcement learning (MBRL), most algorithms rely on simulating trajectories from one-step dynamics models learned on data. A critical challenge of this approach is the compounding of one-step prediction errors as length of the trajectory grows. In this paper we tackle this issue by using a multi-timestep objective to train one-step models. Our objective is a weighted sum of a loss function (e.g., negative log-likelihood) at various future horizons. We explore and test a range of weights profiles. We find that exponentially decaying weights lead to models that significantly improve the long-horizon R2 score. This improvement is particularly noticeable when the models were evaluated on noisy data. Finally, using a soft actor-critic (SAC) agent in pure batch reinforcement learning (RL) and iterated batch RL scenarios, we found that our multi-timestep models outperform or match standard one-step models. This was especially evident in a noisy variant of the considered environment, highlighting the potential of our approach in real-world applications." @default.
- W4387595569 created "2023-10-13" @default.
- W4387595569 creator A5000763568 @default.
- W4387595569 creator A5021162375 @default.
- W4387595569 creator A5042425097 @default.
- W4387595569 creator A5051459492 @default.
- W4387595569 creator A5093054807 @default.
- W4387595569 date "2023-10-09" @default.
- W4387595569 modified "2023-10-14" @default.
- W4387595569 title "Multi-timestep models for Model-based Reinforcement Learning" @default.
- W4387595569 doi "https://doi.org/10.48550/arxiv.2310.05672" @default.
- W4387595569 hasPublicationYear "2023" @default.
- W4387595569 type Work @default.
- W4387595569 citedByCount "0" @default.
- W4387595569 crossrefType "posted-content" @default.
- W4387595569 hasAuthorship W4387595569A5000763568 @default.
- W4387595569 hasAuthorship W4387595569A5021162375 @default.
- W4387595569 hasAuthorship W4387595569A5042425097 @default.
- W4387595569 hasAuthorship W4387595569A5051459492 @default.
- W4387595569 hasAuthorship W4387595569A5093054807 @default.
- W4387595569 hasBestOaLocation W43875955691 @default.
- W4387595569 hasConcept C119857082 @default.
- W4387595569 hasConcept C121332964 @default.
- W4387595569 hasConcept C127413603 @default.
- W4387595569 hasConcept C1276947 @default.
- W4387595569 hasConcept C134306372 @default.
- W4387595569 hasConcept C13662910 @default.
- W4387595569 hasConcept C14036430 @default.
- W4387595569 hasConcept C140479938 @default.
- W4387595569 hasConcept C146978453 @default.
- W4387595569 hasConcept C154945302 @default.
- W4387595569 hasConcept C204323151 @default.
- W4387595569 hasConcept C33923547 @default.
- W4387595569 hasConcept C41008148 @default.
- W4387595569 hasConcept C78458016 @default.
- W4387595569 hasConcept C86803240 @default.
- W4387595569 hasConcept C97541855 @default.
- W4387595569 hasConceptScore W4387595569C119857082 @default.
- W4387595569 hasConceptScore W4387595569C121332964 @default.
- W4387595569 hasConceptScore W4387595569C127413603 @default.
- W4387595569 hasConceptScore W4387595569C1276947 @default.
- W4387595569 hasConceptScore W4387595569C134306372 @default.
- W4387595569 hasConceptScore W4387595569C13662910 @default.
- W4387595569 hasConceptScore W4387595569C14036430 @default.
- W4387595569 hasConceptScore W4387595569C140479938 @default.
- W4387595569 hasConceptScore W4387595569C146978453 @default.
- W4387595569 hasConceptScore W4387595569C154945302 @default.
- W4387595569 hasConceptScore W4387595569C204323151 @default.
- W4387595569 hasConceptScore W4387595569C33923547 @default.
- W4387595569 hasConceptScore W4387595569C41008148 @default.
- W4387595569 hasConceptScore W4387595569C78458016 @default.
- W4387595569 hasConceptScore W4387595569C86803240 @default.
- W4387595569 hasConceptScore W4387595569C97541855 @default.
- W4387595569 hasLocation W43875955691 @default.
- W4387595569 hasOpenAccess W4387595569 @default.
- W4387595569 hasPrimaryLocation W43875955691 @default.
- W4387595569 hasRelatedWork W1486898455 @default.
- W4387595569 hasRelatedWork W2031695474 @default.
- W4387595569 hasRelatedWork W2138720691 @default.
- W4387595569 hasRelatedWork W2586732548 @default.
- W4387595569 hasRelatedWork W3049728571 @default.
- W4387595569 hasRelatedWork W3103325625 @default.
- W4387595569 hasRelatedWork W4306904969 @default.
- W4387595569 hasRelatedWork W4323768008 @default.
- W4387595569 hasRelatedWork W4362501864 @default.
- W4387595569 hasRelatedWork W4380318855 @default.
- W4387595569 isParatext "false" @default.
- W4387595569 isRetracted "false" @default.
- W4387595569 workType "article" @default.
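
The abstract above describes the training objective as a weighted sum of a loss at several future horizons, with exponentially decaying weights performing best. The following is a minimal sketch of that idea, not the authors' implementation. Assumptions made for illustration only: states are scalars, squared error stands in for the negative log-likelihood mentioned in the abstract, the weights are normalized to sum to 1, and the names `exponential_weights`, `multi_timestep_loss`, and `decay` are hypothetical.

```python
from typing import Callable, List, Sequence


def exponential_weights(horizon: int, decay: float) -> List[float]:
    """Weights proportional to decay**(h - 1) for h = 1..horizon.

    Normalizing to sum to 1 is an assumption of this sketch.
    """
    raw = [decay ** h for h in range(horizon)]
    total = sum(raw)
    return [w / total for w in raw]


def multi_timestep_loss(
    model: Callable[[float, float], float],
    states: Sequence[float],
    actions: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Weighted sum of per-horizon losses for a one-step model.

    model   : one-step dynamics model, (state, action) -> predicted next state
    states  : observed states s_0 .. s_H (length H + 1)
    actions : actions a_0 .. a_{H-1} (length H)
    weights : one weight per horizon h = 1 .. H

    The model is rolled out on its own predictions, so the loss at
    horizon h measures the compounded error after h steps.
    Squared error is used here in place of a negative log-likelihood.
    """
    prediction = states[0]
    total = 0.0
    for h, weight in enumerate(weights):
        # Feed the previous prediction back in, not the observed state.
        prediction = model(prediction, actions[h])
        total += weight * (prediction - states[h + 1]) ** 2
    return total


if __name__ == "__main__":
    # Toy one-step model with a constant bias of 0.1 per step.
    biased_model = lambda s, a: s + a + 0.1
    observed_states = [0.0, 1.0, 2.0, 3.0]
    taken_actions = [1.0, 1.0, 1.0]
    horizon_weights = exponential_weights(horizon=3, decay=0.5)
    print(multi_timestep_loss(biased_model, observed_states, taken_actions, horizon_weights))
```

With a single weight of 1 this reduces to the usual one-step loss; longer weight vectors penalize errors that accumulate along the rollout.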