Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287274300> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W4287274300 abstract "Offline reinforcement learning proposes to learn policies from large collected datasets without interacting with the physical environment. These algorithms have made it possible to learn useful skills from data that can then be deployed in the environment in real-world settings where interactions may be costly or dangerous, such as autonomous driving or factories. However, current algorithms overfit to the dataset they are trained on and exhibit poor out-of-distribution generalization to the environment when deployed. In this paper, we study the effectiveness of performing data augmentations on the state space, and study 7 different augmentation schemes and how they behave with existing offline RL algorithms. We then combine the best data performing augmentation scheme with a state-of-the-art Q-learning technique, and improve the function approximation of the Q-networks by smoothening out the learned state-action space. We experimentally show that using this Surprisingly Simple Self-Supervision technique in RL (S4RL), we significantly improve over the current state-of-the-art algorithms on offline robot learning environments such as MetaWorld [1] and RoboSuite [2,3], and benchmark datasets such as D4RL [4]." @default.
- W4287274300 created "2022-07-25" @default.
- W4287274300 creator A5026180478 @default.
- W4287274300 creator A5036366964 @default.
- W4287274300 creator A5061193324 @default.
- W4287274300 date "2021-03-10" @default.
- W4287274300 modified "2023-09-30" @default.
- W4287274300 title "S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning" @default.
- W4287274300 doi "https://doi.org/10.48550/arxiv.2103.06326" @default.
- W4287274300 hasPublicationYear "2021" @default.
- W4287274300 type Work @default.
- W4287274300 citedByCount "0" @default.
- W4287274300 crossrefType "posted-content" @default.
- W4287274300 hasAuthorship W4287274300A5026180478 @default.
- W4287274300 hasAuthorship W4287274300A5036366964 @default.
- W4287274300 hasAuthorship W4287274300A5061193324 @default.
- W4287274300 hasBestOaLocation W42872743001 @default.
- W4287274300 hasConcept C105795698 @default.
- W4287274300 hasConcept C111472728 @default.
- W4287274300 hasConcept C111919701 @default.
- W4287274300 hasConcept C11413529 @default.
- W4287274300 hasConcept C119857082 @default.
- W4287274300 hasConcept C13280743 @default.
- W4287274300 hasConcept C134306372 @default.
- W4287274300 hasConcept C138885662 @default.
- W4287274300 hasConcept C14036430 @default.
- W4287274300 hasConcept C154945302 @default.
- W4287274300 hasConcept C177148314 @default.
- W4287274300 hasConcept C185798385 @default.
- W4287274300 hasConcept C188116033 @default.
- W4287274300 hasConcept C205649164 @default.
- W4287274300 hasConcept C22019652 @default.
- W4287274300 hasConcept C2778572836 @default.
- W4287274300 hasConcept C2780586882 @default.
- W4287274300 hasConcept C33923547 @default.
- W4287274300 hasConcept C41008148 @default.
- W4287274300 hasConcept C48103436 @default.
- W4287274300 hasConcept C50644808 @default.
- W4287274300 hasConcept C72434380 @default.
- W4287274300 hasConcept C77618280 @default.
- W4287274300 hasConcept C78458016 @default.
- W4287274300 hasConcept C86803240 @default.
- W4287274300 hasConcept C90509273 @default.
- W4287274300 hasConcept C97541855 @default.
- W4287274300 hasConceptScore W4287274300C105795698 @default.
- W4287274300 hasConceptScore W4287274300C111472728 @default.
- W4287274300 hasConceptScore W4287274300C111919701 @default.
- W4287274300 hasConceptScore W4287274300C11413529 @default.
- W4287274300 hasConceptScore W4287274300C119857082 @default.
- W4287274300 hasConceptScore W4287274300C13280743 @default.
- W4287274300 hasConceptScore W4287274300C134306372 @default.
- W4287274300 hasConceptScore W4287274300C138885662 @default.
- W4287274300 hasConceptScore W4287274300C14036430 @default.
- W4287274300 hasConceptScore W4287274300C154945302 @default.
- W4287274300 hasConceptScore W4287274300C177148314 @default.
- W4287274300 hasConceptScore W4287274300C185798385 @default.
- W4287274300 hasConceptScore W4287274300C188116033 @default.
- W4287274300 hasConceptScore W4287274300C205649164 @default.
- W4287274300 hasConceptScore W4287274300C22019652 @default.
- W4287274300 hasConceptScore W4287274300C2778572836 @default.
- W4287274300 hasConceptScore W4287274300C2780586882 @default.
- W4287274300 hasConceptScore W4287274300C33923547 @default.
- W4287274300 hasConceptScore W4287274300C41008148 @default.
- W4287274300 hasConceptScore W4287274300C48103436 @default.
- W4287274300 hasConceptScore W4287274300C50644808 @default.
- W4287274300 hasConceptScore W4287274300C72434380 @default.
- W4287274300 hasConceptScore W4287274300C77618280 @default.
- W4287274300 hasConceptScore W4287274300C78458016 @default.
- W4287274300 hasConceptScore W4287274300C86803240 @default.
- W4287274300 hasConceptScore W4287274300C90509273 @default.
- W4287274300 hasConceptScore W4287274300C97541855 @default.
- W4287274300 hasLocation W42872743001 @default.
- W4287274300 hasOpenAccess W4287274300 @default.
- W4287274300 hasPrimaryLocation W42872743001 @default.
- W4287274300 hasRelatedWork W2350784623 @default.
- W4287274300 hasRelatedWork W2936107532 @default.
- W4287274300 hasRelatedWork W2989932438 @default.
- W4287274300 hasRelatedWork W3035417664 @default.
- W4287274300 hasRelatedWork W3049166411 @default.
- W4287274300 hasRelatedWork W3103643887 @default.
- W4287274300 hasRelatedWork W3147214434 @default.
- W4287274300 hasRelatedWork W3196472998 @default.
- W4287274300 hasRelatedWork W4287688416 @default.
- W4287274300 hasRelatedWork W4327778759 @default.
- W4287274300 isParatext "false" @default.
- W4287274300 isRetracted "false" @default.
- W4287274300 workType "article" @default.