Matches in SemOpenAlex for { <https://semopenalex.org/work/W4323651330> ?p ?o ?g. }
Showing items 1 to 91 of 91 with 100 items per page.
- W4323651330 abstract "An emerging field of sequential decision problems is safe Reinforcement Learning (RL), where the objective is to maximize the reward while obeying safety constraints. Being able to handle constraints is essential for deploying RL agents in real-world environments, where constraint violations can harm the agent and the environment. To this end, we propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic. The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns. By splitting responsibilities, we facilitate the learning task leading to increased sample efficiency. We integrate our approach into two popular RL algorithms, Proximal Policy Optimization and Soft Actor-Critic, and evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations. Finally, we make the zero-shot sim-to-real transfer where a differential drive robot has to navigate through a cluttered room. Our code can be found at https://github.com/nikeke19/Safe-Mult-RL." @default.
- W4323651330 created "2023-03-10" @default.
- W4323651330 creator A5001254143 @default.
- W4323651330 creator A5018031919 @default.
- W4323651330 creator A5048742998 @default.
- W4323651330 creator A5067487326 @default.
- W4323651330 creator A5079281987 @default.
- W4323651330 date "2023-03-07" @default.
- W4323651330 modified "2023-10-17" @default.
- W4323651330 title "A Multiplicative Value Function for Safe and Efficient Reinforcement Learning" @default.
- W4323651330 doi "https://doi.org/10.48550/arxiv.2303.04118" @default.
- W4323651330 hasPublicationYear "2023" @default.
- W4323651330 type Work @default.
- W4323651330 citedByCount "0" @default.
- W4323651330 crossrefType "posted-content" @default.
- W4323651330 hasAuthorship W4323651330A5001254143 @default.
- W4323651330 hasAuthorship W4323651330A5018031919 @default.
- W4323651330 hasAuthorship W4323651330A5048742998 @default.
- W4323651330 hasAuthorship W4323651330A5067487326 @default.
- W4323651330 hasAuthorship W4323651330A5079281987 @default.
- W4323651330 hasBestOaLocation W43236513301 @default.
- W4323651330 hasConcept C119857082 @default.
- W4323651330 hasConcept C126255220 @default.
- W4323651330 hasConcept C127413603 @default.
- W4323651330 hasConcept C134306372 @default.
- W4323651330 hasConcept C14036430 @default.
- W4323651330 hasConcept C14646407 @default.
- W4323651330 hasConcept C154945302 @default.
- W4323651330 hasConcept C177264268 @default.
- W4323651330 hasConcept C17744445 @default.
- W4323651330 hasConcept C199360897 @default.
- W4323651330 hasConcept C199539241 @default.
- W4323651330 hasConcept C201995342 @default.
- W4323651330 hasConcept C202444582 @default.
- W4323651330 hasConcept C2524010 @default.
- W4323651330 hasConcept C2776036281 @default.
- W4323651330 hasConcept C2776291640 @default.
- W4323651330 hasConcept C2776760102 @default.
- W4323651330 hasConcept C2777363581 @default.
- W4323651330 hasConcept C2780451532 @default.
- W4323651330 hasConcept C33923547 @default.
- W4323651330 hasConcept C41008148 @default.
- W4323651330 hasConcept C42747912 @default.
- W4323651330 hasConcept C78458016 @default.
- W4323651330 hasConcept C86803240 @default.
- W4323651330 hasConcept C90509273 @default.
- W4323651330 hasConcept C9652623 @default.
- W4323651330 hasConcept C97541855 @default.
- W4323651330 hasConceptScore W4323651330C119857082 @default.
- W4323651330 hasConceptScore W4323651330C126255220 @default.
- W4323651330 hasConceptScore W4323651330C127413603 @default.
- W4323651330 hasConceptScore W4323651330C134306372 @default.
- W4323651330 hasConceptScore W4323651330C14036430 @default.
- W4323651330 hasConceptScore W4323651330C14646407 @default.
- W4323651330 hasConceptScore W4323651330C154945302 @default.
- W4323651330 hasConceptScore W4323651330C177264268 @default.
- W4323651330 hasConceptScore W4323651330C17744445 @default.
- W4323651330 hasConceptScore W4323651330C199360897 @default.
- W4323651330 hasConceptScore W4323651330C199539241 @default.
- W4323651330 hasConceptScore W4323651330C201995342 @default.
- W4323651330 hasConceptScore W4323651330C202444582 @default.
- W4323651330 hasConceptScore W4323651330C2524010 @default.
- W4323651330 hasConceptScore W4323651330C2776036281 @default.
- W4323651330 hasConceptScore W4323651330C2776291640 @default.
- W4323651330 hasConceptScore W4323651330C2776760102 @default.
- W4323651330 hasConceptScore W4323651330C2777363581 @default.
- W4323651330 hasConceptScore W4323651330C2780451532 @default.
- W4323651330 hasConceptScore W4323651330C33923547 @default.
- W4323651330 hasConceptScore W4323651330C41008148 @default.
- W4323651330 hasConceptScore W4323651330C42747912 @default.
- W4323651330 hasConceptScore W4323651330C78458016 @default.
- W4323651330 hasConceptScore W4323651330C86803240 @default.
- W4323651330 hasConceptScore W4323651330C90509273 @default.
- W4323651330 hasConceptScore W4323651330C9652623 @default.
- W4323651330 hasConceptScore W4323651330C97541855 @default.
- W4323651330 hasLocation W43236513301 @default.
- W4323651330 hasOpenAccess W4323651330 @default.
- W4323651330 hasPrimaryLocation W43236513301 @default.
- W4323651330 hasRelatedWork W2152087638 @default.
- W4323651330 hasRelatedWork W2155027007 @default.
- W4323651330 hasRelatedWork W2734912394 @default.
- W4323651330 hasRelatedWork W2918392679 @default.
- W4323651330 hasRelatedWork W2950892788 @default.
- W4323651330 hasRelatedWork W3096918561 @default.
- W4323651330 hasRelatedWork W3188220908 @default.
- W4323651330 hasRelatedWork W4281556647 @default.
- W4323651330 hasRelatedWork W4287550122 @default.
- W4323651330 hasRelatedWork W4294862481 @default.
- W4323651330 isParatext "false" @default.
- W4323651330 isRetracted "false" @default.
- W4323651330 workType "article" @default.
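
The abstract above says the safety critic predicts the probability of constraint violation and "discounts" a reward critic that estimates constraint-free returns. The listing does not give the formula, so the following is only a minimal sketch of one plausible reading, in which the reward estimate is scaled by the estimated probability of staying safe. The function name `multiplicative_value` and the `(1 - p)` scaling are assumptions made for illustration and are not taken from the paper or its repository.

```python
def multiplicative_value(reward_value: float, violation_prob: float) -> float:
    """Combine two critic outputs into one value estimate.

    ASSUMPTION: the combined value is (1 - violation_prob) * reward_value.
    This is an illustrative reading of the abstract, not the authors' code.

    reward_value   -- reward critic output (estimated constraint-free return)
    violation_prob -- safety critic output (estimated probability of violation)
    """
    if not 0.0 <= violation_prob <= 1.0:
        raise ValueError("violation_prob must lie in [0, 1]")
    # A certain violation (p = 1) zeroes the value; a certainly safe
    # state (p = 0) leaves the reward estimate unchanged.
    return (1.0 - violation_prob) * reward_value


if __name__ == "__main__":
    # Hypothetical critic outputs for a single state.
    print(multiplicative_value(10.0, 0.25))  # 7.5
```

Under this reading, a state with a high estimated return but a high violation probability scores lower than a moderately rewarding safe state, which is the trade-off the abstract describes.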