Matches in SemOpenAlex for { <https://semopenalex.org/work/W3092156675> ?p ?o ?g. }
Showing items 1 to 96 of 96 with 100 items per page.
- W3092156675 abstract "As reinforcement learning agents become increasingly integrated into complex, real-world environments, designing for safety becomes a critical consideration. We specifically focus on researching scenarios where agents can cause undesired side effects while executing a policy on a primary task. Since one can define multiple tasks for a given environment dynamics, there are two important challenges. First, we need to abstract the concept of safety that applies broadly to that environment independent of the specific task being executed. Second, we need a mechanism for the abstracted notion of safety to modulate the actions of agents executing different policies to minimize their side-effects. In this work, we propose Safety Aware Reinforcement Learning (SARL) - a framework where a virtual safe agent modulates the actions of a main reward-based agent to minimize side effects. The safe agent learns a task-independent notion of safety for a given environment. The main agent is then trained with a regularization loss given by the distance between the native action probabilities of the two agents. Since the safe agent effectively abstracts a task-independent notion of safety via its action probabilities, it can be ported to modulate multiple policies solving different tasks within the given environment without further training. We contrast this with solutions that rely on task-specific regularization metrics and test our framework on the SafeLife Suite, based on Conway's Game of Life, comprising a number of complex tasks in dynamic environments. We show that our solution is able to match the performance of solutions that rely on task-specific side-effect penalties on both the primary and safety objectives while additionally providing the benefit of generalizability and portability." @default.
- W3092156675 created "2020-10-15" @default.
- W3092156675 creator A5013678286 @default.
- W3092156675 creator A5022076965 @default.
- W3092156675 creator A5070857594 @default.
- W3092156675 date "2021-05-04" @default.
- W3092156675 modified "2023-09-27" @default.
- W3092156675 title "Safety Aware Reinforcement Learning (SARL)" @default.
- W3092156675 cites W1757796397 @default.
- W3092156675 cites W2158131535 @default.
- W3092156675 cites W2174424190 @default.
- W3092156675 cites W2462906003 @default.
- W3092156675 cites W2575705757 @default.
- W3092156675 cites W2736601468 @default.
- W3092156675 cites W2772709170 @default.
- W3092156675 cites W2790306973 @default.
- W3092156675 cites W2890538051 @default.
- W3092156675 cites W2921821063 @default.
- W3092156675 cites W2963800509 @default.
- W3092156675 cites W2990239702 @default.
- W3092156675 cites W3003293390 @default.
- W3092156675 cites W3004082694 @default.
- W3092156675 cites W3008082783 @default.
- W3092156675 cites W3009543490 @default.
- W3092156675 cites W3013405763 @default.
- W3092156675 cites W3018195242 @default.
- W3092156675 cites W3034757316 @default.
- W3092156675 cites W3035015331 @default.
- W3092156675 cites W3035611392 @default.
- W3092156675 cites W3038910469 @default.
- W3092156675 cites W3216843687 @default.
- W3092156675 hasPublicationYear "2021" @default.
- W3092156675 type Work @default.
- W3092156675 sameAs 3092156675 @default.
- W3092156675 citedByCount "0" @default.
- W3092156675 crossrefType "journal-article" @default.
- W3092156675 hasAuthorship W3092156675A5013678286 @default.
- W3092156675 hasAuthorship W3092156675A5022076965 @default.
- W3092156675 hasAuthorship W3092156675A5070857594 @default.
- W3092156675 hasConcept C106251023 @default.
- W3092156675 hasConcept C107457646 @default.
- W3092156675 hasConcept C119857082 @default.
- W3092156675 hasConcept C121332964 @default.
- W3092156675 hasConcept C127413603 @default.
- W3092156675 hasConcept C154945302 @default.
- W3092156675 hasConcept C199360897 @default.
- W3092156675 hasConcept C201995342 @default.
- W3092156675 hasConcept C2776135515 @default.
- W3092156675 hasConcept C2777904410 @default.
- W3092156675 hasConcept C2780451532 @default.
- W3092156675 hasConcept C2780791683 @default.
- W3092156675 hasConcept C41008148 @default.
- W3092156675 hasConcept C62520636 @default.
- W3092156675 hasConcept C97541855 @default.
- W3092156675 hasConceptScore W3092156675C106251023 @default.
- W3092156675 hasConceptScore W3092156675C107457646 @default.
- W3092156675 hasConceptScore W3092156675C119857082 @default.
- W3092156675 hasConceptScore W3092156675C121332964 @default.
- W3092156675 hasConceptScore W3092156675C127413603 @default.
- W3092156675 hasConceptScore W3092156675C154945302 @default.
- W3092156675 hasConceptScore W3092156675C199360897 @default.
- W3092156675 hasConceptScore W3092156675C201995342 @default.
- W3092156675 hasConceptScore W3092156675C2776135515 @default.
- W3092156675 hasConceptScore W3092156675C2777904410 @default.
- W3092156675 hasConceptScore W3092156675C2780451532 @default.
- W3092156675 hasConceptScore W3092156675C2780791683 @default.
- W3092156675 hasConceptScore W3092156675C41008148 @default.
- W3092156675 hasConceptScore W3092156675C62520636 @default.
- W3092156675 hasConceptScore W3092156675C97541855 @default.
- W3092156675 hasLocation W30921566751 @default.
- W3092156675 hasOpenAccess W3092156675 @default.
- W3092156675 hasPrimaryLocation W30921566751 @default.
- W3092156675 hasRelatedWork W2091713355 @default.
- W3092156675 hasRelatedWork W2267132222 @default.
- W3092156675 hasRelatedWork W2471081794 @default.
- W3092156675 hasRelatedWork W2519695595 @default.
- W3092156675 hasRelatedWork W2533925791 @default.
- W3092156675 hasRelatedWork W2768908787 @default.
- W3092156675 hasRelatedWork W2894141523 @default.
- W3092156675 hasRelatedWork W2904106049 @default.
- W3092156675 hasRelatedWork W2919603045 @default.
- W3092156675 hasRelatedWork W2951896791 @default.
- W3092156675 hasRelatedWork W2953473809 @default.
- W3092156675 hasRelatedWork W2979363950 @default.
- W3092156675 hasRelatedWork W2990330051 @default.
- W3092156675 hasRelatedWork W3085513642 @default.
- W3092156675 hasRelatedWork W3104013016 @default.
- W3092156675 hasRelatedWork W3210140370 @default.
- W3092156675 hasRelatedWork W3210581647 @default.
- W3092156675 hasRelatedWork W3212787291 @default.
- W3092156675 hasRelatedWork W1985548787 @default.
- W3092156675 hasRelatedWork W2187117074 @default.
- W3092156675 isParatext "false" @default.
- W3092156675 isRetracted "false" @default.
- W3092156675 magId "3092156675" @default.
- W3092156675 workType "article" @default.
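
The abstract recorded above describes the main agent as trained with "a regularization loss given by the distance between the native action probabilities of the two agents." A minimal sketch of that idea follows. The record does not say which distance is used, so the Jensen-Shannon divergence here is an assumption, and the names `js_divergence`, `regularized_loss`, and the weight `beta` are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw action logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (assumed distance; not confirmed by the record)."""
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def regularized_loss(task_loss, main_probs, safe_probs, beta=0.5):
    """Task loss plus a penalty for deviating from the safe agent's action probabilities.

    `beta` is a hypothetical weighting coefficient.
    """
    return task_loss + beta * js_divergence(main_probs, safe_probs)


if __name__ == "__main__":
    main_probs = softmax([2.0, 0.5, 0.1])  # main (reward-based) agent
    safe_probs = softmax([0.2, 1.5, 0.3])  # virtual safe agent
    print(regularized_loss(1.0, main_probs, safe_probs))
```

Under this sketch, the penalty is zero when the two agents agree on their action probabilities and grows as the main agent's policy drifts from the safe agent's. Because the safe agent's probabilities enter only as an input, the same safe agent could in principle be reused across tasks, which is the portability the abstract claims.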