Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385764501> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4385764501 abstract "Safe Reinforcement learning (Safe RL) aims at learning optimal policies while staying safe. A popular solution to Safe RL is shielding, which uses a logical safety specification to prevent an RL agent from taking unsafe actions. However, traditional shielding techniques are difficult to integrate with continuous, end-to-end deep RL methods. To this end, we introduce Probabilistic Logic Policy Gradient (PLPG). PLPG is a model-based Safe RL technique that uses probabilistic logic programming to model logical safety constraints as differentiable functions. Therefore, PLPG can be seamlessly applied to any policy gradient algorithm while still providing the same convergence guarantees. In our experiments, we show that PLPG learns safer and more rewarding policies compared to other state-of-the-art shielding techniques." @default.
- W4385764501 created "2023-08-12" @default.
- W4385764501 creator A5005466305 @default.
- W4385764501 creator A5027783695 @default.
- W4385764501 creator A5078111090 @default.
- W4385764501 creator A5078136609 @default.
- W4385764501 date "2023-08-01" @default.
- W4385764501 modified "2023-10-02" @default.
- W4385764501 title "Safe Reinforcement Learning via Probabilistic Logic Shields" @default.
- W4385764501 doi "https://doi.org/10.24963/ijcai.2023/637" @default.
- W4385764501 hasPublicationYear "2023" @default.
- W4385764501 type Work @default.
- W4385764501 citedByCount "1" @default.
- W4385764501 countsByYear W43857645012023 @default.
- W4385764501 crossrefType "proceedings-article" @default.
- W4385764501 hasAuthorship W4385764501A5005466305 @default.
- W4385764501 hasAuthorship W4385764501A5027783695 @default.
- W4385764501 hasAuthorship W4385764501A5078111090 @default.
- W4385764501 hasAuthorship W4385764501A5078136609 @default.
- W4385764501 hasBestOaLocation W43857645011 @default.
- W4385764501 hasConcept C119599485 @default.
- W4385764501 hasConcept C127413603 @default.
- W4385764501 hasConcept C134306372 @default.
- W4385764501 hasConcept C154945302 @default.
- W4385764501 hasConcept C162324750 @default.
- W4385764501 hasConcept C202615002 @default.
- W4385764501 hasConcept C2265751 @default.
- W4385764501 hasConcept C24404364 @default.
- W4385764501 hasConcept C2776654903 @default.
- W4385764501 hasConcept C2777303404 @default.
- W4385764501 hasConcept C33923547 @default.
- W4385764501 hasConcept C38652104 @default.
- W4385764501 hasConcept C41008148 @default.
- W4385764501 hasConcept C49937458 @default.
- W4385764501 hasConcept C50522688 @default.
- W4385764501 hasConcept C52063229 @default.
- W4385764501 hasConcept C97541855 @default.
- W4385764501 hasConceptScore W4385764501C119599485 @default.
- W4385764501 hasConceptScore W4385764501C127413603 @default.
- W4385764501 hasConceptScore W4385764501C134306372 @default.
- W4385764501 hasConceptScore W4385764501C154945302 @default.
- W4385764501 hasConceptScore W4385764501C162324750 @default.
- W4385764501 hasConceptScore W4385764501C202615002 @default.
- W4385764501 hasConceptScore W4385764501C2265751 @default.
- W4385764501 hasConceptScore W4385764501C24404364 @default.
- W4385764501 hasConceptScore W4385764501C2776654903 @default.
- W4385764501 hasConceptScore W4385764501C2777303404 @default.
- W4385764501 hasConceptScore W4385764501C33923547 @default.
- W4385764501 hasConceptScore W4385764501C38652104 @default.
- W4385764501 hasConceptScore W4385764501C41008148 @default.
- W4385764501 hasConceptScore W4385764501C49937458 @default.
- W4385764501 hasConceptScore W4385764501C50522688 @default.
- W4385764501 hasConceptScore W4385764501C52063229 @default.
- W4385764501 hasConceptScore W4385764501C97541855 @default.
- W4385764501 hasLocation W43857645011 @default.
- W4385764501 hasLocation W43857645012 @default.
- W4385764501 hasOpenAccess W4385764501 @default.
- W4385764501 hasPrimaryLocation W43857645011 @default.
- W4385764501 hasRelatedWork W1497573972 @default.
- W4385764501 hasRelatedWork W3074294383 @default.
- W4385764501 hasRelatedWork W3119613234 @default.
- W4385764501 hasRelatedWork W3126245186 @default.
- W4385764501 hasRelatedWork W3126673337 @default.
- W4385764501 hasRelatedWork W3136325136 @default.
- W4385764501 hasRelatedWork W4200523043 @default.
- W4385764501 hasRelatedWork W4221161273 @default.
- W4385764501 hasRelatedWork W4283793632 @default.
- W4385764501 hasRelatedWork W4295302007 @default.
- W4385764501 isParatext "false" @default.
- W4385764501 isRetracted "false" @default.
- W4385764501 workType "article" @default.