Matches in SemOpenAlex for { <https://semopenalex.org/work/W3118162816> ?p ?o ?g. }
- W3118162816 abstract "Safety is essential for reinforcement learning (RL) applied in real-world situations. Chance constraints are suitable for representing safety requirements in stochastic systems. Previous chance-constrained RL methods usually have a low convergence rate or learn only a conservative policy. In this paper, we propose a model-based chance-constrained actor-critic (CCAC) algorithm that can efficiently learn a safe and non-conservative policy. Unlike existing methods that optimize a conservative lower bound, CCAC directly solves the original chance-constrained problem, in which the objective function and the safe probability are simultaneously optimized with adaptive weights. To improve the convergence rate, CCAC uses the gradient of the dynamic model to accelerate policy optimization. The effectiveness of CCAC is demonstrated on a stochastic car-following task. Experiments indicate that, compared with previous RL methods, CCAC improves performance while guaranteeing safety, with a five times faster convergence rate. It also has 100 times higher online computation efficiency than traditional safety techniques such as stochastic model predictive control." @default.
- W3118162816 created "2021-01-05" @default.
- W3118162816 creator A5005936929 @default.
- W3118162816 creator A5007610450 @default.
- W3118162816 creator A5053822288 @default.
- W3118162816 creator A5072263969 @default.
- W3118162816 creator A5072839302 @default.
- W3118162816 creator A5082794136 @default.
- W3118162816 date "2020-12-19" @default.
- W3118162816 modified "2023-09-23" @default.
- W3118162816 title "Model-Based Actor-Critic with Chance Constraint for Stochastic System" @default.
- W3118162816 cites W1144593952 @default.
- W3118162816 cites W1526449679 @default.
- W3118162816 cites W1845972764 @default.
- W3118162816 cites W1978956894 @default.
- W3118162816 cites W2020426342 @default.
- W3118162816 cites W2123871098 @default.
- W3118162816 cites W2131116400 @default.
- W3118162816 cites W2140135625 @default.
- W3118162816 cites W2145339207 @default.
- W3118162816 cites W2296319761 @default.
- W3118162816 cites W2557055507 @default.
- W3118162816 cites W2570494446 @default.
- W3118162816 cites W2765395770 @default.
- W3118162816 cites W2766447205 @default.
- W3118162816 cites W2784465508 @default.
- W3118162816 cites W2913300629 @default.
- W3118162816 cites W2962803570 @default.
- W3118162816 cites W2963082979 @default.
- W3118162816 cites W2964222567 @default.
- W3118162816 cites W2982316857 @default.
- W3118162816 cites W2991391803 @default.
- W3118162816 cites W2994786136 @default.
- W3118162816 cites W3002044607 @default.
- W3118162816 cites W3011338904 @default.
- W3118162816 cites W3092461179 @default.
- W3118162816 doi "https://doi.org/10.48550/arxiv.2012.10716" @default.
- W3118162816 hasPublicationYear "2020" @default.
- W3118162816 type Work @default.
- W3118162816 sameAs 3118162816 @default.
- W3118162816 citedByCount "0" @default.
- W3118162816 crossrefType "posted-content" @default.
- W3118162816 hasAuthorship W3118162816A5005936929 @default.
- W3118162816 hasAuthorship W3118162816A5007610450 @default.
- W3118162816 hasAuthorship W3118162816A5053822288 @default.
- W3118162816 hasAuthorship W3118162816A5072263969 @default.
- W3118162816 hasAuthorship W3118162816A5072839302 @default.
- W3118162816 hasAuthorship W3118162816A5082794136 @default.
- W3118162816 hasBestOaLocation W31181628161 @default.
- W3118162816 hasConcept C126255220 @default.
- W3118162816 hasConcept C127413603 @default.
- W3118162816 hasConcept C137631369 @default.
- W3118162816 hasConcept C14036430 @default.
- W3118162816 hasConcept C154945302 @default.
- W3118162816 hasConcept C162324750 @default.
- W3118162816 hasConcept C194387892 @default.
- W3118162816 hasConcept C201995342 @default.
- W3118162816 hasConcept C2524010 @default.
- W3118162816 hasConcept C26517878 @default.
- W3118162816 hasConcept C2776036281 @default.
- W3118162816 hasConcept C2777303404 @default.
- W3118162816 hasConcept C2780451532 @default.
- W3118162816 hasConcept C33923547 @default.
- W3118162816 hasConcept C38652104 @default.
- W3118162816 hasConcept C41008148 @default.
- W3118162816 hasConcept C50522688 @default.
- W3118162816 hasConcept C57869625 @default.
- W3118162816 hasConcept C78458016 @default.
- W3118162816 hasConcept C86803240 @default.
- W3118162816 hasConcept C97541855 @default.
- W3118162816 hasConceptScore W3118162816C126255220 @default.
- W3118162816 hasConceptScore W3118162816C127413603 @default.
- W3118162816 hasConceptScore W3118162816C137631369 @default.
- W3118162816 hasConceptScore W3118162816C14036430 @default.
- W3118162816 hasConceptScore W3118162816C154945302 @default.
- W3118162816 hasConceptScore W3118162816C162324750 @default.
- W3118162816 hasConceptScore W3118162816C194387892 @default.
- W3118162816 hasConceptScore W3118162816C201995342 @default.
- W3118162816 hasConceptScore W3118162816C2524010 @default.
- W3118162816 hasConceptScore W3118162816C26517878 @default.
- W3118162816 hasConceptScore W3118162816C2776036281 @default.
- W3118162816 hasConceptScore W3118162816C2777303404 @default.
- W3118162816 hasConceptScore W3118162816C2780451532 @default.
- W3118162816 hasConceptScore W3118162816C33923547 @default.
- W3118162816 hasConceptScore W3118162816C38652104 @default.
- W3118162816 hasConceptScore W3118162816C41008148 @default.
- W3118162816 hasConceptScore W3118162816C50522688 @default.
- W3118162816 hasConceptScore W3118162816C57869625 @default.
- W3118162816 hasConceptScore W3118162816C78458016 @default.
- W3118162816 hasConceptScore W3118162816C86803240 @default.
- W3118162816 hasConceptScore W3118162816C97541855 @default.
- W3118162816 hasLocation W31181628161 @default.
- W3118162816 hasOpenAccess W3118162816 @default.
- W3118162816 hasPrimaryLocation W31181628161 @default.
- W3118162816 hasRelatedWork W2013056473 @default.
- W3118162816 hasRelatedWork W2128422927 @default.
- W3118162816 hasRelatedWork W2137882283 @default.
- W3118162816 hasRelatedWork W2160374150 @default.
- W3118162816 hasRelatedWork W2183733024 @default.
- W3118162816 hasRelatedWork W2359791990 @default.