Matches in SemOpenAlex for { <https://semopenalex.org/work/W3046384803> ?p ?o ?g. }
- W3046384803 endingPage "7674" @default.
- W3046384803 startingPage "7667" @default.
- W3046384803 abstract "Many physical systems have underlying safety considerations that require that the policy employed ensures the satisfaction of a set of constraints. The analytical formulation usually takes the form of a Constrained Markov Decision Process (CMDP). We focus on the case where the CMDP is unknown, and RL algorithms obtain samples to discover the model and compute an optimal constrained policy. Our goal is to characterize the relationship between safety constraints and the number of samples needed to ensure a desired level of accuracy, in both objective maximization and constraint satisfaction, in a PAC sense. We explore two classes of RL algorithms, namely, (i) a generative model-based approach, wherein samples are taken initially to estimate a model, and (ii) an online approach, wherein the model is updated as samples are obtained. Our main finding is that, compared to the best known bounds of the unconstrained regime, the sample complexity of constrained RL algorithms is increased by a factor that is logarithmic in the number of constraints, which suggests that the approach may be easily utilized in real systems." @default.
- W3046384803 created "2020-08-07" @default.
- W3046384803 creator A5005327552 @default.
- W3046384803 creator A5005504863 @default.
- W3046384803 creator A5053096993 @default.
- W3046384803 creator A5056952829 @default.
- W3046384803 date "2021-05-18" @default.
- W3046384803 modified "2023-10-13" @default.
- W3046384803 title "Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs" @default.
- W3046384803 cites W1061340565 @default.
- W3046384803 cites W1503109067 @default.
- W3046384803 cites W1518931405 @default.
- W3046384803 cites W1786332878 @default.
- W3046384803 cites W1965634470 @default.
- W3046384803 cites W1969276875 @default.
- W3046384803 cites W1988526405 @default.
- W3046384803 cites W2002240881 @default.
- W3046384803 cites W2070570138 @default.
- W3046384803 cites W2073314543 @default.
- W3046384803 cites W2103012681 @default.
- W3046384803 cites W2110005947 @default.
- W3046384803 cites W2119567691 @default.
- W3046384803 cites W2119852681 @default.
- W3046384803 cites W2120678009 @default.
- W3046384803 cites W2128347943 @default.
- W3046384803 cites W2788014517 @default.
- W3046384803 cites W2804791273 @default.
- W3046384803 cites W2963475649 @default.
- W3046384803 cites W2963568654 @default.
- W3046384803 cites W2963582321 @default.
- W3046384803 cites W2964299116 @default.
- W3046384803 cites W2964340170 @default.
- W3046384803 cites W2970890202 @default.
- W3046384803 cites W2982256055 @default.
- W3046384803 cites W2990389059 @default.
- W3046384803 cites W3001756029 @default.
- W3046384803 cites W3009922106 @default.
- W3046384803 cites W3049624187 @default.
- W3046384803 cites W3080734044 @default.
- W3046384803 doi "https://doi.org/10.1609/aaai.v35i9.16937" @default.
- W3046384803 hasPublicationYear "2021" @default.
- W3046384803 type Work @default.
- W3046384803 sameAs 3046384803 @default.
- W3046384803 citedByCount "3" @default.
- W3046384803 countsByYear W30463848032020 @default.
- W3046384803 countsByYear W30463848032022 @default.
- W3046384803 countsByYear W30463848032023 @default.
- W3046384803 crossrefType "journal-article" @default.
- W3046384803 hasAuthorship W3046384803A5005327552 @default.
- W3046384803 hasAuthorship W3046384803A5005504863 @default.
- W3046384803 hasAuthorship W3046384803A5053096993 @default.
- W3046384803 hasAuthorship W3046384803A5056952829 @default.
- W3046384803 hasBestOaLocation W30463848031 @default.
- W3046384803 hasConcept C105795698 @default.
- W3046384803 hasConcept C106189395 @default.
- W3046384803 hasConcept C111919701 @default.
- W3046384803 hasConcept C126255220 @default.
- W3046384803 hasConcept C134306372 @default.
- W3046384803 hasConcept C154945302 @default.
- W3046384803 hasConcept C159886148 @default.
- W3046384803 hasConcept C177264268 @default.
- W3046384803 hasConcept C185592680 @default.
- W3046384803 hasConcept C198531522 @default.
- W3046384803 hasConcept C199360897 @default.
- W3046384803 hasConcept C2524010 @default.
- W3046384803 hasConcept C2776036281 @default.
- W3046384803 hasConcept C2776330181 @default.
- W3046384803 hasConcept C33923547 @default.
- W3046384803 hasConcept C39927690 @default.
- W3046384803 hasConcept C41008148 @default.
- W3046384803 hasConcept C43617362 @default.
- W3046384803 hasConcept C44616089 @default.
- W3046384803 hasConcept C49937458 @default.
- W3046384803 hasConcept C97541855 @default.
- W3046384803 hasConcept C98045186 @default.
- W3046384803 hasConceptScore W3046384803C105795698 @default.
- W3046384803 hasConceptScore W3046384803C106189395 @default.
- W3046384803 hasConceptScore W3046384803C111919701 @default.
- W3046384803 hasConceptScore W3046384803C126255220 @default.
- W3046384803 hasConceptScore W3046384803C134306372 @default.
- W3046384803 hasConceptScore W3046384803C154945302 @default.
- W3046384803 hasConceptScore W3046384803C159886148 @default.
- W3046384803 hasConceptScore W3046384803C177264268 @default.
- W3046384803 hasConceptScore W3046384803C185592680 @default.
- W3046384803 hasConceptScore W3046384803C198531522 @default.
- W3046384803 hasConceptScore W3046384803C199360897 @default.
- W3046384803 hasConceptScore W3046384803C2524010 @default.
- W3046384803 hasConceptScore W3046384803C2776036281 @default.
- W3046384803 hasConceptScore W3046384803C2776330181 @default.
- W3046384803 hasConceptScore W3046384803C33923547 @default.
- W3046384803 hasConceptScore W3046384803C39927690 @default.
- W3046384803 hasConceptScore W3046384803C41008148 @default.
- W3046384803 hasConceptScore W3046384803C43617362 @default.
- W3046384803 hasConceptScore W3046384803C44616089 @default.
- W3046384803 hasConceptScore W3046384803C49937458 @default.
- W3046384803 hasConceptScore W3046384803C97541855 @default.
- W3046384803 hasConceptScore W3046384803C98045186 @default.
- W3046384803 hasIssue "9" @default.