Matches in SemOpenAlex for { <https://semopenalex.org/work/W3087810330> ?p ?o ?g. }
- W3087810330 endingPage "8037" @default.
- W3087810330 startingPage "8030" @default.
- W3087810330 abstract "Constrained Markov decision processes (CMDPs) formalize sequential decision-making problems whose objective is to minimize a cost function while satisfying constraints on various cost functions. In this paper, we consider the setting of episodic fixed-horizon CMDPs. We propose an online algorithm which leverages the linear programming formulation of repeated optimistic planning for finite-horizon CMDP to provide a probably approximately correct (PAC) guarantee on the number of episodes needed to ensure a near optimal policy, i.e., with resulting objective value close to that of the optimal value and satisfying the constraints within low tolerance, with high probability. The number of episodes needed is shown to have linear dependence on the sizes of the state and action spaces and quadratic dependence on the time horizon and an upper bound on the number of possible successor states for a state-action pair. Therefore, if the upper bound on the number of possible successor states is much smaller than the size of the state space, the number of needed episodes becomes linear in the sizes of the state and action spaces and quadratic in the time horizon." @default.
- W3087810330 created "2020-10-01" @default.
- W3087810330 creator A5000301175 @default.
- W3087810330 creator A5002082998 @default.
- W3087810330 creator A5067636168 @default.
- W3087810330 date "2021-05-18" @default.
- W3087810330 modified "2023-10-16" @default.
- W3087810330 title "A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints" @default.
- W3087810330 cites W1505937442 @default.
- W3087810330 cites W1518931405 @default.
- W3087810330 cites W1786332878 @default.
- W3087810330 cites W1867103660 @default.
- W3087810330 cites W2009551863 @default.
- W3087810330 cites W2070570138 @default.
- W3087810330 cites W2119567691 @default.
- W3087810330 cites W2150234726 @default.
- W3087810330 cites W2604884452 @default.
- W3087810330 cites W2616964725 @default.
- W3087810330 cites W2750990725 @default.
- W3087810330 cites W2804791273 @default.
- W3087810330 cites W2946284958 @default.
- W3087810330 cites W2947223001 @default.
- W3087810330 cites W2962723383 @default.
- W3087810330 cites W2962734844 @default.
- W3087810330 cites W2964299116 @default.
- W3087810330 cites W2970749192 @default.
- W3087810330 cites W2991935368 @default.
- W3087810330 cites W3001756029 @default.
- W3087810330 cites W3008910712 @default.
- W3087810330 cites W3008953696 @default.
- W3087810330 cites W3034608738 @default.
- W3087810330 cites W3035939465 @default.
- W3087810330 doi "https://doi.org/10.1609/aaai.v35i9.16979" @default.
- W3087810330 hasPublicationYear "2021" @default.
- W3087810330 type Work @default.
- W3087810330 sameAs 3087810330 @default.
- W3087810330 citedByCount "3" @default.
- W3087810330 countsByYear W30878103302021 @default.
- W3087810330 countsByYear W30878103302022 @default.
- W3087810330 crossrefType "journal-article" @default.
- W3087810330 hasAuthorship W3087810330A5000301175 @default.
- W3087810330 hasAuthorship W3087810330A5002082998 @default.
- W3087810330 hasAuthorship W3087810330A5067636168 @default.
- W3087810330 hasBestOaLocation W30878103301 @default.
- W3087810330 hasConcept C105795698 @default.
- W3087810330 hasConcept C106189395 @default.
- W3087810330 hasConcept C11413529 @default.
- W3087810330 hasConcept C121332964 @default.
- W3087810330 hasConcept C126255220 @default.
- W3087810330 hasConcept C134306372 @default.
- W3087810330 hasConcept C14036430 @default.
- W3087810330 hasConcept C14646407 @default.
- W3087810330 hasConcept C159176650 @default.
- W3087810330 hasConcept C159886148 @default.
- W3087810330 hasConcept C2524010 @default.
- W3087810330 hasConcept C2780791683 @default.
- W3087810330 hasConcept C28761237 @default.
- W3087810330 hasConcept C33923547 @default.
- W3087810330 hasConcept C41008148 @default.
- W3087810330 hasConcept C41045048 @default.
- W3087810330 hasConcept C48103436 @default.
- W3087810330 hasConcept C55439883 @default.
- W3087810330 hasConcept C62520636 @default.
- W3087810330 hasConcept C72434380 @default.
- W3087810330 hasConcept C75306776 @default.
- W3087810330 hasConcept C77553402 @default.
- W3087810330 hasConcept C78458016 @default.
- W3087810330 hasConcept C86803240 @default.
- W3087810330 hasConceptScore W3087810330C105795698 @default.
- W3087810330 hasConceptScore W3087810330C106189395 @default.
- W3087810330 hasConceptScore W3087810330C11413529 @default.
- W3087810330 hasConceptScore W3087810330C121332964 @default.
- W3087810330 hasConceptScore W3087810330C126255220 @default.
- W3087810330 hasConceptScore W3087810330C134306372 @default.
- W3087810330 hasConceptScore W3087810330C14036430 @default.
- W3087810330 hasConceptScore W3087810330C14646407 @default.
- W3087810330 hasConceptScore W3087810330C159176650 @default.
- W3087810330 hasConceptScore W3087810330C159886148 @default.
- W3087810330 hasConceptScore W3087810330C2524010 @default.
- W3087810330 hasConceptScore W3087810330C2780791683 @default.
- W3087810330 hasConceptScore W3087810330C28761237 @default.
- W3087810330 hasConceptScore W3087810330C33923547 @default.
- W3087810330 hasConceptScore W3087810330C41008148 @default.
- W3087810330 hasConceptScore W3087810330C41045048 @default.
- W3087810330 hasConceptScore W3087810330C48103436 @default.
- W3087810330 hasConceptScore W3087810330C55439883 @default.
- W3087810330 hasConceptScore W3087810330C62520636 @default.
- W3087810330 hasConceptScore W3087810330C72434380 @default.
- W3087810330 hasConceptScore W3087810330C75306776 @default.
- W3087810330 hasConceptScore W3087810330C77553402 @default.
- W3087810330 hasConceptScore W3087810330C78458016 @default.
- W3087810330 hasConceptScore W3087810330C86803240 @default.
- W3087810330 hasIssue "9" @default.
- W3087810330 hasLocation W30878103301 @default.
- W3087810330 hasLocation W30878103302 @default.
- W3087810330 hasOpenAccess W3087810330 @default.
- W3087810330 hasPrimaryLocation W30878103301 @default.
- W3087810330 hasRelatedWork W1506128430 @default.