Matches in SemOpenAlex for { <https://semopenalex.org/work/W3131665514> ?p ?o ?g. }
Showing items 1 to 99 of 99 with 100 items per page.
- W3131665514 endingPage "15" @default.
- W3131665514 startingPage "1" @default.
- W3131665514 abstract "A common formulation of constrained reinforcement learning involves multiple rewards that must individually accumulate to given thresholds. In this class of problems, we show a simple example in which the desired optimal policy cannot be induced by any weighted linear combination of rewards. Hence, there exist constrained reinforcement learning problems for which neither regularized nor classical primal-dual methods yield optimal policies. This work addresses this shortcoming by augmenting the state with Lagrange multipliers and reinterpreting primal-dual methods as the portion of the dynamics that drives the multipliers' evolution. This approach provides a systematic state augmentation procedure that is guaranteed to solve reinforcement learning problems with constraints. Thus, as we illustrate by an example, while previous methods can fail at finding optimal policies, running the dual dynamics while executing the augmented policy yields an algorithm that provably samples actions from the optimal policy." @default.
- W3131665514 created "2021-03-01" @default.
- W3131665514 creator A5001418995 @default.
- W3131665514 creator A5045154701 @default.
- W3131665514 creator A5046974531 @default.
- W3131665514 creator A5078862959 @default.
- W3131665514 date "2023-01-01" @default.
- W3131665514 modified "2023-09-26" @default.
- W3131665514 title "State Augmented Constrained Reinforcement Learning: Overcoming the Limitations of Learning With Rewards" @default.
- W3131665514 cites W1518931405 @default.
- W3131665514 cites W1526449679 @default.
- W3131665514 cites W1963529497 @default.
- W3131665514 cites W1980102442 @default.
- W3131665514 cites W2006439090 @default.
- W3131665514 cites W2070570138 @default.
- W3131665514 cites W2073314543 @default.
- W3131665514 cites W2116775748 @default.
- W3131665514 cites W2119273405 @default.
- W3131665514 cites W2119567691 @default.
- W3131665514 cites W2121863487 @default.
- W3131665514 cites W2155027007 @default.
- W3131665514 cites W2296319761 @default.
- W3131665514 cites W2575705757 @default.
- W3131665514 cites W2789525339 @default.
- W3131665514 cites W2798766386 @default.
- W3131665514 cites W2804791273 @default.
- W3131665514 cites W2962803570 @default.
- W3131665514 cites W2964108826 @default.
- W3131665514 cites W2990389059 @default.
- W3131665514 cites W2991391803 @default.
- W3131665514 cites W3101517963 @default.
- W3131665514 cites W315644267 @default.
- W3131665514 doi "https://doi.org/10.1109/tac.2023.3319070" @default.
- W3131665514 hasPublicationYear "2023" @default.
- W3131665514 type Work @default.
- W3131665514 sameAs 3131665514 @default.
- W3131665514 citedByCount "0" @default.
- W3131665514 crossrefType "journal-article" @default.
- W3131665514 hasAuthorship W3131665514A5001418995 @default.
- W3131665514 hasAuthorship W3131665514A5045154701 @default.
- W3131665514 hasAuthorship W3131665514A5046974531 @default.
- W3131665514 hasAuthorship W3131665514A5078862959 @default.
- W3131665514 hasConcept C105795698 @default.
- W3131665514 hasConcept C106189395 @default.
- W3131665514 hasConcept C111472728 @default.
- W3131665514 hasConcept C11413529 @default.
- W3131665514 hasConcept C124952713 @default.
- W3131665514 hasConcept C126255220 @default.
- W3131665514 hasConcept C138885662 @default.
- W3131665514 hasConcept C142362112 @default.
- W3131665514 hasConcept C154945302 @default.
- W3131665514 hasConcept C159886148 @default.
- W3131665514 hasConcept C2777212361 @default.
- W3131665514 hasConcept C2780586882 @default.
- W3131665514 hasConcept C2780980858 @default.
- W3131665514 hasConcept C33923547 @default.
- W3131665514 hasConcept C41008148 @default.
- W3131665514 hasConcept C48103436 @default.
- W3131665514 hasConcept C73684929 @default.
- W3131665514 hasConcept C91575142 @default.
- W3131665514 hasConcept C97541855 @default.
- W3131665514 hasConceptScore W3131665514C105795698 @default.
- W3131665514 hasConceptScore W3131665514C106189395 @default.
- W3131665514 hasConceptScore W3131665514C111472728 @default.
- W3131665514 hasConceptScore W3131665514C11413529 @default.
- W3131665514 hasConceptScore W3131665514C124952713 @default.
- W3131665514 hasConceptScore W3131665514C126255220 @default.
- W3131665514 hasConceptScore W3131665514C138885662 @default.
- W3131665514 hasConceptScore W3131665514C142362112 @default.
- W3131665514 hasConceptScore W3131665514C154945302 @default.
- W3131665514 hasConceptScore W3131665514C159886148 @default.
- W3131665514 hasConceptScore W3131665514C2777212361 @default.
- W3131665514 hasConceptScore W3131665514C2780586882 @default.
- W3131665514 hasConceptScore W3131665514C2780980858 @default.
- W3131665514 hasConceptScore W3131665514C33923547 @default.
- W3131665514 hasConceptScore W3131665514C41008148 @default.
- W3131665514 hasConceptScore W3131665514C48103436 @default.
- W3131665514 hasConceptScore W3131665514C73684929 @default.
- W3131665514 hasConceptScore W3131665514C91575142 @default.
- W3131665514 hasConceptScore W3131665514C97541855 @default.
- W3131665514 hasLocation W31316655141 @default.
- W3131665514 hasOpenAccess W3131665514 @default.
- W3131665514 hasPrimaryLocation W31316655141 @default.
- W3131665514 hasRelatedWork W2049069909 @default.
- W3131665514 hasRelatedWork W2159487597 @default.
- W3131665514 hasRelatedWork W2997181001 @default.
- W3131665514 hasRelatedWork W3023446527 @default.
- W3131665514 hasRelatedWork W3181585847 @default.
- W3131665514 hasRelatedWork W4306705058 @default.
- W3131665514 hasRelatedWork W4308505035 @default.
- W3131665514 hasRelatedWork W4315433670 @default.
- W3131665514 hasRelatedWork W4320086332 @default.
- W3131665514 hasRelatedWork W4363672037 @default.
- W3131665514 isParatext "false" @default.
- W3131665514 isRetracted "false" @default.
- W3131665514 magId "3131665514" @default.
- W3131665514 workType "article" @default.