Matches in SemOpenAlex for { <https://semopenalex.org/work/W3101830905> ?p ?o ?g. }
- W3101830905 abstract "Standard reinforcement learning (RL) algorithms train agents to maximize given reward functions. However, many real-world applications of RL require agents to also satisfy certain constraints which may, for example, be motivated by safety concerns. Constrained RL algorithms approach this problem by training agents to maximize given reward functions while respecting explicitly defined constraints. However, in many cases, manually designing accurate constraints is a challenging task. In this work, given a reward function and a set of demonstrations from an expert that maximizes this reward function while respecting unknown constraints, we propose a framework to learn the most likely constraints that the expert respects. We then train agents to maximize the given reward function subject to the learned constraints. Previous works in this regard have either mainly been restricted to tabular settings or specific types of constraints or assume knowledge of transition dynamics of the environment. In contrast, we empirically show that our framework is able to learn arbitrary Markovian constraints in high-dimensions in a model-free setting." @default.
- W3101830905 created "2020-11-23" @default.
- W3101830905 creator A5031914527 @default.
- W3101830905 creator A5049589547 @default.
- W3101830905 creator A5060271129 @default.
- W3101830905 creator A5062315433 @default.
- W3101830905 date "2020-11-19" @default.
- W3101830905 modified "2023-09-23" @default.
- W3101830905 title "Inverse Constrained Reinforcement Learning" @default.
- W3101830905 cites W1518931405 @default.
- W3101830905 cites W1591675293 @default.
- W3101830905 cites W2098774185 @default.
- W3101830905 cites W2122410182 @default.
- W3101830905 cites W2161270100 @default.
- W3101830905 cites W218896052 @default.
- W3101830905 cites W2462906003 @default.
- W3101830905 cites W2545050612 @default.
- W3101830905 cites W2580300496 @default.
- W3101830905 cites W2735318784 @default.
- W3101830905 cites W2736601468 @default.
- W3101830905 cites W2738190501 @default.
- W3101830905 cites W2768908787 @default.
- W3101830905 cites W2891781407 @default.
- W3101830905 cites W2901707424 @default.
- W3101830905 cites W2903838810 @default.
- W3101830905 cites W2911940799 @default.
- W3101830905 cites W2962734844 @default.
- W3101830905 cites W2962803570 @default.
- W3101830905 cites W2963277051 @default.
- W3101830905 cites W2963575623 @default.
- W3101830905 cites W2963590100 @default.
- W3101830905 cites W2964121744 @default.
- W3101830905 cites W2964222567 @default.
- W3101830905 cites W2964263543 @default.
- W3101830905 cites W2964340170 @default.
- W3101830905 cites W2970749192 @default.
- W3101830905 cites W2990130970 @default.
- W3101830905 cites W2995634437 @default.
- W3101830905 cites W3008128154 @default.
- W3101830905 cites W3037655992 @default.
- W3101830905 cites W52822972 @default.
- W3101830905 hasPublicationYear "2020" @default.
- W3101830905 type Work @default.
- W3101830905 sameAs 3101830905 @default.
- W3101830905 citedByCount "0" @default.
- W3101830905 crossrefType "posted-content" @default.
- W3101830905 hasAuthorship W3101830905A5031914527 @default.
- W3101830905 hasAuthorship W3101830905A5049589547 @default.
- W3101830905 hasAuthorship W3101830905A5060271129 @default.
- W3101830905 hasAuthorship W3101830905A5062315433 @default.
- W3101830905 hasConcept C105795698 @default.
- W3101830905 hasConcept C106189395 @default.
- W3101830905 hasConcept C126255220 @default.
- W3101830905 hasConcept C14036430 @default.
- W3101830905 hasConcept C154945302 @default.
- W3101830905 hasConcept C159886148 @default.
- W3101830905 hasConcept C162324750 @default.
- W3101830905 hasConcept C177264268 @default.
- W3101830905 hasConcept C187736073 @default.
- W3101830905 hasConcept C199360897 @default.
- W3101830905 hasConcept C2780451532 @default.
- W3101830905 hasConcept C33923547 @default.
- W3101830905 hasConcept C41008148 @default.
- W3101830905 hasConcept C78458016 @default.
- W3101830905 hasConcept C86803240 @default.
- W3101830905 hasConcept C97541855 @default.
- W3101830905 hasConceptScore W3101830905C105795698 @default.
- W3101830905 hasConceptScore W3101830905C106189395 @default.
- W3101830905 hasConceptScore W3101830905C126255220 @default.
- W3101830905 hasConceptScore W3101830905C14036430 @default.
- W3101830905 hasConceptScore W3101830905C154945302 @default.
- W3101830905 hasConceptScore W3101830905C159886148 @default.
- W3101830905 hasConceptScore W3101830905C162324750 @default.
- W3101830905 hasConceptScore W3101830905C177264268 @default.
- W3101830905 hasConceptScore W3101830905C187736073 @default.
- W3101830905 hasConceptScore W3101830905C199360897 @default.
- W3101830905 hasConceptScore W3101830905C2780451532 @default.
- W3101830905 hasConceptScore W3101830905C33923547 @default.
- W3101830905 hasConceptScore W3101830905C41008148 @default.
- W3101830905 hasConceptScore W3101830905C78458016 @default.
- W3101830905 hasConceptScore W3101830905C86803240 @default.
- W3101830905 hasConceptScore W3101830905C97541855 @default.
- W3101830905 hasOpenAccess W3101830905 @default.
- W3101830905 hasRelatedWork W1999874108 @default.
- W3101830905 hasRelatedWork W2117626647 @default.
- W3101830905 hasRelatedWork W2757609746 @default.
- W3101830905 hasRelatedWork W2804791273 @default.
- W3101830905 hasRelatedWork W2897200624 @default.
- W3101830905 hasRelatedWork W2912960479 @default.
- W3101830905 hasRelatedWork W2913350117 @default.
- W3101830905 hasRelatedWork W2915060045 @default.
- W3101830905 hasRelatedWork W2972981152 @default.
- W3101830905 hasRelatedWork W3035599863 @default.
- W3101830905 hasRelatedWork W3037476194 @default.
- W3101830905 hasRelatedWork W3115089987 @default.
- W3101830905 hasRelatedWork W3130192094 @default.
- W3101830905 hasRelatedWork W3138341590 @default.
- W3101830905 hasRelatedWork W3165911406 @default.
- W3101830905 hasRelatedWork W3173218700 @default.
- W3101830905 hasRelatedWork W3175104240 @default.
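
The entries above all follow the shape "- SUBJECT PREDICATE OBJECT @GRAPH.", so they can be grouped by predicate with a few lines of Python. This is a minimal sketch that assumes the plain-text layout shown on this page; the names `LINE_RE` and `parse_listing` are hypothetical helpers made up for illustration and are not part of any SemOpenAlex tooling.

```python
import re
from collections import defaultdict

# Assumed line shape (taken from this listing, not from an official spec):
#   - SUBJECT PREDICATE OBJECT @GRAPH.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.?\s*$', re.DOTALL)


def parse_listing(lines):
    """Group object values by predicate; lines that do not match are skipped."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # e.g. the "Matches in SemOpenAlex for ..." header line
        _subject, predicate, obj, _graph = match.groups()
        # Literal values are wrapped in double quotes in the listing; unwrap them.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        grouped[predicate].append(obj)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        'Matches in SemOpenAlex for { <https://semopenalex.org/work/W3101830905> ?p ?o ?g. }',
        '- W3101830905 title "Inverse Constrained Reinforcement Learning" @default.',
        '- W3101830905 cites W1518931405 @default.',
        '- W3101830905 cites W1591675293 @default.',
    ]
    for predicate, values in parse_listing(sample).items():
        print(predicate, values)
```

Grouping by predicate keeps repeated properties such as cites, hasConcept, and hasRelatedWork together as lists, while single-valued properties such as title come back as one-element lists.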