Matches in SemOpenAlex for { <https://semopenalex.org/work/W4294964479> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4294964479 abstract "Current work in explainable reinforcement learning generally produces policies in the form of a decision tree over the state space. Such policies can be used for formal safety verification, agent behavior prediction, and manual inspection of important features. However, existing approaches fit a decision tree after training or use a custom learning procedure which is not compatible with new learning techniques, such as those which use neural networks. To address this limitation, we propose a novel Markov Decision Process (MDP) type for learning decision tree policies: Iterative Bounding MDPs (IBMDPs). An IBMDP is constructed around a base MDP so each IBMDP policy is guaranteed to correspond to a decision tree policy for the base MDP when using a method-agnostic masking procedure. Because of this decision tree equivalence, any function approximator can be used during training, including a neural network, while yielding a decision tree policy for the base MDP. We present the required masking procedure as well as a modified value update step which allows IBMDPs to be solved using existing algorithms. We apply this procedure to produce IBMDP variants of recent reinforcement learning methods. We empirically show the benefits of our approach by solving IBMDPs to produce decision tree policies for the base MDPs." @default.
- W4294964479 created "2022-09-08" @default.
- W4294964479 creator A5051795411 @default.
- W4294964479 creator A5066795279 @default.
- W4294964479 creator A5070450085 @default.
- W4294964479 creator A5088276691 @default.
- W4294964479 date "2021-02-25" @default.
- W4294964479 modified "2023-09-25" @default.
- W4294964479 title "Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods" @default.
- W4294964479 doi "https://doi.org/10.48550/arxiv.2102.13045" @default.
- W4294964479 hasPublicationYear "2021" @default.
- W4294964479 type Work @default.
- W4294964479 citedByCount "0" @default.
- W4294964479 crossrefType "posted-content" @default.
- W4294964479 hasAuthorship W4294964479A5051795411 @default.
- W4294964479 hasAuthorship W4294964479A5066795279 @default.
- W4294964479 hasAuthorship W4294964479A5070450085 @default.
- W4294964479 hasAuthorship W4294964479A5088276691 @default.
- W4294964479 hasBestOaLocation W42949644791 @default.
- W4294964479 hasConcept C10229987 @default.
- W4294964479 hasConcept C105795698 @default.
- W4294964479 hasConcept C106189395 @default.
- W4294964479 hasConcept C113174947 @default.
- W4294964479 hasConcept C119857082 @default.
- W4294964479 hasConcept C134306372 @default.
- W4294964479 hasConcept C154945302 @default.
- W4294964479 hasConcept C159886148 @default.
- W4294964479 hasConcept C33923547 @default.
- W4294964479 hasConcept C41008148 @default.
- W4294964479 hasConcept C50644808 @default.
- W4294964479 hasConcept C5481197 @default.
- W4294964479 hasConcept C63584917 @default.
- W4294964479 hasConcept C72434380 @default.
- W4294964479 hasConcept C84525736 @default.
- W4294964479 hasConcept C97541855 @default.
- W4294964479 hasConceptScore W4294964479C10229987 @default.
- W4294964479 hasConceptScore W4294964479C105795698 @default.
- W4294964479 hasConceptScore W4294964479C106189395 @default.
- W4294964479 hasConceptScore W4294964479C113174947 @default.
- W4294964479 hasConceptScore W4294964479C119857082 @default.
- W4294964479 hasConceptScore W4294964479C134306372 @default.
- W4294964479 hasConceptScore W4294964479C154945302 @default.
- W4294964479 hasConceptScore W4294964479C159886148 @default.
- W4294964479 hasConceptScore W4294964479C33923547 @default.
- W4294964479 hasConceptScore W4294964479C41008148 @default.
- W4294964479 hasConceptScore W4294964479C50644808 @default.
- W4294964479 hasConceptScore W4294964479C5481197 @default.
- W4294964479 hasConceptScore W4294964479C63584917 @default.
- W4294964479 hasConceptScore W4294964479C72434380 @default.
- W4294964479 hasConceptScore W4294964479C84525736 @default.
- W4294964479 hasConceptScore W4294964479C97541855 @default.
- W4294964479 hasLocation W42949644791 @default.
- W4294964479 hasOpenAccess W4294964479 @default.
- W4294964479 hasPrimaryLocation W42949644791 @default.
- W4294964479 hasRelatedWork W1585781035 @default.
- W4294964479 hasRelatedWork W2009074768 @default.
- W4294964479 hasRelatedWork W2131660762 @default.
- W4294964479 hasRelatedWork W2803390621 @default.
- W4294964479 hasRelatedWork W2954804306 @default.
- W4294964479 hasRelatedWork W3094337650 @default.
- W4294964479 hasRelatedWork W3130234572 @default.
- W4294964479 hasRelatedWork W3173410198 @default.
- W4294964479 hasRelatedWork W4294964479 @default.
- W4294964479 hasRelatedWork W936909164 @default.
- W4294964479 isParatext "false" @default.
- W4294964479 isRetracted "false" @default.
- W4294964479 workType "article" @default.