Matches in SemOpenAlex for { <https://semopenalex.org/work/W3033473313> ?p ?o ?g. }
- W3033473313 abstract "We study reinforcement learning in non-episodic factored Markov decision processes (FMDPs). We propose two near-optimal and oracle-efficient algorithms for FMDPs. Assuming oracle access to an FMDP planner, they enjoy a Bayesian and a frequentist regret bound respectively, both of which reduce to the near-optimal bound $\widetilde{O}(DS\sqrt{AT})$ for standard non-factored MDPs. We propose a tighter connectivity measure, factored span, for FMDPs and prove a lower bound that depends on the factored span rather than the diameter $D$. In order to decrease the gap between lower and upper bounds, we propose an adaptation of the REGAL.C algorithm whose regret bound depends on the factored span. Our oracle-efficient algorithms outperform previously proposed near-optimal algorithms on computer network administration simulations." @default.
- W3033473313 created "2020-06-12" @default.
- W3033473313 creator A5007965098 @default.
- W3033473313 creator A5051918150 @default.
- W3033473313 date "2020-02-06" @default.
- W3033473313 modified "2023-09-27" @default.
- W3033473313 title "Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting." @default.
- W3033473313 cites W1482136514 @default.
- W3033473313 cites W1537180453 @default.
- W3033473313 cites W1554015367 @default.
- W3033473313 cites W1568586930 @default.
- W3033473313 cites W1662803991 @default.
- W3033473313 cites W1716359979 @default.
- W3033473313 cites W18175453 @default.
- W3033473313 cites W1850488217 @default.
- W3033473313 cites W1988217924 @default.
- W3033473313 cites W1997477668 @default.
- W3033473313 cites W2020294948 @default.
- W3033473313 cites W2042996020 @default.
- W3033473313 cites W2096622112 @default.
- W3033473313 cites W2111764152 @default.
- W3033473313 cites W2115519224 @default.
- W3033473313 cites W2119567691 @default.
- W3033473313 cites W2127323769 @default.
- W3033473313 cites W2128957716 @default.
- W3033473313 cites W2132096648 @default.
- W3033473313 cites W2134779831 @default.
- W3033473313 cites W2149385746 @default.
- W3033473313 cites W2152650468 @default.
- W3033473313 cites W2154347416 @default.
- W3033473313 cites W2489939061 @default.
- W3033473313 cites W2769648743 @default.
- W3033473313 cites W2962847271 @default.
- W3033473313 cites W2964000194 @default.
- W3033473313 cites W2964245885 @default.
- W3033473313 cites W2964299116 @default.
- W3033473313 cites W2965004202 @default.
- W3033473313 cites W2970981002 @default.
- W3033473313 hasPublicationYear "2020" @default.
- W3033473313 type Work @default.
- W3033473313 sameAs 3033473313 @default.
- W3033473313 citedByCount "0" @default.
- W3033473313 crossrefType "posted-content" @default.
- W3033473313 hasAuthorship W3033473313A5007965098 @default.
- W3033473313 hasAuthorship W3033473313A5051918150 @default.
- W3033473313 hasConcept C105795698 @default.
- W3033473313 hasConcept C106189395 @default.
- W3033473313 hasConcept C107673813 @default.
- W3033473313 hasConcept C11413529 @default.
- W3033473313 hasConcept C115903868 @default.
- W3033473313 hasConcept C119857082 @default.
- W3033473313 hasConcept C134306372 @default.
- W3033473313 hasConcept C154945302 @default.
- W3033473313 hasConcept C159886148 @default.
- W3033473313 hasConcept C160234255 @default.
- W3033473313 hasConcept C162376815 @default.
- W3033473313 hasConcept C33923547 @default.
- W3033473313 hasConcept C41008148 @default.
- W3033473313 hasConcept C50817715 @default.
- W3033473313 hasConcept C55166926 @default.
- W3033473313 hasConcept C77553402 @default.
- W3033473313 hasConcept C97541855 @default.
- W3033473313 hasConceptScore W3033473313C105795698 @default.
- W3033473313 hasConceptScore W3033473313C106189395 @default.
- W3033473313 hasConceptScore W3033473313C107673813 @default.
- W3033473313 hasConceptScore W3033473313C11413529 @default.
- W3033473313 hasConceptScore W3033473313C115903868 @default.
- W3033473313 hasConceptScore W3033473313C119857082 @default.
- W3033473313 hasConceptScore W3033473313C134306372 @default.
- W3033473313 hasConceptScore W3033473313C154945302 @default.
- W3033473313 hasConceptScore W3033473313C159886148 @default.
- W3033473313 hasConceptScore W3033473313C160234255 @default.
- W3033473313 hasConceptScore W3033473313C162376815 @default.
- W3033473313 hasConceptScore W3033473313C33923547 @default.
- W3033473313 hasConceptScore W3033473313C41008148 @default.
- W3033473313 hasConceptScore W3033473313C50817715 @default.
- W3033473313 hasConceptScore W3033473313C55166926 @default.
- W3033473313 hasConceptScore W3033473313C77553402 @default.
- W3033473313 hasConceptScore W3033473313C97541855 @default.
- W3033473313 hasLocation W30334733131 @default.
- W3033473313 hasOpenAccess W3033473313 @default.
- W3033473313 hasPrimaryLocation W30334733131 @default.
- W3033473313 hasRelatedWork W2411868678 @default.
- W3033473313 hasRelatedWork W2907502549 @default.
- W3033473313 hasRelatedWork W2911884060 @default.
- W3033473313 hasRelatedWork W2941360707 @default.
- W3033473313 hasRelatedWork W2952136328 @default.
- W3033473313 hasRelatedWork W2963767098 @default.
- W3033473313 hasRelatedWork W2964284806 @default.
- W3033473313 hasRelatedWork W2995539780 @default.
- W3033473313 hasRelatedWork W3005253573 @default.
- W3033473313 hasRelatedWork W3033246677 @default.
- W3033473313 hasRelatedWork W3035613237 @default.
- W3033473313 hasRelatedWork W3045472490 @default.
- W3033473313 hasRelatedWork W3083887702 @default.
- W3033473313 hasRelatedWork W3104032756 @default.
- W3033473313 hasRelatedWork W3104186027 @default.
- W3033473313 hasRelatedWork W3106116924 @default.
- W3033473313 hasRelatedWork W3131233483 @default.
- W3033473313 hasRelatedWork W3152479033 @default.