Matches in SemOpenAlex for { <https://semopenalex.org/work/W2963650147> ?p ?o ?g. }
- W2963650147 abstract "In this paper, we show the convergence rates of posterior distributions of the model dynamics in an MDP for both episodic and continuous tasks. The theoretical results hold for general state and action spaces, and the parameter space of the dynamics can be infinite dimensional. Moreover, we show the convergence rates of posterior distributions of the mean accumulative reward under a fixed or the optimal policy and of the regret bound. A variant of the Thompson sampling algorithm is proposed which provides both posterior convergence rates for the dynamics and the regret-type bound. Then the previous results are extended to Markov games. Finally, we show numerical results with three simulation scenarios and conclude with discussions." @default.
- W2963650147 created "2019-07-30" @default.
- W2963650147 creator A5022846477 @default.
- W2963650147 creator A5071336210 @default.
- W2963650147 date "2019-07-22" @default.
- W2963650147 modified "2023-09-27" @default.
- W2963650147 title "Convergence Rates of Posterior Distributions in Markov Decision Process." @default.
- W2963650147 cites W1535258871 @default.
- W2963650147 cites W1542941925 @default.
- W2963650147 cites W1560074431 @default.
- W2963650147 cites W1626155273 @default.
- W2963650147 cites W1662803991 @default.
- W2963650147 cites W1850488217 @default.
- W2963650147 cites W1973039793 @default.
- W2963650147 cites W1988526405 @default.
- W2963650147 cites W2004060143 @default.
- W2963650147 cites W2009551863 @default.
- W2963650147 cites W2028145673 @default.
- W2963650147 cites W2036336974 @default.
- W2963650147 cites W2039522160 @default.
- W2963650147 cites W2062024806 @default.
- W2963650147 cites W2062532221 @default.
- W2963650147 cites W2119738618 @default.
- W2963650147 cites W2120346334 @default.
- W2963650147 cites W2768473197 @default.
- W2963650147 cites W2963158178 @default.
- W2963650147 cites W2963603291 @default.
- W2963650147 cites W2964000194 @default.
- W2963650147 cites W3044872185 @default.
- W2963650147 cites W3104051756 @default.
- W2963650147 hasPublicationYear "2019" @default.
- W2963650147 type Work @default.
- W2963650147 sameAs 2963650147 @default.
- W2963650147 citedByCount "0" @default.
- W2963650147 crossrefType "posted-content" @default.
- W2963650147 hasAuthorship W2963650147A5022846477 @default.
- W2963650147 hasAuthorship W2963650147A5071336210 @default.
- W2963650147 hasConcept C105795698 @default.
- W2963650147 hasConcept C106189395 @default.
- W2963650147 hasConcept C107673813 @default.
- W2963650147 hasConcept C111919701 @default.
- W2963650147 hasConcept C126255220 @default.
- W2963650147 hasConcept C127162648 @default.
- W2963650147 hasConcept C159886148 @default.
- W2963650147 hasConcept C162324750 @default.
- W2963650147 hasConcept C2777303404 @default.
- W2963650147 hasConcept C2778572836 @default.
- W2963650147 hasConcept C28826006 @default.
- W2963650147 hasConcept C31258907 @default.
- W2963650147 hasConcept C33923547 @default.
- W2963650147 hasConcept C41008148 @default.
- W2963650147 hasConcept C50522688 @default.
- W2963650147 hasConcept C50817715 @default.
- W2963650147 hasConcept C57830394 @default.
- W2963650147 hasConcept C57869625 @default.
- W2963650147 hasConcept C72434380 @default.
- W2963650147 hasConcept C98763669 @default.
- W2963650147 hasConceptScore W2963650147C105795698 @default.
- W2963650147 hasConceptScore W2963650147C106189395 @default.
- W2963650147 hasConceptScore W2963650147C107673813 @default.
- W2963650147 hasConceptScore W2963650147C111919701 @default.
- W2963650147 hasConceptScore W2963650147C126255220 @default.
- W2963650147 hasConceptScore W2963650147C127162648 @default.
- W2963650147 hasConceptScore W2963650147C159886148 @default.
- W2963650147 hasConceptScore W2963650147C162324750 @default.
- W2963650147 hasConceptScore W2963650147C2777303404 @default.
- W2963650147 hasConceptScore W2963650147C2778572836 @default.
- W2963650147 hasConceptScore W2963650147C28826006 @default.
- W2963650147 hasConceptScore W2963650147C31258907 @default.
- W2963650147 hasConceptScore W2963650147C33923547 @default.
- W2963650147 hasConceptScore W2963650147C41008148 @default.
- W2963650147 hasConceptScore W2963650147C50522688 @default.
- W2963650147 hasConceptScore W2963650147C50817715 @default.
- W2963650147 hasConceptScore W2963650147C57830394 @default.
- W2963650147 hasConceptScore W2963650147C57869625 @default.
- W2963650147 hasConceptScore W2963650147C72434380 @default.
- W2963650147 hasConceptScore W2963650147C98763669 @default.
- W2963650147 hasLocation W29636501471 @default.
- W2963650147 hasOpenAccess W2963650147 @default.
- W2963650147 hasPrimaryLocation W29636501471 @default.
- W2963650147 hasRelatedWork W1451987056 @default.
- W2963650147 hasRelatedWork W1965828084 @default.
- W2963650147 hasRelatedWork W1975715039 @default.
- W2963650147 hasRelatedWork W2007641629 @default.
- W2963650147 hasRelatedWork W2028284768 @default.
- W2963650147 hasRelatedWork W206112445 @default.
- W2963650147 hasRelatedWork W2103198983 @default.
- W2963650147 hasRelatedWork W2139302369 @default.
- W2963650147 hasRelatedWork W2141022000 @default.
- W2963650147 hasRelatedWork W2173945562 @default.
- W2963650147 hasRelatedWork W2239646336 @default.
- W2963650147 hasRelatedWork W2387415507 @default.
- W2963650147 hasRelatedWork W2573478579 @default.
- W2963650147 hasRelatedWork W262476390 @default.
- W2963650147 hasRelatedWork W2767246592 @default.
- W2963650147 hasRelatedWork W2767733971 @default.
- W2963650147 hasRelatedWork W2963940330 @default.
- W2963650147 hasRelatedWork W2964234535 @default.
- W2963650147 hasRelatedWork W3006443714 @default.
- W2963650147 hasRelatedWork W3034791360 @default.
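
The rows above are the matches for the pattern `{ <https://semopenalex.org/work/W2963650147> ?p ?o ?g. }`, and every row reports the `@default` graph. As a minimal sketch, the same lookup can be written as an ordinary SPARQL `SELECT` over the default graph. The helper name `build_triple_query` and the `limit` parameter are hypothetical, introduced only for illustration. The code builds the query text and does not contact any endpoint.

```python
# Sketch only: builds SPARQL text for the pattern shown at the top of the listing.
# build_triple_query and its limit parameter are illustrative names, not part of SemOpenAlex.

WORK_IRI = "https://semopenalex.org/work/W2963650147"

# Characters that may not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN = set('<>"{}|^`\\ ')


def build_triple_query(subject_iri: str, limit: int = 200) -> str:
    """Return a SELECT query listing every predicate/object pair of subject_iri."""
    if any(ch in _FORBIDDEN for ch in subject_iri):
        raise ValueError("subject_iri contains a character not allowed in an IRI reference")
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{subject_iri}> ?p ?o .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_triple_query(WORK_IRI))
```

The resulting string can be submitted to any SPARQL 1.1 endpoint that serves this data. The `?g` column is dropped because a plain triple pattern only consults the default graph, which is where all rows in this listing live.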