Matches in SemOpenAlex for { <https://semopenalex.org/work/W3192368239> ?p ?o ?g. }
- W3192368239 abstract "In the regret-based formulation of Multi-armed Bandit (MAB) problems, except in rare instances, much of the literature focuses on arms with i.i.d. rewards. In this paper, we consider the problem of obtaining regret guarantees for MAB problems in which the rewards of each arm form a Markov chain which may not belong to a single parameter exponential family. To achieve a logarithmic regret in such problems is not difficult: a variation of standard Kullback-Leibler Upper Confidence Bound (KL-UCB) does the job. However, the constants obtained from such an analysis are poor for the following reason: i.i.d. rewards are a special case of Markov rewards and it is difficult to design an algorithm that works well independent of whether the underlying model is truly Markovian or i.i.d. To overcome this issue, we introduce a novel algorithm that identifies whether the rewards from each arm are truly Markovian or i.i.d. using a total variation distance-based test. Our algorithm then switches from using a standard KL-UCB to a specialized version of KL-UCB when it determines that the arm reward is Markovian, thus resulting in low regrets for both i.i.d. and Markovian settings." @default.
- W3192368239 created "2021-08-16" @default.
- W3192368239 creator A5018946124 @default.
- W3192368239 creator A5028903768 @default.
- W3192368239 creator A5078518595 @default.
- W3192368239 date "2020-09-14" @default.
- W3192368239 modified "2023-09-25" @default.
- W3192368239 title "Adaptive KL-UCB based Bandit Algorithms for Markovian and i.i.d. Settings" @default.
- W3192368239 cites W1570963478 @default.
- W3192368239 cites W1850488217 @default.
- W3192368239 cites W1977898558 @default.
- W3192368239 cites W1998498767 @default.
- W3192368239 cites W2000080679 @default.
- W3192368239 cites W2009551863 @default.
- W3192368239 cites W2039522160 @default.
- W3192368239 cites W2056921512 @default.
- W3192368239 cites W2077902449 @default.
- W3192368239 cites W2113733815 @default.
- W3192368239 cites W2120125372 @default.
- W3192368239 cites W2125724988 @default.
- W3192368239 cites W2135730283 @default.
- W3192368239 cites W2146950091 @default.
- W3192368239 cites W2149660380 @default.
- W3192368239 cites W2150328967 @default.
- W3192368239 cites W2151544200 @default.
- W3192368239 cites W2153975459 @default.
- W3192368239 cites W2162979096 @default.
- W3192368239 cites W2168405694 @default.
- W3192368239 cites W2568033804 @default.
- W3192368239 cites W2767033932 @default.
- W3192368239 cites W2796125956 @default.
- W3192368239 cites W2832404192 @default.
- W3192368239 cites W2963750583 @default.
- W3192368239 cites W2964054583 @default.
- W3192368239 cites W3003718921 @default.
- W3192368239 cites W3100329718 @default.
- W3192368239 cites W3102381603 @default.
- W3192368239 cites W3104981656 @default.
- W3192368239 doi "https://doi.org/10.48550/arxiv.2009.06606" @default.
- W3192368239 hasPublicationYear "2020" @default.
- W3192368239 type Work @default.
- W3192368239 sameAs 3192368239 @default.
- W3192368239 citedByCount "0" @default.
- W3192368239 crossrefType "posted-content" @default.
- W3192368239 hasAuthorship W3192368239A5018946124 @default.
- W3192368239 hasAuthorship W3192368239A5028903768 @default.
- W3192368239 hasAuthorship W3192368239A5078518595 @default.
- W3192368239 hasBestOaLocation W31923682391 @default.
- W3192368239 hasConcept C105795698 @default.
- W3192368239 hasConcept C106189395 @default.
- W3192368239 hasConcept C11413529 @default.
- W3192368239 hasConcept C119857082 @default.
- W3192368239 hasConcept C123197309 @default.
- W3192368239 hasConcept C126255220 @default.
- W3192368239 hasConcept C134306372 @default.
- W3192368239 hasConcept C151376022 @default.
- W3192368239 hasConcept C159886148 @default.
- W3192368239 hasConcept C33923547 @default.
- W3192368239 hasConcept C39927690 @default.
- W3192368239 hasConcept C41008148 @default.
- W3192368239 hasConcept C50817715 @default.
- W3192368239 hasConcept C98763669 @default.
- W3192368239 hasConceptScore W3192368239C105795698 @default.
- W3192368239 hasConceptScore W3192368239C106189395 @default.
- W3192368239 hasConceptScore W3192368239C11413529 @default.
- W3192368239 hasConceptScore W3192368239C119857082 @default.
- W3192368239 hasConceptScore W3192368239C123197309 @default.
- W3192368239 hasConceptScore W3192368239C126255220 @default.
- W3192368239 hasConceptScore W3192368239C134306372 @default.
- W3192368239 hasConceptScore W3192368239C151376022 @default.
- W3192368239 hasConceptScore W3192368239C159886148 @default.
- W3192368239 hasConceptScore W3192368239C33923547 @default.
- W3192368239 hasConceptScore W3192368239C39927690 @default.
- W3192368239 hasConceptScore W3192368239C41008148 @default.
- W3192368239 hasConceptScore W3192368239C50817715 @default.
- W3192368239 hasConceptScore W3192368239C98763669 @default.
- W3192368239 hasLocation W31923682391 @default.
- W3192368239 hasOpenAccess W3192368239 @default.
- W3192368239 hasPrimaryLocation W31923682391 @default.
- W3192368239 hasRelatedWork W2113733815 @default.
- W3192368239 hasRelatedWork W2149943599 @default.
- W3192368239 hasRelatedWork W2913022628 @default.
- W3192368239 hasRelatedWork W2952099252 @default.
- W3192368239 hasRelatedWork W2953067537 @default.
- W3192368239 hasRelatedWork W3092949629 @default.
- W3192368239 hasRelatedWork W3140738360 @default.
- W3192368239 hasRelatedWork W3198596521 @default.
- W3192368239 hasRelatedWork W4297669648 @default.
- W3192368239 hasRelatedWork W4302063494 @default.
- W3192368239 isParatext "false" @default.
- W3192368239 isRetracted "false" @default.
- W3192368239 magId "3192368239" @default.
- W3192368239 workType "article" @default.
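
The abstract above refers to two ingredients: a KL-UCB index and a total variation distance-based test for deciding whether an arm's rewards look Markovian or i.i.d. The following is a minimal sketch of both ideas for 0/1 rewards, not the paper's algorithm. The function names, the exploration constant `c`, the bisection tolerance, and the choice to compare the two empirical transition rows are all assumptions made for illustration; the paper's actual test statistic and its specialized Markovian KL-UCB variant are not given in this record.

```python
import math


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def kl_ucb_index(mean, pulls, t, c=0.0, tol=1e-6):
    """Standard KL-UCB index for a Bernoulli arm (illustrative, i.i.d. case only).

    Returns (approximately) the largest q in [mean, 1] such that
    pulls * KL(mean, q) <= log(t) + c * log(log(t)), found by bisection.
    """
    budget = math.log(t) + c * math.log(max(math.log(t), 1.0))
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pulls * bernoulli_kl(mean, mid) <= budget:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid
    return lo


def empirical_rows(rewards):
    """Empirical next-reward distribution conditioned on the previous 0/1 reward."""
    counts = [[0, 0], [0, 0]]
    for prev, nxt in zip(rewards, rewards[1:]):
        counts[prev][nxt] += 1
    rows = []
    for row in counts:
        total = sum(row)
        # Fall back to a uniform row when a state was never visited (assumption).
        rows.append([n / total for n in row] if total else [0.5, 0.5])
    return rows


def tv_distance(p, q):
    """Total variation distance between two discrete distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def looks_markovian(rewards, threshold=0.2):
    """Hypothetical test: flag the arm as Markovian when the two transition rows differ.

    For i.i.d. rewards both rows estimate the same distribution, so their
    total variation distance should be small. The threshold is arbitrary.
    """
    row0, row1 = empirical_rows(rewards)
    return tv_distance(row0, row1) > threshold
```

For example, `looks_markovian([0, 1, 0, 1, 0, 1])` flags a strictly alternating reward sequence, since the two empirical transition rows are as far apart as possible. A bandit loop built on this sketch would use `kl_ucb_index` while an arm looks i.i.d. and switch to a Markov-aware index once the test fires.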