Matches in SemOpenAlex for { <https://semopenalex.org/work/W4288093102> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4288093102 abstract "Restless bandit problems assume time-varying reward distributions of the arms, which adds flexibility to the model but makes the analysis more challenging. We study learning algorithms over the unknown reward distributions and prove a sub-linear, $O(\sqrt{T}\log T)$, regret bound for a variant of Thompson sampling. Our analysis applies in the infinite time horizon setting, resolving the open question raised by Jung and Tewari (2019) whose analysis is limited to the episodic case. We adopt their policy mapping framework, which allows our algorithm to be efficient and simultaneously keeps the regret meaningful. Our algorithm adapts the TSDE algorithm of Ouyang et al. (2017) in a non-trivial manner to account for the special structure of restless bandits. We test our algorithm on a simulated dynamic channel access problem with several policy mappings, and the empirical regrets agree with the theoretical bound regardless of the choice of the policy mapping." @default.
- W4288093102 created "2022-07-28" @default.
- W4288093102 creator A5027578003 @default.
- W4288093102 creator A5029672764 @default.
- W4288093102 creator A5051918150 @default.
- W4288093102 date "2019-10-12" @default.
- W4288093102 modified "2023-09-25" @default.
- W4288093102 title "Thompson Sampling in Non-Episodic Restless Bandits" @default.
- W4288093102 doi "https://doi.org/10.48550/arxiv.1910.05654" @default.
- W4288093102 hasPublicationYear "2019" @default.
- W4288093102 type Work @default.
- W4288093102 citedByCount "0" @default.
- W4288093102 crossrefType "posted-content" @default.
- W4288093102 hasAuthorship W4288093102A5027578003 @default.
- W4288093102 hasAuthorship W4288093102A5029672764 @default.
- W4288093102 hasAuthorship W4288093102A5051918150 @default.
- W4288093102 hasBestOaLocation W42880931021 @default.
- W4288093102 hasConcept C105795698 @default.
- W4288093102 hasConcept C106131492 @default.
- W4288093102 hasConcept C11413529 @default.
- W4288093102 hasConcept C119857082 @default.
- W4288093102 hasConcept C126255220 @default.
- W4288093102 hasConcept C140779682 @default.
- W4288093102 hasConcept C144237770 @default.
- W4288093102 hasConcept C154945302 @default.
- W4288093102 hasConcept C2780598303 @default.
- W4288093102 hasConcept C28761237 @default.
- W4288093102 hasConcept C31972630 @default.
- W4288093102 hasConcept C33923547 @default.
- W4288093102 hasConcept C41008148 @default.
- W4288093102 hasConcept C50817715 @default.
- W4288093102 hasConcept C73602740 @default.
- W4288093102 hasConceptScore W4288093102C105795698 @default.
- W4288093102 hasConceptScore W4288093102C106131492 @default.
- W4288093102 hasConceptScore W4288093102C11413529 @default.
- W4288093102 hasConceptScore W4288093102C119857082 @default.
- W4288093102 hasConceptScore W4288093102C126255220 @default.
- W4288093102 hasConceptScore W4288093102C140779682 @default.
- W4288093102 hasConceptScore W4288093102C144237770 @default.
- W4288093102 hasConceptScore W4288093102C154945302 @default.
- W4288093102 hasConceptScore W4288093102C2780598303 @default.
- W4288093102 hasConceptScore W4288093102C28761237 @default.
- W4288093102 hasConceptScore W4288093102C31972630 @default.
- W4288093102 hasConceptScore W4288093102C33923547 @default.
- W4288093102 hasConceptScore W4288093102C41008148 @default.
- W4288093102 hasConceptScore W4288093102C50817715 @default.
- W4288093102 hasConceptScore W4288093102C73602740 @default.
- W4288093102 hasLocation W42880931021 @default.
- W4288093102 hasOpenAccess W4288093102 @default.
- W4288093102 hasPrimaryLocation W42880931021 @default.
- W4288093102 hasRelatedWork W1603462117 @default.
- W4288093102 hasRelatedWork W2116082319 @default.
- W4288093102 hasRelatedWork W2952407834 @default.
- W4288093102 hasRelatedWork W3002095816 @default.
- W4288093102 hasRelatedWork W3108185025 @default.
- W4288093102 hasRelatedWork W3196245787 @default.
- W4288093102 hasRelatedWork W3198232829 @default.
- W4288093102 hasRelatedWork W4226154670 @default.
- W4288093102 hasRelatedWork W4226471929 @default.
- W4288093102 hasRelatedWork W4286985920 @default.
- W4288093102 isParatext "false" @default.
- W4288093102 isRetracted "false" @default.
- W4288093102 workType "article" @default.