Matches in SemOpenAlex for { <https://semopenalex.org/work/W4376460718> ?p ?o ?g. }
- W4376460718 endingPage "5970" @default.
- W4376460718 startingPage "5956" @default.
- W4376460718 abstract "We study the asymptotic performance of the Thompson sampling algorithm in the batched multi-armed bandit setting where the time horizon $T$ is divided into batches, and the agent is not able to observe the rewards of her actions until the end of each batch. We show that in this batched setting, Thompson sampling achieves the same asymptotic performance as in the case where instantaneous feedback is available after each action, provided that the batch sizes increase subexponentially. This result implies that Thompson sampling can maintain its performance even if it receives delayed feedback in $\omega(\log T)$ batches. We further propose an adaptive batching scheme that reduces the number of batches to $O(\log \log T)$ while maintaining the same performance. Although the batched multi-armed bandit setting has been considered in several recent works, previous results rely on tailored algorithms for the batched setting, which optimize the batch structure and prioritize exploration in the beginning of the experiment to eliminate suboptimal actions. We show that Thompson sampling, on the other hand, is able to achieve a similar asymptotic performance in the batched setting without any modifications." @default.
- W4376460718 created "2023-05-14" @default.
- W4376460718 creator A5003322209 @default.
- W4376460718 creator A5042058456 @default.
- W4376460718 date "2023-09-01" @default.
- W4376460718 modified "2023-09-27" @default.
- W4376460718 title "Asymptotic Performance of Thompson Sampling for Batched Multi-Armed Bandits" @default.
- W4376460718 cites W1958090791 @default.
- W4376460718 cites W1998498767 @default.
- W4376460718 cites W2009551863 @default.
- W4376460718 cites W2039522160 @default.
- W4376460718 cites W2149721706 @default.
- W4376460718 cites W2158319693 @default.
- W4376460718 cites W2168405694 @default.
- W4376460718 cites W2752599163 @default.
- W4376460718 cites W2773557179 @default.
- W4376460718 cites W3013725403 @default.
- W4376460718 cites W3114658024 @default.
- W4376460718 cites W3121632328 @default.
- W4376460718 cites W3196700791 @default.
- W4376460718 cites W4250589301 @default.
- W4376460718 doi "https://doi.org/10.1109/tit.2023.3274678" @default.
- W4376460718 hasPublicationYear "2023" @default.
- W4376460718 type Work @default.
- W4376460718 citedByCount "0" @default.
- W4376460718 crossrefType "journal-article" @default.
- W4376460718 hasAuthorship W4376460718A5003322209 @default.
- W4376460718 hasAuthorship W4376460718A5042058456 @default.
- W4376460718 hasBestOaLocation W43764607182 @default.
- W4376460718 hasConcept C106131492 @default.
- W4376460718 hasConcept C11413529 @default.
- W4376460718 hasConcept C118615104 @default.
- W4376460718 hasConcept C140779682 @default.
- W4376460718 hasConcept C31972630 @default.
- W4376460718 hasConcept C33923547 @default.
- W4376460718 hasConcept C41008148 @default.
- W4376460718 hasConcept C45357846 @default.
- W4376460718 hasConcept C94375191 @default.
- W4376460718 hasConceptScore W4376460718C106131492 @default.
- W4376460718 hasConceptScore W4376460718C11413529 @default.
- W4376460718 hasConceptScore W4376460718C118615104 @default.
- W4376460718 hasConceptScore W4376460718C140779682 @default.
- W4376460718 hasConceptScore W4376460718C31972630 @default.
- W4376460718 hasConceptScore W4376460718C33923547 @default.
- W4376460718 hasConceptScore W4376460718C41008148 @default.
- W4376460718 hasConceptScore W4376460718C45357846 @default.
- W4376460718 hasConceptScore W4376460718C94375191 @default.
- W4376460718 hasFunder F4320306076 @default.
- W4376460718 hasIssue "9" @default.
- W4376460718 hasLocation W43764607181 @default.
- W4376460718 hasLocation W43764607182 @default.
- W4376460718 hasOpenAccess W4376460718 @default.
- W4376460718 hasPrimaryLocation W43764607181 @default.
- W4376460718 hasRelatedWork W2012841980 @default.
- W4376460718 hasRelatedWork W2090686886 @default.
- W4376460718 hasRelatedWork W2338700700 @default.
- W4376460718 hasRelatedWork W2364664711 @default.
- W4376460718 hasRelatedWork W2375537470 @default.
- W4376460718 hasRelatedWork W2386767533 @default.
- W4376460718 hasRelatedWork W2888243295 @default.
- W4376460718 hasRelatedWork W2899484825 @default.
- W4376460718 hasRelatedWork W3104631496 @default.
- W4376460718 hasRelatedWork W4288772858 @default.
- W4376460718 hasVolume "69" @default.
- W4376460718 isParatext "false" @default.
- W4376460718 isRetracted "false" @default.
- W4376460718 workType "article" @default.
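
The batched setting described in the abstract (rewards hidden until the end of each batch, batch sizes growing subexponentially) can be illustrated with a minimal sketch. This is not the authors' code: the Bernoulli rewards, the Beta(1, 1) priors, the linearly growing batch schedule, and the names `growing_batches` and `batched_thompson_sampling` are all assumptions chosen for illustration. The adaptive batching scheme mentioned in the abstract is not shown.

```python
import random


def growing_batches(horizon):
    """Hypothetical schedule: batch j has size j, truncated to fit the horizon.

    Sizes grow linearly (hence subexponentially), giving on the order of
    sqrt(2 * horizon) batches.
    """
    sizes, used, j = [], 0, 1
    while used < horizon:
        size = min(j, horizon - used)
        sizes.append(size)
        used += size
        j += 1
    return sizes


def batched_thompson_sampling(true_means, batch_sizes, seed=0):
    """Bernoulli Thompson sampling with feedback revealed only at batch ends.

    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    # Beta(1, 1) prior per arm (an assumption of this sketch).
    alpha = [1] * k
    beta = [1] * k
    pulls = [0] * k
    for size in batch_sizes:
        pending = []  # (arm, reward) pairs hidden until the batch ends
        for _ in range(size):
            # The posterior is frozen for the whole batch; only the draws vary.
            samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
            arm = max(range(k), key=lambda a: samples[a])
            reward = 1 if rng.random() < true_means[arm] else 0
            pending.append((arm, reward))
            pulls[arm] += 1
        # Feedback arrives now: fold the whole batch into the posterior.
        for arm, reward in pending:
            alpha[arm] += reward
            beta[arm] += 1 - reward
    return pulls


if __name__ == "__main__":
    schedule = growing_batches(1000)
    print(len(schedule), "batches")
    print(batched_thompson_sampling([0.2, 0.8], schedule))
```

The sampling rule inside a batch is identical to ordinary Thompson sampling; the only change is that posterior updates are deferred to the batch boundary, which mirrors the abstract's point that no modification of the algorithm itself is needed.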