Matches in SemOpenAlex for { <https://semopenalex.org/work/W2915064529> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W2915064529 abstract "Boltzmann exploration is widely used in reinforcement learning to provide a trade-off between exploration and exploitation. Recently, in (Cesa-Bianchi et al., 2017) it has been shown that pure Boltzmann exploration does not perform well from a regret perspective, even in the simplest setting of stochastic multi-armed bandit (MAB) problems. In this paper, we show that a simple modification to Boltzmann exploration, motivated by a variation of the standard doubling trick, achieves $O(K\log^{1+\alpha} T)$ regret for a stochastic MAB problem with $K$ arms, where $\alpha>0$ is a parameter of the algorithm. This improves on the result in (Cesa-Bianchi et al., 2017), where an algorithm inspired by the Gumbel-softmax trick achieves $O(K\log^2 T)$ regret. We also show that our algorithm achieves $O(\beta(G)\log^{1+\alpha} T)$ regret in stochastic MAB problems with graph-structured feedback, without knowledge of the graph structure, where $\beta(G)$ is the independence number of the feedback graph. Additionally, we present extensive experimental results on real datasets and applications for multi-armed bandits with both traditional bandit feedback and graph-structured feedback. In all cases, our algorithm performs as well or better than the state-of-the-art." @default.
- W2915064529 created "2019-02-21" @default.
- W2915064529 creator A5023440365 @default.
- W2915064529 creator A5029706579 @default.
- W2915064529 creator A5043550754 @default.
- W2915064529 creator A5078518595 @default.
- W2915064529 date "2019-01-25" @default.
- W2915064529 modified "2023-10-16" @default.
- W2915064529 title "Almost Boltzmann Exploration." @default.
- W2915064529 cites W1570963478 @default.
- W2915064529 cites W165458731 @default.
- W2915064529 cites W1941445455 @default.
- W2915064529 cites W1988790447 @default.
- W2915064529 cites W2032210760 @default.
- W2915064529 cites W2049934117 @default.
- W2915064529 cites W2072419029 @default.
- W2915064529 cites W2121863487 @default.
- W2915064529 cites W2133266261 @default.
- W2915064529 cites W2135598826 @default.
- W2915064529 cites W2155461593 @default.
- W2915064529 cites W2184241488 @default.
- W2915064529 cites W2508363885 @default.
- W2915064529 cites W2573245339 @default.
- W2915064529 cites W2616944917 @default.
- W2915064529 cites W2803555606 @default.
- W2915064529 cites W2962927562 @default.
- W2915064529 cites W2963389158 @default.
- W2915064529 cites W2963486904 @default.
- W2915064529 cites W56894658 @default.
- W2915064529 hasPublicationYear "2019" @default.
- W2915064529 type Work @default.
- W2915064529 sameAs 2915064529 @default.
- W2915064529 citedByCount "0" @default.
- W2915064529 crossrefType "posted-content" @default.
- W2915064529 hasAuthorship W2915064529A5023440365 @default.
- W2915064529 hasAuthorship W2915064529A5029706579 @default.
- W2915064529 hasAuthorship W2915064529A5043550754 @default.
- W2915064529 hasAuthorship W2915064529A5078518595 @default.
- W2915064529 hasConcept C108583219 @default.
- W2915064529 hasConcept C11413529 @default.
- W2915064529 hasConcept C114614502 @default.
- W2915064529 hasConcept C119857082 @default.
- W2915064529 hasConcept C121332964 @default.
- W2915064529 hasConcept C126255220 @default.
- W2915064529 hasConcept C132525143 @default.
- W2915064529 hasConcept C154945302 @default.
- W2915064529 hasConcept C192576344 @default.
- W2915064529 hasConcept C33923547 @default.
- W2915064529 hasConcept C35304006 @default.
- W2915064529 hasConcept C41008148 @default.
- W2915064529 hasConcept C50817715 @default.
- W2915064529 hasConcept C80444323 @default.
- W2915064529 hasConcept C97355855 @default.
- W2915064529 hasConceptScore W2915064529C108583219 @default.
- W2915064529 hasConceptScore W2915064529C11413529 @default.
- W2915064529 hasConceptScore W2915064529C114614502 @default.
- W2915064529 hasConceptScore W2915064529C119857082 @default.
- W2915064529 hasConceptScore W2915064529C121332964 @default.
- W2915064529 hasConceptScore W2915064529C126255220 @default.
- W2915064529 hasConceptScore W2915064529C132525143 @default.
- W2915064529 hasConceptScore W2915064529C154945302 @default.
- W2915064529 hasConceptScore W2915064529C192576344 @default.
- W2915064529 hasConceptScore W2915064529C33923547 @default.
- W2915064529 hasConceptScore W2915064529C35304006 @default.
- W2915064529 hasConceptScore W2915064529C41008148 @default.
- W2915064529 hasConceptScore W2915064529C50817715 @default.
- W2915064529 hasConceptScore W2915064529C80444323 @default.
- W2915064529 hasConceptScore W2915064529C97355855 @default.
- W2915064529 hasLocation W29150645291 @default.
- W2915064529 hasOpenAccess W2915064529 @default.
- W2915064529 hasPrimaryLocation W29150645291 @default.
- W2915064529 hasRelatedWork W2119738618 @default.
- W2915064529 hasRelatedWork W2147398813 @default.
- W2915064529 hasRelatedWork W2767425317 @default.
- W2915064529 hasRelatedWork W2896133671 @default.
- W2915064529 hasRelatedWork W2950200426 @default.
- W2915064529 hasRelatedWork W2963935738 @default.
- W2915064529 hasRelatedWork W3009712390 @default.
- W2915064529 hasRelatedWork W3009822859 @default.
- W2915064529 hasRelatedWork W3034456648 @default.
- W2915064529 hasRelatedWork W3035273634 @default.
- W2915064529 hasRelatedWork W3045472490 @default.
- W2915064529 hasRelatedWork W3046434636 @default.
- W2915064529 hasRelatedWork W3089603435 @default.
- W2915064529 hasRelatedWork W3092940299 @default.
- W2915064529 hasRelatedWork W3094379023 @default.
- W2915064529 hasRelatedWork W3111791188 @default.
- W2915064529 hasRelatedWork W3126160212 @default.
- W2915064529 hasRelatedWork W3157939315 @default.
- W2915064529 hasRelatedWork W3168046038 @default.
- W2915064529 hasRelatedWork W3173494639 @default.
- W2915064529 isParatext "false" @default.
- W2915064529 isRetracted "false" @default.
- W2915064529 magId "2915064529" @default.
- W2915064529 workType "article" @default.
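
For readers unfamiliar with the term used in the abstract, the following is a minimal sketch of plain (unmodified) Boltzmann exploration on a stochastic multi-armed bandit with Bernoulli rewards. It is not the modified algorithm from the paper, whose details are not given in this record. The function names, the fixed inverse temperature `eta`, the Bernoulli reward model, and the initial round-robin pulls are all assumptions made for illustration.

```python
import math
import random


def boltzmann_probs(means, eta):
    """Softmax over empirical means; eta is an assumed inverse temperature."""
    m = max(means)  # subtract the max for numerical stability
    weights = [math.exp(eta * (x - m)) for x in means]
    total = sum(weights)
    return [w / total for w in weights]


def run_boltzmann(true_means, horizon, eta=5.0, seed=0):
    """Simulate plain Boltzmann exploration; returns pull counts per arm."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(horizon):
        if t < k:
            arm = t  # assumption: pull every arm once before sampling
        else:
            means = [sums[i] / counts[i] for i in range(k)]
            probs = boltzmann_probs(means, eta)
            arm = rng.choices(range(k), weights=probs)[0]
        # Hypothetical Bernoulli reward with success probability true_means[arm]
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

Calling `run_boltzmann([0.9, 0.5, 0.1], horizon=1000)` returns how often each arm was pulled. A larger `eta` concentrates the sampling distribution on the arm with the highest empirical mean, while `eta = 0` samples arms uniformly.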