Matches in SemOpenAlex for { <https://semopenalex.org/work/W1699297496> ?p ?o ?g. }
Showing items 1 to 100 of 100 with 100 items per page.
- W1699297496 endingPage "79" @default.
- W1699297496 startingPage "67" @default.
- W1699297496 abstract "Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playing a slot machine with multiple arms. We study stochastic bandit problem where each arm has a reward distribution supported in a known bounded interval, e.g. [0, 1]. In this model, Auer et al. (2002) proposed practical policies called UCB and derived finite-time regret of UCB policies. However, policies achieving the asymptotic bound given by Burnetas and Katehakis (1996) have been unknown for the model. We propose Deterministic Minimum Empirical Divergence (DMED) policy and prove that DMED achieves the asymptotic bound. Furthermore, the index used in DMED for choosing an arm can be computed easily by a convex optimization technique. Although we do not derive a finite-time regret, we confirm by simulations that DMED achieves a regret close to the asymptotic bound in finite time." @default.
- W1699297496 created "2016-06-24" @default.
- W1699297496 creator A5003753658 @default.
- W1699297496 creator A5025039029 @default.
- W1699297496 date "2010-01-01" @default.
- W1699297496 modified "2023-10-02" @default.
- W1699297496 title "An Asymptotically Optimal Bandit Algorithm for Bounded Support Models." @default.
- W1699297496 cites W1515933446 @default.
- W1699297496 cites W1516061453 @default.
- W1699297496 cites W1572204513 @default.
- W1699297496 cites W1582436621 @default.
- W1699297496 cites W1861050369 @default.
- W1699297496 cites W1970097186 @default.
- W1699297496 cites W1977823770 @default.
- W1699297496 cites W1983962754 @default.
- W1699297496 cites W2006530260 @default.
- W1699297496 cites W2009551863 @default.
- W1699297496 cites W2010189695 @default.
- W1699297496 cites W2059120410 @default.
- W1699297496 cites W2077902449 @default.
- W1699297496 cites W2097487180 @default.
- W1699297496 cites W2131958277 @default.
- W1699297496 cites W2168405694 @default.
- W1699297496 cites W2296319761 @default.
- W1699297496 cites W2317700292 @default.
- W1699297496 cites W2401950669 @default.
- W1699297496 hasPublicationYear "2010" @default.
- W1699297496 type Work @default.
- W1699297496 sameAs 1699297496 @default.
- W1699297496 citedByCount "71" @default.
- W1699297496 countsByYear W16992974962012 @default.
- W1699297496 countsByYear W16992974962013 @default.
- W1699297496 countsByYear W16992974962014 @default.
- W1699297496 countsByYear W16992974962015 @default.
- W1699297496 countsByYear W16992974962016 @default.
- W1699297496 countsByYear W16992974962017 @default.
- W1699297496 countsByYear W16992974962018 @default.
- W1699297496 countsByYear W16992974962019 @default.
- W1699297496 countsByYear W16992974962020 @default.
- W1699297496 countsByYear W16992974962021 @default.
- W1699297496 crossrefType "proceedings-article" @default.
- W1699297496 hasAuthorship W1699297496A5003753658 @default.
- W1699297496 hasAuthorship W1699297496A5025039029 @default.
- W1699297496 hasConcept C105795698 @default.
- W1699297496 hasConcept C114614502 @default.
- W1699297496 hasConcept C126255220 @default.
- W1699297496 hasConcept C134306372 @default.
- W1699297496 hasConcept C138885662 @default.
- W1699297496 hasConcept C181789720 @default.
- W1699297496 hasConcept C207390915 @default.
- W1699297496 hasConcept C2778067643 @default.
- W1699297496 hasConcept C33923547 @default.
- W1699297496 hasConcept C34388435 @default.
- W1699297496 hasConcept C41008148 @default.
- W1699297496 hasConcept C41895202 @default.
- W1699297496 hasConcept C50817715 @default.
- W1699297496 hasConcept C77553402 @default.
- W1699297496 hasConceptScore W1699297496C105795698 @default.
- W1699297496 hasConceptScore W1699297496C114614502 @default.
- W1699297496 hasConceptScore W1699297496C126255220 @default.
- W1699297496 hasConceptScore W1699297496C134306372 @default.
- W1699297496 hasConceptScore W1699297496C138885662 @default.
- W1699297496 hasConceptScore W1699297496C181789720 @default.
- W1699297496 hasConceptScore W1699297496C207390915 @default.
- W1699297496 hasConceptScore W1699297496C2778067643 @default.
- W1699297496 hasConceptScore W1699297496C33923547 @default.
- W1699297496 hasConceptScore W1699297496C34388435 @default.
- W1699297496 hasConceptScore W1699297496C41008148 @default.
- W1699297496 hasConceptScore W1699297496C41895202 @default.
- W1699297496 hasConceptScore W1699297496C50817715 @default.
- W1699297496 hasConceptScore W1699297496C77553402 @default.
- W1699297496 hasLocation W16992974961 @default.
- W1699297496 hasOpenAccess W1699297496 @default.
- W1699297496 hasPrimaryLocation W16992974961 @default.
- W1699297496 hasRelatedWork W1501823362 @default.
- W1699297496 hasRelatedWork W1556834409 @default.
- W1699297496 hasRelatedWork W1911551976 @default.
- W1699297496 hasRelatedWork W1973885534 @default.
- W1699297496 hasRelatedWork W1975779216 @default.
- W1699297496 hasRelatedWork W1983962754 @default.
- W1699297496 hasRelatedWork W1998376807 @default.
- W1699297496 hasRelatedWork W1998498767 @default.
- W1699297496 hasRelatedWork W2000080679 @default.
- W1699297496 hasRelatedWork W2009551863 @default.
- W1699297496 hasRelatedWork W2039522160 @default.
- W1699297496 hasRelatedWork W2077902449 @default.
- W1699297496 hasRelatedWork W2108114251 @default.
- W1699297496 hasRelatedWork W2123681024 @default.
- W1699297496 hasRelatedWork W2131958277 @default.
- W1699297496 hasRelatedWork W2142971854 @default.
- W1699297496 hasRelatedWork W2168405694 @default.
- W1699297496 hasRelatedWork W3100329718 @default.
- W1699297496 hasRelatedWork W3125634603 @default.
- W1699297496 hasRelatedWork W50486269 @default.
- W1699297496 isParatext "false" @default.
- W1699297496 isRetracted "false" @default.
- W1699297496 magId "1699297496" @default.
- W1699297496 workType "article" @default.