Matches in SemOpenAlex for { <https://semopenalex.org/work/W2887199102> ?p ?o ?g. }
- W2887199102 abstract "This work extends existing multi-armed bandit (MAB) algorithms beyond their original settings by leveraging advances in sequential Monte Carlo (SMC) methods from the approximate inference community. We leverage Monte Carlo estimation and, in particular, the flexibility of (sequential) importance sampling to allow for accurate estimation of the statistics of interest within the MAB problem. The MAB is a sequential allocation task where the goal is to learn a policy that maximizes long-term payoff, where only the reward of the executed action is observed; i.e., sequential optimal decisions are made, while simultaneously learning how the world operates. In the stochastic setting, the reward for each action is generated from an unknown distribution. To decide the next optimal action to take, one must compute sufficient statistics of this unknown reward distribution, e.g., upper-confidence bounds (UCB), or expectations in Thompson sampling. Closed-form expressions for these statistics of interest are analytically intractable except for simple cases. By combining SMC methods (which estimate posterior densities and expectations in probabilistic models that are analytically intractable) with state-of-the-art Bayesian MAB algorithms, we extend their applicability to complex models: those for which sampling may be performed even if analytic computation of summary statistics is infeasible, such as nonlinear reward functions and dynamic bandits. We combine SMC with both Thompson sampling and upper confidence bound-based (Bayes-UCB) policies, and study different bandit models: classic Bernoulli and Gaussian-distributed cases, as well as dynamic and context-dependent linear-Gaussian, logistic and categorical-softmax rewards." @default.
- W2887199102 created "2018-08-22" @default.
- W2887199102 creator A5040717332 @default.
- W2887199102 creator A5085649470 @default.
- W2887199102 date "2018-08-08" @default.
- W2887199102 modified "2023-09-27" @default.
- W2887199102 title "(Sequential) Importance Sampling Bandits" @default.
- W2887199102 cites W1501586228 @default.
- W2887199102 cites W1513008779 @default.
- W2887199102 cites W154374270 @default.
- W2887199102 cites W1544274373 @default.
- W2887199102 cites W157259654 @default.
- W2887199102 cites W1579979603 @default.
- W2887199102 cites W1663973292 @default.
- W2887199102 cites W1676840945 @default.
- W2887199102 cites W1762449885 @default.
- W2887199102 cites W1826234144 @default.
- W2887199102 cites W1911551976 @default.
- W2887199102 cites W1973476981 @default.
- W2887199102 cites W1973885534 @default.
- W2887199102 cites W1991169806 @default.
- W2887199102 cites W1998498767 @default.
- W2887199102 cites W1999674105 @default.
- W2887199102 cites W2001250891 @default.
- W2887199102 cites W2006022852 @default.
- W2887199102 cites W2009551863 @default.
- W2887199102 cites W2040196349 @default.
- W2887199102 cites W2065266611 @default.
- W2887199102 cites W2069739265 @default.
- W2887199102 cites W2081741802 @default.
- W2887199102 cites W2098613108 @default.
- W2887199102 cites W2104397175 @default.
- W2887199102 cites W2105934661 @default.
- W2887199102 cites W2108738385 @default.
- W2887199102 cites W2110632090 @default.
- W2887199102 cites W2112420033 @default.
- W2887199102 cites W2116082319 @default.
- W2887199102 cites W2119738618 @default.
- W2887199102 cites W2121863487 @default.
- W2887199102 cites W2124156864 @default.
- W2887199102 cites W2127406017 @default.
- W2887199102 cites W2130527179 @default.
- W2887199102 cites W2140971281 @default.
- W2887199102 cites W2141645258 @default.
- W2887199102 cites W2149721706 @default.
- W2887199102 cites W2160337655 @default.
- W2887199102 cites W2162733643 @default.
- W2887199102 cites W2164102968 @default.
- W2887199102 cites W2164411961 @default.
- W2887199102 cites W2165609874 @default.
- W2887199102 cites W2166253248 @default.
- W2887199102 cites W2168405694 @default.
- W2887199102 cites W2182000050 @default.
- W2887199102 cites W2182342230 @default.
- W2887199102 cites W2238987678 @default.
- W2887199102 cites W2271361270 @default.
- W2887199102 cites W2294875987 @default.
- W2887199102 cites W2332257675 @default.
- W2887199102 cites W2480874920 @default.
- W2887199102 cites W2620353771 @default.
- W2887199102 cites W2742123006 @default.
- W2887199102 cites W2743027853 @default.
- W2887199102 cites W2753355075 @default.
- W2887199102 cites W2775949621 @default.
- W2887199102 cites W2778633435 @default.
- W2887199102 cites W2799181865 @default.
- W2887199102 cites W2949944926 @default.
- W2887199102 cites W2952937083 @default.
- W2887199102 cites W2952962174 @default.
- W2887199102 cites W2962901934 @default.
- W2887199102 cites W2963007936 @default.
- W2887199102 cites W2963750583 @default.
- W2887199102 cites W2963938771 @default.
- W2887199102 cites W2964124625 @default.
- W2887199102 cites W2964268110 @default.
- W2887199102 cites W2964330179 @default.
- W2887199102 cites W3103934441 @default.
- W2887199102 cites W3121154744 @default.
- W2887199102 cites W3121632328 @default.
- W2887199102 cites W3122960105 @default.
- W2887199102 cites W3125096521 @default.
- W2887199102 cites W3125634603 @default.
- W2887199102 cites W2114001875 @default.
- W2887199102 hasPublicationYear "2018" @default.
- W2887199102 type Work @default.
- W2887199102 sameAs 2887199102 @default.
- W2887199102 citedByCount "1" @default.
- W2887199102 countsByYear W28871991022020 @default.
- W2887199102 crossrefType "posted-content" @default.
- W2887199102 hasAuthorship W2887199102A5040717332 @default.
- W2887199102 hasAuthorship W2887199102A5085649470 @default.
- W2887199102 hasConcept C105795698 @default.
- W2887199102 hasConcept C107673813 @default.
- W2887199102 hasConcept C119857082 @default.
- W2887199102 hasConcept C126255220 @default.
- W2887199102 hasConcept C153083717 @default.
- W2887199102 hasConcept C154945302 @default.
- W2887199102 hasConcept C157286648 @default.
- W2887199102 hasConcept C160234255 @default.
- W2887199102 hasConcept C19499675 @default.