Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313471212> ?p ?o ?g. }
Showing items 1 to 90 of 90 with 100 items per page.
- W4313471212 endingPage "112" @default.
- W4313471212 startingPage "99" @default.
- W4313471212 abstract "We present REGA, a new adaptive-sampling-based algorithm for the control of finite-horizon Markov decision processes (MDPs) with very large state spaces and small action spaces. We apply a variant of the ϵ-greedy multiarmed bandit algorithm to each stage of the MDP in a recursive manner, thus computing an estimation of the “reward-to-go” value at each stage of the MDP. We provide a finite-time analysis of REGA. In particular, we provide a bound on the probability that the approximation error exceeds a given threshold, where the bound is given in terms of the number of samples collected at each stage of the MDP. We empirically compare REGA against another sampling-based algorithm called RASA by running simulations against the SysAdmin benchmark problem with 2^10 states. The results show that REGA and RASA achieved similar performance. Moreover, REGA and RASA empirically outperformed an implementation of the algorithm that uses the “original” ϵ-greedy algorithm that commonly appears in the literature." @default.
- W4313471212 created "2023-01-06" @default.
- W4313471212 creator A5003605921 @default.
- W4313471212 creator A5088820331 @default.
- W4313471212 date "2023-01-01" @default.
- W4313471212 modified "2023-09-25" @default.
- W4313471212 title "An ϵ-Greedy Multiarmed Bandit Approach to Markov Decision Processes" @default.
- W4313471212 cites W2015370225 @default.
- W4313471212 cites W2077902449 @default.
- W4313471212 cites W2084709390 @default.
- W4313471212 cites W2100857832 @default.
- W4313471212 cites W2104830490 @default.
- W4313471212 cites W2152545378 @default.
- W4313471212 cites W2157560463 @default.
- W4313471212 cites W2168405694 @default.
- W4313471212 cites W2334782222 @default.
- W4313471212 cites W2950929549 @default.
- W4313471212 cites W32403112 @default.
- W4313471212 cites W4245744559 @default.
- W4313471212 doi "https://doi.org/10.3390/stats6010006" @default.
- W4313471212 hasPublicationYear "2023" @default.
- W4313471212 type Work @default.
- W4313471212 citedByCount "0" @default.
- W4313471212 crossrefType "journal-article" @default.
- W4313471212 hasAuthorship W4313471212A5003605921 @default.
- W4313471212 hasAuthorship W4313471212A5088820331 @default.
- W4313471212 hasBestOaLocation W43134712121 @default.
- W4313471212 hasConcept C105795698 @default.
- W4313471212 hasConcept C106131492 @default.
- W4313471212 hasConcept C106189395 @default.
- W4313471212 hasConcept C107673813 @default.
- W4313471212 hasConcept C11413529 @default.
- W4313471212 hasConcept C115988155 @default.
- W4313471212 hasConcept C126255220 @default.
- W4313471212 hasConcept C13280743 @default.
- W4313471212 hasConcept C134306372 @default.
- W4313471212 hasConcept C140779682 @default.
- W4313471212 hasConcept C154945302 @default.
- W4313471212 hasConcept C159886148 @default.
- W4313471212 hasConcept C17098449 @default.
- W4313471212 hasConcept C185798385 @default.
- W4313471212 hasConcept C205649164 @default.
- W4313471212 hasConcept C31972630 @default.
- W4313471212 hasConcept C33923547 @default.
- W4313471212 hasConcept C41008148 @default.
- W4313471212 hasConcept C51823790 @default.
- W4313471212 hasConcept C73602740 @default.
- W4313471212 hasConcept C77553402 @default.
- W4313471212 hasConceptScore W4313471212C105795698 @default.
- W4313471212 hasConceptScore W4313471212C106131492 @default.
- W4313471212 hasConceptScore W4313471212C106189395 @default.
- W4313471212 hasConceptScore W4313471212C107673813 @default.
- W4313471212 hasConceptScore W4313471212C11413529 @default.
- W4313471212 hasConceptScore W4313471212C115988155 @default.
- W4313471212 hasConceptScore W4313471212C126255220 @default.
- W4313471212 hasConceptScore W4313471212C13280743 @default.
- W4313471212 hasConceptScore W4313471212C134306372 @default.
- W4313471212 hasConceptScore W4313471212C140779682 @default.
- W4313471212 hasConceptScore W4313471212C154945302 @default.
- W4313471212 hasConceptScore W4313471212C159886148 @default.
- W4313471212 hasConceptScore W4313471212C17098449 @default.
- W4313471212 hasConceptScore W4313471212C185798385 @default.
- W4313471212 hasConceptScore W4313471212C205649164 @default.
- W4313471212 hasConceptScore W4313471212C31972630 @default.
- W4313471212 hasConceptScore W4313471212C33923547 @default.
- W4313471212 hasConceptScore W4313471212C41008148 @default.
- W4313471212 hasConceptScore W4313471212C51823790 @default.
- W4313471212 hasConceptScore W4313471212C73602740 @default.
- W4313471212 hasConceptScore W4313471212C77553402 @default.
- W4313471212 hasIssue "1" @default.
- W4313471212 hasLocation W43134712121 @default.
- W4313471212 hasLocation W43134712122 @default.
- W4313471212 hasOpenAccess W4313471212 @default.
- W4313471212 hasPrimaryLocation W43134712121 @default.
- W4313471212 hasRelatedWork W1971436483 @default.
- W4313471212 hasRelatedWork W2149476049 @default.
- W4313471212 hasRelatedWork W2151167757 @default.
- W4313471212 hasRelatedWork W2161367706 @default.
- W4313471212 hasRelatedWork W2570542232 @default.
- W4313471212 hasRelatedWork W3013781205 @default.
- W4313471212 hasRelatedWork W3121013427 @default.
- W4313471212 hasRelatedWork W4244700267 @default.
- W4313471212 hasRelatedWork W4313471212 @default.
- W4313471212 hasRelatedWork W91039700 @default.
- W4313471212 hasVolume "6" @default.
- W4313471212 isParatext "false" @default.
- W4313471212 isRetracted "false" @default.
- W4313471212 workType "article" @default.
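
The abstract above describes applying an ϵ-greedy bandit rule recursively at each stage of a finite-horizon MDP to estimate the "reward-to-go" value. The following is a minimal Python sketch of that general idea only; it is not the REGA algorithm from the paper. The function names (`epsilon_greedy_value`, `toy_step`), the fixed ϵ, the choice to return the best empirical mean, and the toy MDP are all assumptions made for illustration.

```python
import random


def toy_step(state, action, rng):
    """Hypothetical toy MDP: random next state, reward 1.0 only for action 1."""
    next_state = rng.randint(0, 1)
    reward = 1.0 if action == 1 else 0.0
    return next_state, reward


def epsilon_greedy_value(state, stage, horizon, actions, step, n_samples, epsilon, rng):
    """Estimate the reward-to-go of `state` at `stage` by recursive sampling.

    Each stage is treated as a bandit whose arms are the actions. Pulling an
    arm simulates one transition and recursively estimates the value of the
    sampled next state at the following stage.
    """
    if stage == horizon:
        return 0.0  # no reward remains after the final stage

    totals = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}

    def pull(action):
        next_state, reward = step(state, action, rng)
        future = epsilon_greedy_value(
            next_state, stage + 1, horizon, actions, step, n_samples, epsilon, rng
        )
        totals[action] += reward + future
        counts[action] += 1

    # Sample every action once so each empirical mean is defined.
    for a in actions:
        pull(a)

    # Spend the remaining budget with the epsilon-greedy rule:
    # explore uniformly with probability epsilon, otherwise exploit.
    for _ in range(max(0, n_samples - len(actions))):
        if rng.random() < epsilon:
            a = rng.choice(actions)
        else:
            a = max(actions, key=lambda b: totals[b] / counts[b])
        pull(a)

    # Assumption: use the best empirical mean as the value estimate.
    return max(totals[a] / counts[a] for a in actions)


rng = random.Random(0)
estimate = epsilon_greedy_value(0, 0, 3, [0, 1], toy_step, 8, 0.1, rng)
print(estimate)
```

The work per call grows as `n_samples ** horizon` and does not depend on the number of states, which is why sampling approaches of this kind are aimed at large state spaces with few actions. In this toy problem the rewards are deterministic, so the estimate equals the horizon length.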