Matches in SemOpenAlex for { <https://semopenalex.org/work/W2562681634> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W2562681634 abstract "This paper presents a new method to learn online policies in continuous state, continuous action, model-free Markov decision processes, with two properties that are crucial for practical applications. First, the policies are implementable with a very low computational cost: once the policy is computed, the action corresponding to a given state is obtained in logarithmic time with respect to the number of samples used. Second, our method is versatile: it does not rely on any a priori knowledge of the structure of optimal policies. We build upon the Fitted Q-iteration algorithm which represents the $Q$-value as the average of several regression trees. Our algorithm, the Fitted Policy Forest algorithm (FPF), computes a regression forest representing the Q-value and transforms it into a single tree representing the policy, while keeping control on the size of the policy using resampling and leaf merging. We introduce an adaptation of Multi-Resolution Exploration (MRE) which is particularly suited to FPF. We assess the performance of FPF on three classical benchmarks for reinforcement learning: the Inverted Pendulum, the Double Integrator and Car on the Hill and show that FPF equals or outperforms other algorithms, although these algorithms rely on the use of particular representations of the policies, especially chosen in order to fit each of the three problems. Finally, we exhibit that the combination of FPF and MRE allows to find nearly optimal solutions in problems where $\epsilon$-greedy approaches would fail." @default.
- W2562681634 created "2017-01-06" @default.
- W2562681634 creator A5055688187 @default.
- W2562681634 creator A5056775678 @default.
- W2562681634 date "2016-06-13" @default.
- W2562681634 modified "2023-09-26" @default.
- W2562681634 title "Online Reinforcement Learning for Real-Time Exploration in Continuous State and Action Markov Decision Processes" @default.
- W2562681634 cites W1626155273 @default.
- W2562681634 cites W1971845780 @default.
- W2562681634 cites W1982948368 @default.
- W2562681634 cites W2056132907 @default.
- W2562681634 cites W2068052921 @default.
- W2562681634 cites W2097451572 @default.
- W2562681634 cites W2097846877 @default.
- W2562681634 cites W2106907982 @default.
- W2562681634 cites W2119567691 @default.
- W2562681634 cites W2120346334 @default.
- W2562681634 cites W2125612430 @default.
- W2562681634 cites W2149863006 @default.
- W2562681634 cites W2150923691 @default.
- W2562681634 cites W2154549708 @default.
- W2562681634 cites W2197494948 @default.
- W2562681634 cites W2214981627 @default.
- W2562681634 cites W2219400982 @default.
- W2562681634 cites W2912934387 @default.
- W2562681634 hasPublicationYear "2016" @default.
- W2562681634 type Work @default.
- W2562681634 sameAs 2562681634 @default.
- W2562681634 citedByCount "0" @default.
- W2562681634 crossrefType "proceedings-article" @default.
- W2562681634 hasAuthorship W2562681634A5055688187 @default.
- W2562681634 hasAuthorship W2562681634A5056775678 @default.
- W2562681634 hasBestOaLocation W25626816342 @default.
- W2562681634 hasConcept C105795698 @default.
- W2562681634 hasConcept C106189395 @default.
- W2562681634 hasConcept C11413529 @default.
- W2562681634 hasConcept C119857082 @default.
- W2562681634 hasConcept C121332964 @default.
- W2562681634 hasConcept C154945302 @default.
- W2562681634 hasConcept C159886148 @default.
- W2562681634 hasConcept C163836022 @default.
- W2562681634 hasConcept C17098449 @default.
- W2562681634 hasConcept C2780791683 @default.
- W2562681634 hasConcept C33923547 @default.
- W2562681634 hasConcept C41008148 @default.
- W2562681634 hasConcept C48103436 @default.
- W2562681634 hasConcept C62520636 @default.
- W2562681634 hasConcept C97541855 @default.
- W2562681634 hasConcept C98763669 @default.
- W2562681634 hasConceptScore W2562681634C105795698 @default.
- W2562681634 hasConceptScore W2562681634C106189395 @default.
- W2562681634 hasConceptScore W2562681634C11413529 @default.
- W2562681634 hasConceptScore W2562681634C119857082 @default.
- W2562681634 hasConceptScore W2562681634C121332964 @default.
- W2562681634 hasConceptScore W2562681634C154945302 @default.
- W2562681634 hasConceptScore W2562681634C159886148 @default.
- W2562681634 hasConceptScore W2562681634C163836022 @default.
- W2562681634 hasConceptScore W2562681634C17098449 @default.
- W2562681634 hasConceptScore W2562681634C2780791683 @default.
- W2562681634 hasConceptScore W2562681634C33923547 @default.
- W2562681634 hasConceptScore W2562681634C41008148 @default.
- W2562681634 hasConceptScore W2562681634C48103436 @default.
- W2562681634 hasConceptScore W2562681634C62520636 @default.
- W2562681634 hasConceptScore W2562681634C97541855 @default.
- W2562681634 hasConceptScore W2562681634C98763669 @default.
- W2562681634 hasLocation W25626816341 @default.
- W2562681634 hasLocation W25626816342 @default.
- W2562681634 hasOpenAccess W2562681634 @default.
- W2562681634 hasPrimaryLocation W25626816341 @default.
- W2562681634 hasRelatedWork W1515117609 @default.
- W2562681634 hasRelatedWork W1551379884 @default.
- W2562681634 hasRelatedWork W1563041104 @default.
- W2562681634 hasRelatedWork W2026691440 @default.
- W2562681634 hasRelatedWork W2095807485 @default.
- W2562681634 hasRelatedWork W2146763310 @default.
- W2562681634 hasRelatedWork W2156371714 @default.
- W2562681634 hasRelatedWork W2356987663 @default.
- W2562681634 hasRelatedWork W4285429136 @default.
- W2562681634 hasRelatedWork W2096496337 @default.
- W2562681634 isParatext "false" @default.
- W2562681634 isRetracted "false" @default.
- W2562681634 magId "2562681634" @default.
- W2562681634 workType "article" @default.
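
Each row above follows the quad pattern from the first line: subject, predicate (`?p`), object (`?o`), and graph (`?g`, here always `default`). The Python sketch below shows one way to split rows of this shape and group objects by predicate. It assumes the row layout `- <subject> <predicate> <object> @<graph>.` as printed in this listing; the names `ROW` and `parse_rows` are hypothetical and are not part of any SemOpenAlex tooling.

```python
import re
from collections import defaultdict

# Assumed row shape, taken from this listing only:
#   "- <subject> <predicate> <object> @<graph>."
ROW = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')

def parse_rows(lines):
    """Group objects by predicate for rows that match the assumed shape."""
    by_predicate = defaultdict(list)
    for line in lines:
        m = ROW.match(line.strip())
        if m is None:
            continue  # skip headers or anything not shaped like a row
        _subject, predicate, obj, _graph = m.groups()
        # Literal values are quoted in the listing; drop the outer quotes.
        by_predicate[predicate].append(obj.strip('"'))
    return dict(by_predicate)

# A few rows copied from the listing above.
sample = [
    '- W2562681634 date "2016-06-13" @default.',
    '- W2562681634 cites W1626155273 @default.',
    '- W2562681634 cites W1971845780 @default.',
    '- W2562681634 type Work @default.',
]
rows = parse_rows(sample)
print(rows["cites"])
```

Grouping by predicate makes repeated properties such as `cites`, `hasConcept`, and `hasRelatedWork` easy to count or iterate over, while single-valued properties such as `date` come back as one-element lists.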