Matches in SemOpenAlex for { <https://semopenalex.org/work/W136366603> ?p ?o ?g. }
- W136366603 endingPage "426" @default.
- W136366603 startingPage "421" @default.
- W136366603 abstract "To operate effectively in complex environments learning agents have to selectively ignore irrelevant details by forming useful abstractions. In this paper we outline a formulation of abstraction for reinforcement learning approaches to stochastic decision problems by extending one of the recent minimization models, known as ε-reduction. The technique presented here extends ε-reduction to SMDPs by executing a policy instead of a single action, and grouping all states which have a small difference in transition probabilities and reward function under a given policy. When the reward structure is not known or multiple tasks need to be learned on the same environments, a two-phase method for state aggregation is introduced and a theorem in this paper shows the solvability of tasks using the two-phase method partitions. Simulations of different state spaces show that the policies in both MDP and this representation achieve similar results and that the total learning time in the partition space is much smaller than the total amount of time spent on learning in the original state space." @default.
- W136366603 created "2016-06-24" @default.
- W136366603 creator A5044329397 @default.
- W136366603 creator A5047174917 @default.
- W136366603 date "2005-01-01" @default.
- W136366603 modified "2023-09-28" @default.
- W136366603 title "Action Dependent State Space Abstraction for Hierarchical Learning Systems." @default.
- W136366603 cites W134602879 @default.
- W136366603 cites W1552562496 @default.
- W136366603 cites W1568042657 @default.
- W136366603 cites W1575633987 @default.
- W136366603 cites W1586162706 @default.
- W136366603 cites W1592847719 @default.
- W136366603 cites W1650504995 @default.
- W136366603 cites W1800916125 @default.
- W136366603 cites W1982678075 @default.
- W136366603 cites W1988217924 @default.
- W136366603 cites W1993711637 @default.
- W136366603 cites W2036389743 @default.
- W136366603 cites W2038694949 @default.
- W136366603 cites W2058735307 @default.
- W136366603 cites W2061504687 @default.
- W136366603 cites W2064134784 @default.
- W136366603 cites W2075379212 @default.
- W136366603 cites W2102000945 @default.
- W136366603 cites W2118318536 @default.
- W136366603 cites W2132096648 @default.
- W136366603 cites W2153947321 @default.
- W136366603 cites W2159142421 @default.
- W136366603 cites W2323009482 @default.
- W136366603 cites W2334782222 @default.
- W136366603 cites W2341171179 @default.
- W136366603 hasPublicationYear "2005" @default.
- W136366603 type Work @default.
- W136366603 sameAs 136366603 @default.
- W136366603 citedByCount "0" @default.
- W136366603 crossrefType "journal-article" @default.
- W136366603 hasAuthorship W136366603A5044329397 @default.
- W136366603 hasAuthorship W136366603A5047174917 @default.
- W136366603 hasConcept C105795698 @default.
- W136366603 hasConcept C106189395 @default.
- W136366603 hasConcept C111335779 @default.
- W136366603 hasConcept C111472728 @default.
- W136366603 hasConcept C111919701 @default.
- W136366603 hasConcept C11413529 @default.
- W136366603 hasConcept C114614502 @default.
- W136366603 hasConcept C119857082 @default.
- W136366603 hasConcept C121332964 @default.
- W136366603 hasConcept C124304363 @default.
- W136366603 hasConcept C138885662 @default.
- W136366603 hasConcept C154945302 @default.
- W136366603 hasConcept C159886148 @default.
- W136366603 hasConcept C17744445 @default.
- W136366603 hasConcept C199539241 @default.
- W136366603 hasConcept C2524010 @default.
- W136366603 hasConcept C2776359362 @default.
- W136366603 hasConcept C2778572836 @default.
- W136366603 hasConcept C2780791683 @default.
- W136366603 hasConcept C33923547 @default.
- W136366603 hasConcept C41008148 @default.
- W136366603 hasConcept C42812 @default.
- W136366603 hasConcept C48103436 @default.
- W136366603 hasConcept C62520636 @default.
- W136366603 hasConcept C72434380 @default.
- W136366603 hasConcept C80444323 @default.
- W136366603 hasConcept C94625758 @default.
- W136366603 hasConcept C97541855 @default.
- W136366603 hasConceptScore W136366603C105795698 @default.
- W136366603 hasConceptScore W136366603C106189395 @default.
- W136366603 hasConceptScore W136366603C111335779 @default.
- W136366603 hasConceptScore W136366603C111472728 @default.
- W136366603 hasConceptScore W136366603C111919701 @default.
- W136366603 hasConceptScore W136366603C11413529 @default.
- W136366603 hasConceptScore W136366603C114614502 @default.
- W136366603 hasConceptScore W136366603C119857082 @default.
- W136366603 hasConceptScore W136366603C121332964 @default.
- W136366603 hasConceptScore W136366603C124304363 @default.
- W136366603 hasConceptScore W136366603C138885662 @default.
- W136366603 hasConceptScore W136366603C154945302 @default.
- W136366603 hasConceptScore W136366603C159886148 @default.
- W136366603 hasConceptScore W136366603C17744445 @default.
- W136366603 hasConceptScore W136366603C199539241 @default.
- W136366603 hasConceptScore W136366603C2524010 @default.
- W136366603 hasConceptScore W136366603C2776359362 @default.
- W136366603 hasConceptScore W136366603C2778572836 @default.
- W136366603 hasConceptScore W136366603C2780791683 @default.
- W136366603 hasConceptScore W136366603C33923547 @default.
- W136366603 hasConceptScore W136366603C41008148 @default.
- W136366603 hasConceptScore W136366603C42812 @default.
- W136366603 hasConceptScore W136366603C48103436 @default.
- W136366603 hasConceptScore W136366603C62520636 @default.
- W136366603 hasConceptScore W136366603C72434380 @default.
- W136366603 hasConceptScore W136366603C80444323 @default.
- W136366603 hasConceptScore W136366603C94625758 @default.
- W136366603 hasConceptScore W136366603C97541855 @default.
- W136366603 hasLocation W1363666031 @default.
- W136366603 hasOpenAccess W136366603 @default.
- W136366603 hasPrimaryLocation W1363666031 @default.