Matches in SemOpenAlex for { <https://semopenalex.org/work/W2034294515> ?p ?o ?g. }
Showing items 1 to 82 of 82 with 100 items per page.
- W2034294515 abstract "To scale reinforcement learning up to large and complex problems, we propose an approach that partitions a large state space into multiple smaller state spaces based on critical states, thereby decomposing the learning task. During the learning process, we record every training episode and eliminate the state loops in it. We find that some states appear in these acyclic episodes with high probability (even probability 1). We call these states critical states. That is, according to the learned experience, an agent that wants to reach the goal state will pass through these critical states with high probability. The critical states can therefore be used to partition the state space so that the learning task is accomplished in stages. We also prove that the optimal policy found in the partitioned smaller state spaces is equivalent to the optimal policy found in the original state space. Experimental comparisons between Q-learning and Q-learning with critical states demonstrate that our approach is more effective. More importantly, our approach sheds light on how an agent can use its own experience to plan learning for better performance." @default.
- W2034294515 created "2016-06-24" @default.
- W2034294515 creator A5019598467 @default.
- W2034294515 creator A5038112478 @default.
- W2034294515 creator A5082836130 @default.
- W2034294515 date "2009-10-01" @default.
- W2034294515 modified "2023-09-26" @default.
- W2034294515 title "Partitioning the state space by critical states" @default.
- W2034294515 cites W1964150045 @default.
- W2034294515 cites W2017912529 @default.
- W2034294515 cites W2029733418 @default.
- W2034294515 cites W2070301851 @default.
- W2034294515 cites W2072493968 @default.
- W2034294515 cites W2082709276 @default.
- W2034294515 cites W2112597558 @default.
- W2034294515 cites W2121517924 @default.
- W2034294515 cites W2121863487 @default.
- W2034294515 cites W32403112 @default.
- W2034294515 doi "https://doi.org/10.1109/bicta.2009.5338123" @default.
- W2034294515 hasPublicationYear "2009" @default.
- W2034294515 type Work @default.
- W2034294515 sameAs 2034294515 @default.
- W2034294515 citedByCount "4" @default.
- W2034294515 countsByYear W20342945152013 @default.
- W2034294515 countsByYear W20342945152015 @default.
- W2034294515 countsByYear W20342945152019 @default.
- W2034294515 countsByYear W20342945152020 @default.
- W2034294515 crossrefType "proceedings-article" @default.
- W2034294515 hasAuthorship W2034294515A5019598467 @default.
- W2034294515 hasAuthorship W2034294515A5038112478 @default.
- W2034294515 hasAuthorship W2034294515A5082836130 @default.
- W2034294515 hasConcept C105795698 @default.
- W2034294515 hasConcept C111919701 @default.
- W2034294515 hasConcept C11413529 @default.
- W2034294515 hasConcept C114614502 @default.
- W2034294515 hasConcept C127413603 @default.
- W2034294515 hasConcept C154945302 @default.
- W2034294515 hasConcept C188116033 @default.
- W2034294515 hasConcept C201995342 @default.
- W2034294515 hasConcept C2778572836 @default.
- W2034294515 hasConcept C2780451532 @default.
- W2034294515 hasConcept C33923547 @default.
- W2034294515 hasConcept C41008148 @default.
- W2034294515 hasConcept C42812 @default.
- W2034294515 hasConcept C48103436 @default.
- W2034294515 hasConcept C72434380 @default.
- W2034294515 hasConcept C80444323 @default.
- W2034294515 hasConcept C97541855 @default.
- W2034294515 hasConceptScore W2034294515C105795698 @default.
- W2034294515 hasConceptScore W2034294515C111919701 @default.
- W2034294515 hasConceptScore W2034294515C11413529 @default.
- W2034294515 hasConceptScore W2034294515C114614502 @default.
- W2034294515 hasConceptScore W2034294515C127413603 @default.
- W2034294515 hasConceptScore W2034294515C154945302 @default.
- W2034294515 hasConceptScore W2034294515C188116033 @default.
- W2034294515 hasConceptScore W2034294515C201995342 @default.
- W2034294515 hasConceptScore W2034294515C2778572836 @default.
- W2034294515 hasConceptScore W2034294515C2780451532 @default.
- W2034294515 hasConceptScore W2034294515C33923547 @default.
- W2034294515 hasConceptScore W2034294515C41008148 @default.
- W2034294515 hasConceptScore W2034294515C42812 @default.
- W2034294515 hasConceptScore W2034294515C48103436 @default.
- W2034294515 hasConceptScore W2034294515C72434380 @default.
- W2034294515 hasConceptScore W2034294515C80444323 @default.
- W2034294515 hasConceptScore W2034294515C97541855 @default.
- W2034294515 hasLocation W20342945151 @default.
- W2034294515 hasOpenAccess W2034294515 @default.
- W2034294515 hasPrimaryLocation W20342945151 @default.
- W2034294515 hasRelatedWork W2094557321 @default.
- W2034294515 hasRelatedWork W2123899227 @default.
- W2034294515 hasRelatedWork W2923653485 @default.
- W2034294515 hasRelatedWork W3090436287 @default.
- W2034294515 hasRelatedWork W3097708648 @default.
- W2034294515 hasRelatedWork W3103643887 @default.
- W2034294515 hasRelatedWork W3173185086 @default.
- W2034294515 hasRelatedWork W3196472998 @default.
- W2034294515 hasRelatedWork W4220761930 @default.
- W2034294515 hasRelatedWork W4287606906 @default.
- W2034294515 isParatext "false" @default.
- W2034294515 isRetracted "false" @default.
- W2034294515 magId "2034294515" @default.
- W2034294515 workType "article" @default.
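
The abstract above describes recording training episodes, removing state loops from them, and treating states that appear in nearly all of the resulting acyclic episodes as critical states. The following Python sketch illustrates that idea only; it is not the authors' implementation. The function names `remove_loops` and `critical_states`, the `threshold` value of 0.9, and the toy episodes are all assumptions made for illustration, and the abstract does not specify how the frequency cutoff is chosen.

```python
from collections import Counter


def remove_loops(episode):
    """Return the episode with state loops cut out, so each state appears once.

    Hypothetical helper: when a state is revisited, everything recorded
    since its first visit is discarded.
    """
    path = []
    position = {}  # state -> index of that state in path
    for state in episode:
        if state in position:
            cut = position[state] + 1
            for dropped in path[cut:]:
                del position[dropped]
            del path[cut:]
        else:
            position[state] = len(path)
            path.append(state)
    return path


def critical_states(episodes, threshold=0.9):
    """Return states found in at least `threshold` of the acyclic episodes.

    The threshold is an assumed parameter, not a value from the source.
    """
    acyclic = [remove_loops(ep) for ep in episodes]
    counts = Counter(s for path in acyclic for s in set(path))
    n = len(acyclic)
    return {s for s, c in counts.items() if c / n >= threshold}


# Toy episodes (made up): every route from "s0" to goal "g" uses "door".
episodes = [
    ["s0", "a", "door", "g"],
    ["s0", "b", "a", "b", "door", "c", "g"],
    ["s0", "b", "door", "door", "g"],
]

# The start and goal states appear in every episode trivially, so they are
# excluded here to leave only the intermediate critical states.
candidates = critical_states(episodes) - {"s0", "g"}
```

In this toy data, `candidates` contains only `"door"`, which could then serve as a subgoal that splits the task into two stages: reach the door, then reach the goal.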