Matches in SemOpenAlex for { <https://semopenalex.org/work/W777543669> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W777543669 abstract "Classical value estimation reinforcement learning algorithms do not perform very well in dynamic environments. On the other hand, the reinforcement learning of animals is quite flexible: they can adapt to dynamic environments very quickly and deal with noisy inputs very effectively. One feature that may contribute to animals' good performance in dynamic environments is that they learn and perceive the time to reward. In this research, we attempt to learn and perceive the time to reward and explore situations where the learned time information can be used to improve the performance of the learning agent in dynamic environments. The type of dynamic environments that we are interested in is that type of switching environment which stays the same for a long time, then changes abruptly, and then holds for a long time before another change. The type of dynamics that we mainly focus on is the time to reward, though we also extend the ideas to learning and perceiving other criteria of optimality, e.g. the discounted return, so that they can still work even when the amount of reward may also change. Specifically, both the mean and variance of the time to reward are learned and then used to detect changes in the environment and to decide whether the agent should give up a suboptimal action. When a change in the environment is detected, the learning agent responds specifically to the change in order to recover quickly from it. When it is found that the current action is still worse than the optimal one, the agent gives up this time's exploration of the action and then remakes its decision in order to avoid longer than necessary exploration. 
The results of our experiments using two real-world problems show that they have effectively sped up learning, reduced the time taken to recover from environmental changes, and improved the performance of the agent after the learning converges in most of the test cases compared with classical value estimation reinforcement learning algorithms. In addition, we have successfully used spiking neurons to implement various phenomena of classical conditioning, the simplest form of animal reinforcement learning in dynamic environments, and also pointed out a possible implementation of instrumental conditioning and general reinforcement learning using similar models." @default.
- W777543669 created "2016-06-24" @default.
- W777543669 creator A5058544003 @default.
- W777543669 date "2012-05-23" @default.
- W777543669 modified "2023-09-24" @default.
- W777543669 title "Reinforcement learning with time perception" @default.
- W777543669 cites W100894287 @default.
- W777543669 cites W1566451225 @default.
- W777543669 cites W2070649993 @default.
- W777543669 cites W2114475956 @default.
- W777543669 cites W2169528473 @default.
- W777543669 cites W84774806 @default.
- W777543669 hasPublicationYear "2012" @default.
- W777543669 type Work @default.
- W777543669 sameAs 777543669 @default.
- W777543669 citedByCount "0" @default.
- W777543669 crossrefType "dissertation" @default.
- W777543669 hasAuthorship W777543669A5058544003 @default.
- W777543669 hasConcept C119857082 @default.
- W777543669 hasConcept C120665830 @default.
- W777543669 hasConcept C121332964 @default.
- W777543669 hasConcept C121955636 @default.
- W777543669 hasConcept C138885662 @default.
- W777543669 hasConcept C144133560 @default.
- W777543669 hasConcept C154945302 @default.
- W777543669 hasConcept C15744967 @default.
- W777543669 hasConcept C169760540 @default.
- W777543669 hasConcept C192209626 @default.
- W777543669 hasConcept C196083921 @default.
- W777543669 hasConcept C196340769 @default.
- W777543669 hasConcept C26760741 @default.
- W777543669 hasConcept C2776291640 @default.
- W777543669 hasConcept C2776401178 @default.
- W777543669 hasConcept C2780791683 @default.
- W777543669 hasConcept C41008148 @default.
- W777543669 hasConcept C41895202 @default.
- W777543669 hasConcept C62520636 @default.
- W777543669 hasConcept C67203356 @default.
- W777543669 hasConcept C77805123 @default.
- W777543669 hasConcept C97541855 @default.
- W777543669 hasConceptScore W777543669C119857082 @default.
- W777543669 hasConceptScore W777543669C120665830 @default.
- W777543669 hasConceptScore W777543669C121332964 @default.
- W777543669 hasConceptScore W777543669C121955636 @default.
- W777543669 hasConceptScore W777543669C138885662 @default.
- W777543669 hasConceptScore W777543669C144133560 @default.
- W777543669 hasConceptScore W777543669C154945302 @default.
- W777543669 hasConceptScore W777543669C15744967 @default.
- W777543669 hasConceptScore W777543669C169760540 @default.
- W777543669 hasConceptScore W777543669C192209626 @default.
- W777543669 hasConceptScore W777543669C196083921 @default.
- W777543669 hasConceptScore W777543669C196340769 @default.
- W777543669 hasConceptScore W777543669C26760741 @default.
- W777543669 hasConceptScore W777543669C2776291640 @default.
- W777543669 hasConceptScore W777543669C2776401178 @default.
- W777543669 hasConceptScore W777543669C2780791683 @default.
- W777543669 hasConceptScore W777543669C41008148 @default.
- W777543669 hasConceptScore W777543669C41895202 @default.
- W777543669 hasConceptScore W777543669C62520636 @default.
- W777543669 hasConceptScore W777543669C67203356 @default.
- W777543669 hasConceptScore W777543669C77805123 @default.
- W777543669 hasConceptScore W777543669C97541855 @default.
- W777543669 hasLocation W7775436691 @default.
- W777543669 hasOpenAccess W777543669 @default.
- W777543669 hasPrimaryLocation W7775436691 @default.
- W777543669 hasRelatedWork W1576253121 @default.
- W777543669 hasRelatedWork W168445157 @default.
- W777543669 hasRelatedWork W1812824539 @default.
- W777543669 hasRelatedWork W1972847450 @default.
- W777543669 hasRelatedWork W1999874108 @default.
- W777543669 hasRelatedWork W2041191585 @default.
- W777543669 hasRelatedWork W2209229337 @default.
- W777543669 hasRelatedWork W2475178090 @default.
- W777543669 hasRelatedWork W2515409829 @default.
- W777543669 hasRelatedWork W2592605409 @default.
- W777543669 hasRelatedWork W2757609746 @default.
- W777543669 hasRelatedWork W2915060045 @default.
- W777543669 hasRelatedWork W2963508354 @default.
- W777543669 hasRelatedWork W3000606155 @default.
- W777543669 hasRelatedWork W3007958381 @default.
- W777543669 hasRelatedWork W3098951764 @default.
- W777543669 hasRelatedWork W3106309118 @default.
- W777543669 hasRelatedWork W3152815381 @default.
- W777543669 hasRelatedWork W3173218700 @default.
- W777543669 hasRelatedWork W3197006425 @default.
- W777543669 isParatext "false" @default.
- W777543669 isRetracted "false" @default.
- W777543669 magId "777543669" @default.
- W777543669 workType "dissertation" @default.
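
The abstract in this record describes learning both the mean and the variance of the time to reward, using them to detect changes in the environment, and abandoning an exploratory action once it is clearly worse than the best one. The sketch below illustrates that idea only; it is not the dissertation's implementation. The class name `TimeToRewardTracker`, the `threshold` and `min_samples` parameters, the use of Welford's running-variance update, and the simple "waited longer than the best mean" give-up rule are all assumptions made for illustration.

```python
import math


class TimeToRewardTracker:
    """Running mean/variance of the time to reward for one action.

    Hypothetical helper: the statistics are kept with Welford's online
    update, which is an assumption, not a detail taken from the record.
    """

    def __init__(self, threshold=3.0, min_samples=5):
        self.threshold = threshold      # assumed: deviation (in std units) treated as a change
        self.min_samples = min_samples  # assumed: observations required before testing
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0                  # sum of squared deviations from the mean

    @property
    def variance(self):
        # Sample variance; zero until at least two observations exist.
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0

    def update(self, elapsed):
        """Fold one observed time to reward into the running statistics."""
        self.count += 1
        delta = elapsed - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (elapsed - self.mean)

    def is_change(self, elapsed):
        """Flag an observation that lies far outside the learned distribution."""
        if self.count < self.min_samples:
            return False
        std = math.sqrt(self.variance)
        return abs(elapsed - self.mean) > self.threshold * max(std, 1e-9)

    def reset(self):
        """Forget the old statistics after a detected change."""
        self.count, self.mean, self._m2 = 0, 0.0, 0.0


def should_give_up(waited, best_mean_time):
    """Assumed rule: stop exploring once the wait exceeds the best action's mean."""
    return waited > best_mean_time


tracker = TimeToRewardTracker()
for t in [10, 11, 9, 10, 10]:
    tracker.update(t)

normal = tracker.is_change(10.5)   # close to the learned mean
abrupt = tracker.is_change(25)     # far outside the learned spread
if abrupt:
    tracker.reset()                # relearn quickly after the switch
```

In a switching environment of the kind the abstract describes, an agent built along these lines would keep one tracker per action, call `update` after each reward, and call `reset` when `is_change` fires so that stale statistics do not slow recovery.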