Matches in SemOpenAlex for { <https://semopenalex.org/work/W4200380940> ?p ?o ?g. }
Showing items 1 to 52 of 52 with 100 items per page.
- W4200380940 abstract "This paper optimizes reinforcement learning (RL) parameters so that agents solve problems more accurately and efficiently. The RL algorithm used is Q-learning, and the experimental environment is a set of mazes. Three parameters influence overall RL performance: the learning rate, the greedy factor, and the discount rate. This paper introduces a nonlinear time-varying (NTV) strategy into the Q-learning algorithm and uses a uniform experiment design (UED) to efficiently obtain the best parameter combination. The resulting method, Q-learning with NTV (NTV-Q-learning), lets the agent reliably solve the mazes in the fewest steps. In the experiments, the solutions obtained by NTV-Q-learning are better than those obtained by Q-learning." @default.
- W4200380940 created "2021-12-31" @default.
- W4200380940 creator A5016062587 @default.
- W4200380940 creator A5023756898 @default.
- W4200380940 creator A5042632224 @default.
- W4200380940 creator A5054686089 @default.
- W4200380940 creator A5059188982 @default.
- W4200380940 date "2021-11-16" @default.
- W4200380940 modified "2023-10-17" @default.
- W4200380940 title "Parameters Optimization for Reinforcement Learning with Nonlinear Time-Varying Strategy by Using Uniform Experiment Design" @default.
- W4200380940 doi "https://doi.org/10.1109/ispacs51563.2021.9651084" @default.
- W4200380940 hasPublicationYear "2021" @default.
- W4200380940 type Work @default.
- W4200380940 citedByCount "0" @default.
- W4200380940 crossrefType "proceedings-article" @default.
- W4200380940 hasAuthorship W4200380940A5016062587 @default.
- W4200380940 hasAuthorship W4200380940A5023756898 @default.
- W4200380940 hasAuthorship W4200380940A5042632224 @default.
- W4200380940 hasAuthorship W4200380940A5054686089 @default.
- W4200380940 hasAuthorship W4200380940A5059188982 @default.
- W4200380940 hasConcept C119857082 @default.
- W4200380940 hasConcept C121332964 @default.
- W4200380940 hasConcept C154945302 @default.
- W4200380940 hasConcept C158622935 @default.
- W4200380940 hasConcept C188116033 @default.
- W4200380940 hasConcept C41008148 @default.
- W4200380940 hasConcept C62520636 @default.
- W4200380940 hasConcept C97541855 @default.
- W4200380940 hasConceptScore W4200380940C119857082 @default.
- W4200380940 hasConceptScore W4200380940C121332964 @default.
- W4200380940 hasConceptScore W4200380940C154945302 @default.
- W4200380940 hasConceptScore W4200380940C158622935 @default.
- W4200380940 hasConceptScore W4200380940C188116033 @default.
- W4200380940 hasConceptScore W4200380940C41008148 @default.
- W4200380940 hasConceptScore W4200380940C62520636 @default.
- W4200380940 hasConceptScore W4200380940C97541855 @default.
- W4200380940 hasLocation W42003809401 @default.
- W4200380940 hasOpenAccess W4200380940 @default.
- W4200380940 hasPrimaryLocation W42003809401 @default.
- W4200380940 hasRelatedWork W1562959674 @default.
- W4200380940 hasRelatedWork W2923653485 @default.
- W4200380940 hasRelatedWork W2951071805 @default.
- W4200380940 hasRelatedWork W2957776456 @default.
- W4200380940 hasRelatedWork W3022038857 @default.
- W4200380940 hasRelatedWork W3025133396 @default.
- W4200380940 hasRelatedWork W4206669594 @default.
- W4200380940 hasRelatedWork W4220782901 @default.
- W4200380940 hasRelatedWork W4289712363 @default.
- W4200380940 hasRelatedWork W4319083788 @default.
- W4200380940 isParatext "false" @default.
- W4200380940 isRetracted "false" @default.
- W4200380940 workType "article" @default.
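
The abstract above describes Q-learning whose parameters change over time in a nonlinear way. The record does not give the paper's actual NTV formula, maze layouts, or UED table, so the following is only a minimal sketch under stated assumptions: the power-law schedule `ntv`, the 4x4 `MAZE`, the reward values, and the start and end values of each parameter are all hypothetical, and the discount rate is held fixed for brevity. It shows one plausible way to decay the learning rate and the greedy factor nonlinearly across episodes in tabular Q-learning.

```python
import random

def ntv(t, t_max, start, end, power=2.0):
    """Hypothetical nonlinear time-varying schedule (an assumption, not the
    paper's formula): moves from `start` at t=0 to `end` at t=t_max."""
    frac = min(max(t / t_max, 0.0), 1.0)
    return end + (start - end) * (1.0 - frac) ** power

# Hypothetical maze: 0 = free cell, 1 = wall.
MAZE = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply one move; hitting a wall or the border leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < 4 and 0 <= c < 4) or MAZE[r][c] == 1:
        return state, -1.0, False
    if (r, c) == GOAL:
        return (r, c), 10.0, True
    return (r, c), -0.1, False

def train(episodes=300, max_steps=100, gamma=0.9, seed=0):
    """Tabular Q-learning with time-varying learning rate and greedy factor."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(4) for c in range(4)}
    last = max(episodes - 1, 1)
    for ep in range(episodes):
        alpha = ntv(ep, last, 0.9, 0.1)     # learning rate (assumed range)
        epsilon = ntv(ep, last, 1.0, 0.05)  # exploration probability (assumed range)
        state = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(4)  # explore
            else:
                a = max(range(4), key=lambda i: q[state][i])  # exploit
            nxt, reward, done = step(state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q

def greedy_steps(q, limit=50):
    """Follow the greedy policy; return the step count, or None if the goal is not reached."""
    state, n = START, 0
    while state != GOAL and n < limit:
        a = max(range(4), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        n += 1
    return n if state == GOAL else None
```

Calling `greedy_steps(train())` reports how many moves the learned greedy policy needs in this toy maze. A uniform experiment design would then rerun `train` over a small, evenly spread set of parameter combinations (for example different `power`, `gamma`, and start and end values) and keep the best one; that search is not shown here.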