SemOpenAlex

Matches in SemOpenAlex for { <https://semopenalex.org/work/W4200380940> ?p ?o ?g. }
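
The pattern above is shorthand for "every predicate ?p and object ?o attached to this work, in graph ?g". A minimal sketch of the equivalent SPARQL query, built as a plain string in Python, follows. The helper name `build_describe_query` and the `limit` parameter are hypothetical and are not part of SemOpenAlex. The graph variable is left out on the assumption that every match sits in the default graph, as the `@default` column in the listing suggests.

```python
def build_describe_query(subject_iri: str, limit: int = 100) -> str:
    """Return a SPARQL SELECT query listing every predicate/object pair of one subject.

    Hypothetical helper for illustration only. It queries the default graph,
    so the ?g column shown on the page is not selected.
    """
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{subject_iri}> ?p ?o .\n"
        "}\n"
        f"LIMIT {limit}"
    )


# IRI taken verbatim from the pattern shown on the page.
WORK_IRI = "https://semopenalex.org/work/W4200380940"

query = build_describe_query(WORK_IRI)
print(query)
```

The resulting string can be sent to any SPARQL 1.1 endpoint that hosts the data. The limit of 100 mirrors the page size reported by the browser and can be changed freely.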

W4200380940 abstract "This paper optimizes reinforcement learning (RL) parameters and makes agents solve problems more accurately and efficiently. The RL algorithm that the paper used is Q-learning, and the experimental environment is mazes. There are three parameters to influence the entire performance in RL, such as learning rate, greedy factor, and discount rate. This paper introduces a nonlinear time-varying strategy (NTV) into the Q-learning algorithm and uses a uniform experiment design (UED) to effectively obtain the best parameter combination. The agent can steadily figure out a solution to the mazes with the fewest steps. In conclusion, this paper proposes a Q-learning with NTV (NTV-Q-learning) to search the solutions and steadily get out of the mazes with minimal steps to prove the effective improvement of Q-Learning. From the experimental results, the solution obtained by NTV-Q-learning is better than Q-Learning." @default.
W4200380940 created "2021-12-31" @default.
W4200380940 creator A5016062587 @default.
W4200380940 creator A5023756898 @default.
W4200380940 creator A5042632224 @default.
W4200380940 creator A5054686089 @default.
W4200380940 creator A5059188982 @default.
W4200380940 date "2021-11-16" @default.
W4200380940 modified "2023-10-17" @default.
W4200380940 title "Parameters Optimization for Reinforcement Learning with Nonlinear Time-Varying Strategy by Using Uniform Experiment Design" @default.
W4200380940 doi "https://doi.org/10.1109/ispacs51563.2021.9651084" @default.
W4200380940 hasPublicationYear "2021" @default.
W4200380940 type Work @default.
W4200380940 citedByCount "0" @default.
W4200380940 crossrefType "proceedings-article" @default.
W4200380940 hasAuthorship W4200380940A5016062587 @default.
W4200380940 hasAuthorship W4200380940A5023756898 @default.
W4200380940 hasAuthorship W4200380940A5042632224 @default.
W4200380940 hasAuthorship W4200380940A5054686089 @default.
W4200380940 hasAuthorship W4200380940A5059188982 @default.
W4200380940 hasConcept C119857082 @default.
W4200380940 hasConcept C121332964 @default.
W4200380940 hasConcept C154945302 @default.
W4200380940 hasConcept C158622935 @default.
W4200380940 hasConcept C188116033 @default.
W4200380940 hasConcept C41008148 @default.
W4200380940 hasConcept C62520636 @default.
W4200380940 hasConcept C97541855 @default.
W4200380940 hasConceptScore W4200380940C119857082 @default.
W4200380940 hasConceptScore W4200380940C121332964 @default.
W4200380940 hasConceptScore W4200380940C154945302 @default.
W4200380940 hasConceptScore W4200380940C158622935 @default.
W4200380940 hasConceptScore W4200380940C188116033 @default.
W4200380940 hasConceptScore W4200380940C41008148 @default.
W4200380940 hasConceptScore W4200380940C62520636 @default.
W4200380940 hasConceptScore W4200380940C97541855 @default.
W4200380940 hasLocation W42003809401 @default.
W4200380940 hasOpenAccess W4200380940 @default.
W4200380940 hasPrimaryLocation W42003809401 @default.
W4200380940 hasRelatedWork W1562959674 @default.
W4200380940 hasRelatedWork W2923653485 @default.
W4200380940 hasRelatedWork W2951071805 @default.
W4200380940 hasRelatedWork W2957776456 @default.
W4200380940 hasRelatedWork W3022038857 @default.
W4200380940 hasRelatedWork W3025133396 @default.
W4200380940 hasRelatedWork W4206669594 @default.
W4200380940 hasRelatedWork W4220782901 @default.
W4200380940 hasRelatedWork W4289712363 @default.
W4200380940 hasRelatedWork W4319083788 @default.
W4200380940 isParatext "false" @default.
W4200380940 isRetracted "false" @default.
W4200380940 workType "article" @default.