Matches in SemOpenAlex for { <https://semopenalex.org/work/W4300128663> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W4300128663 abstract "Q-learning is a widely used algorithm in the reinforcement learning community. In the lookup-table setting, its convergence is well established. However, it is known to be unstable under linear function approximation. This paper develops a new Q-learning algorithm that converges when linear function approximation is used. We prove that simply adding an appropriate regularization term ensures convergence of the algorithm. We prove its stability using a recent analysis tool based on switching system models. Moreover, we experimentally show that it converges in environments where Q-learning with linear function approximation is known to diverge. We also provide an error bound on the solution to which the algorithm converges." @default.
- W4300128663 created "2022-10-03" @default.
- W4300128663 creator A5027028606 @default.
- W4300128663 creator A5074697684 @default.
- W4300128663 creator A5082767256 @default.
- W4300128663 date "2022-02-10" @default.
- W4300128663 modified "2023-10-18" @default.
- W4300128663 title "Regularized Q-learning" @default.
- W4300128663 doi "https://doi.org/10.48550/arxiv.2202.05404" @default.
- W4300128663 hasPublicationYear "2022" @default.
- W4300128663 type Work @default.
- W4300128663 citedByCount "0" @default.
- W4300128663 crossrefType "posted-content" @default.
- W4300128663 hasAuthorship W4300128663A5027028606 @default.
- W4300128663 hasAuthorship W4300128663A5074697684 @default.
- W4300128663 hasAuthorship W4300128663A5082767256 @default.
- W4300128663 hasBestOaLocation W43001286631 @default.
- W4300128663 hasConcept C112972136 @default.
- W4300128663 hasConcept C11413529 @default.
- W4300128663 hasConcept C119857082 @default.
- W4300128663 hasConcept C121332964 @default.
- W4300128663 hasConcept C122383733 @default.
- W4300128663 hasConcept C126255220 @default.
- W4300128663 hasConcept C14036430 @default.
- W4300128663 hasConcept C148764684 @default.
- W4300128663 hasConcept C154945302 @default.
- W4300128663 hasConcept C158622935 @default.
- W4300128663 hasConcept C160824197 @default.
- W4300128663 hasConcept C162324750 @default.
- W4300128663 hasConcept C2776135515 @default.
- W4300128663 hasConcept C2777303404 @default.
- W4300128663 hasConcept C28826006 @default.
- W4300128663 hasConcept C33923547 @default.
- W4300128663 hasConcept C41008148 @default.
- W4300128663 hasConcept C50522688 @default.
- W4300128663 hasConcept C50644808 @default.
- W4300128663 hasConcept C62520636 @default.
- W4300128663 hasConcept C78458016 @default.
- W4300128663 hasConcept C86803240 @default.
- W4300128663 hasConcept C91873725 @default.
- W4300128663 hasConcept C97541855 @default.
- W4300128663 hasConceptScore W4300128663C112972136 @default.
- W4300128663 hasConceptScore W4300128663C11413529 @default.
- W4300128663 hasConceptScore W4300128663C119857082 @default.
- W4300128663 hasConceptScore W4300128663C121332964 @default.
- W4300128663 hasConceptScore W4300128663C122383733 @default.
- W4300128663 hasConceptScore W4300128663C126255220 @default.
- W4300128663 hasConceptScore W4300128663C14036430 @default.
- W4300128663 hasConceptScore W4300128663C148764684 @default.
- W4300128663 hasConceptScore W4300128663C154945302 @default.
- W4300128663 hasConceptScore W4300128663C158622935 @default.
- W4300128663 hasConceptScore W4300128663C160824197 @default.
- W4300128663 hasConceptScore W4300128663C162324750 @default.
- W4300128663 hasConceptScore W4300128663C2776135515 @default.
- W4300128663 hasConceptScore W4300128663C2777303404 @default.
- W4300128663 hasConceptScore W4300128663C28826006 @default.
- W4300128663 hasConceptScore W4300128663C33923547 @default.
- W4300128663 hasConceptScore W4300128663C41008148 @default.
- W4300128663 hasConceptScore W4300128663C50522688 @default.
- W4300128663 hasConceptScore W4300128663C50644808 @default.
- W4300128663 hasConceptScore W4300128663C62520636 @default.
- W4300128663 hasConceptScore W4300128663C78458016 @default.
- W4300128663 hasConceptScore W4300128663C86803240 @default.
- W4300128663 hasConceptScore W4300128663C91873725 @default.
- W4300128663 hasConceptScore W4300128663C97541855 @default.
- W4300128663 hasLocation W43001286631 @default.
- W4300128663 hasOpenAccess W4300128663 @default.
- W4300128663 hasPrimaryLocation W43001286631 @default.
- W4300128663 hasRelatedWork W2093585199 @default.
- W4300128663 hasRelatedWork W2109330238 @default.
- W4300128663 hasRelatedWork W2330722182 @default.
- W4300128663 hasRelatedWork W2355070670 @default.
- W4300128663 hasRelatedWork W2498835000 @default.
- W4300128663 hasRelatedWork W2513431008 @default.
- W4300128663 hasRelatedWork W2885465588 @default.
- W4300128663 hasRelatedWork W3124157877 @default.
- W4300128663 hasRelatedWork W3177870706 @default.
- W4300128663 hasRelatedWork W4226128862 @default.
- W4300128663 isParatext "false" @default.
- W4300128663 isRetracted "false" @default.
- W4300128663 workType "article" @default.
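
The abstract in this record describes adding a regularization term to Q-learning with linear function approximation. The following is a minimal sketch of what one such update step could look like, assuming the regularizer is a simple shrinkage term `-eta * theta` added to the usual temporal-difference update. That form, the function name `regularized_q_update`, and all parameter names are illustrative assumptions; the record itself does not specify them.

```python
def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def regularized_q_update(theta, phi_sa, reward, next_phis, alpha, gamma, eta):
    """One Q-learning step with linear features plus a shrinkage term.

    theta     -- current weight vector, so that Q(s, a) = theta . phi(s, a)
    phi_sa    -- feature vector of the visited state-action pair
    reward    -- observed reward
    next_phis -- feature vectors of every action in the next state
                 (empty list if the next state is terminal)
    alpha     -- step size
    gamma     -- discount factor
    eta       -- regularization weight (assumed form: -eta * theta)
    """
    q_sa = dot(theta, phi_sa)
    # Greedy bootstrap target over the next state's actions.
    max_next = max((dot(theta, phi) for phi in next_phis), default=0.0)
    td_error = reward + gamma * max_next - q_sa
    # Standard semi-gradient step, pulled toward zero by the regularizer.
    return [t + alpha * (td_error * x - eta * t) for t, x in zip(theta, phi_sa)]


theta = [0.0, 0.0]
for _ in range(2):
    theta = regularized_q_update(
        theta, phi_sa=[1.0, 0.0], reward=1.0, next_phis=[[0.0, 1.0]],
        alpha=0.5, gamma=0.9, eta=0.1,
    )
print(theta)
```

With `eta = 0` the step reduces to ordinary Q-learning with linear function approximation, which is the setting the abstract describes as potentially unstable.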