Matches in SemOpenAlex for { <https://semopenalex.org/work/W2945805962> ?p ?o ?g. }
Showing items 1 to 91 of 91 with 100 items per page.
- W2945805962 abstract "Temporal-difference learning (TD), coupled with neural networks, is among the most fundamental building blocks of deep reinforcement learning. However, due to the nonlinearity in value function approximation, such a coupling leads to nonconvexity and even divergence in optimization. As a result, the global convergence of neural TD remains unclear. In this paper, we prove for the first time that neural TD converges at a sublinear rate to the global optimum of the mean-squared projected Bellman error for policy evaluation. In particular, we show how such global convergence is enabled by the overparametrization of neural networks, which also plays a vital role in the empirical success of neural TD. Beyond policy evaluation, we establish the global convergence of neural (soft) Q-learning, which is further connected to that of policy gradient algorithms." @default.
- W2945805962 created "2019-05-29" @default.
- W2945805962 creator A5021536336 @default.
- W2945805962 creator A5048272675 @default.
- W2945805962 creator A5059740024 @default.
- W2945805962 creator A5078210646 @default.
- W2945805962 date "2019-05-24" @default.
- W2945805962 modified "2023-09-27" @default.
- W2945805962 title "Neural Temporal-Difference Learning Converges to Global Optima" @default.
- W2945805962 hasPublicationYear "2019" @default.
- W2945805962 type Work @default.
- W2945805962 sameAs 2945805962 @default.
- W2945805962 citedByCount "0" @default.
- W2945805962 crossrefType "posted-content" @default.
- W2945805962 hasAuthorship W2945805962A5021536336 @default.
- W2945805962 hasAuthorship W2945805962A5048272675 @default.
- W2945805962 hasAuthorship W2945805962A5059740024 @default.
- W2945805962 hasAuthorship W2945805962A5078210646 @default.
- W2945805962 hasConcept C117160843 @default.
- W2945805962 hasConcept C121332964 @default.
- W2945805962 hasConcept C126255220 @default.
- W2945805962 hasConcept C134306372 @default.
- W2945805962 hasConcept C138885662 @default.
- W2945805962 hasConcept C14036430 @default.
- W2945805962 hasConcept C14646407 @default.
- W2945805962 hasConcept C154945302 @default.
- W2945805962 hasConcept C158622935 @default.
- W2945805962 hasConcept C162324750 @default.
- W2945805962 hasConcept C164752517 @default.
- W2945805962 hasConcept C196340769 @default.
- W2945805962 hasConcept C207390915 @default.
- W2945805962 hasConcept C2777303404 @default.
- W2945805962 hasConcept C33923547 @default.
- W2945805962 hasConcept C41008148 @default.
- W2945805962 hasConcept C41895202 @default.
- W2945805962 hasConcept C50522688 @default.
- W2945805962 hasConcept C50644808 @default.
- W2945805962 hasConcept C62520636 @default.
- W2945805962 hasConcept C78458016 @default.
- W2945805962 hasConcept C86803240 @default.
- W2945805962 hasConcept C97541855 @default.
- W2945805962 hasConceptScore W2945805962C117160843 @default.
- W2945805962 hasConceptScore W2945805962C121332964 @default.
- W2945805962 hasConceptScore W2945805962C126255220 @default.
- W2945805962 hasConceptScore W2945805962C134306372 @default.
- W2945805962 hasConceptScore W2945805962C138885662 @default.
- W2945805962 hasConceptScore W2945805962C14036430 @default.
- W2945805962 hasConceptScore W2945805962C14646407 @default.
- W2945805962 hasConceptScore W2945805962C154945302 @default.
- W2945805962 hasConceptScore W2945805962C158622935 @default.
- W2945805962 hasConceptScore W2945805962C162324750 @default.
- W2945805962 hasConceptScore W2945805962C164752517 @default.
- W2945805962 hasConceptScore W2945805962C196340769 @default.
- W2945805962 hasConceptScore W2945805962C207390915 @default.
- W2945805962 hasConceptScore W2945805962C2777303404 @default.
- W2945805962 hasConceptScore W2945805962C33923547 @default.
- W2945805962 hasConceptScore W2945805962C41008148 @default.
- W2945805962 hasConceptScore W2945805962C41895202 @default.
- W2945805962 hasConceptScore W2945805962C50522688 @default.
- W2945805962 hasConceptScore W2945805962C50644808 @default.
- W2945805962 hasConceptScore W2945805962C62520636 @default.
- W2945805962 hasConceptScore W2945805962C78458016 @default.
- W2945805962 hasConceptScore W2945805962C86803240 @default.
- W2945805962 hasConceptScore W2945805962C97541855 @default.
- W2945805962 hasLocation W29458059621 @default.
- W2945805962 hasOpenAccess W2945805962 @default.
- W2945805962 hasPrimaryLocation W29458059621 @default.
- W2945805962 hasRelatedWork W1508054082 @default.
- W2945805962 hasRelatedWork W2073442162 @default.
- W2945805962 hasRelatedWork W2216476061 @default.
- W2945805962 hasRelatedWork W2587741277 @default.
- W2945805962 hasRelatedWork W2765861987 @default.
- W2945805962 hasRelatedWork W2798826368 @default.
- W2945805962 hasRelatedWork W2899748887 @default.
- W2945805962 hasRelatedWork W2913473169 @default.
- W2945805962 hasRelatedWork W2952591465 @default.
- W2945805962 hasRelatedWork W2961321674 @default.
- W2945805962 hasRelatedWork W2966530573 @default.
- W2945805962 hasRelatedWork W2970216099 @default.
- W2945805962 hasRelatedWork W2995015865 @default.
- W2945805962 hasRelatedWork W2995734397 @default.
- W2945805962 hasRelatedWork W2996555526 @default.
- W2945805962 hasRelatedWork W3017380811 @default.
- W2945805962 hasRelatedWork W3033103176 @default.
- W2945805962 hasRelatedWork W3034846718 @default.
- W2945805962 hasRelatedWork W3094346332 @default.
- W2945805962 hasRelatedWork W3207779995 @default.
- W2945805962 isParatext "false" @default.
- W2945805962 isRetracted "false" @default.
- W2945805962 magId "2945805962" @default.
- W2945805962 workType "article" @default.