Matches in SemOpenAlex for { <https://semopenalex.org/work/W2914274927> ?p ?o ?g. }
- W2914274927 abstract "Acceleration is an increasingly common theme in the stochastic optimization literature. The two most common examples are Nesterov's method and Polyak's momentum technique. In this paper two new algorithms are introduced for root-finding problems: 1) PolSA, a root-finding algorithm with specially designed matrix momentum, and 2) NeSA, which can be regarded as a variant of Nesterov's algorithm or a simplification of PolSA. The PolSA algorithm is new even in the context of optimization (when cast as a root-finding problem). The research surveyed in this paper is motivated by applications to reinforcement learning. It is well known that most variants of TD- and Q-learning may be cast as SA (stochastic approximation) algorithms, and the tools from general SA theory can be used to investigate convergence and bounds on convergence rate. In particular, the asymptotic variance is a common metric of performance for SA algorithms, and is also one among many metrics used in assessing the performance of stochastic optimization algorithms. Two well-known SA techniques have optimal asymptotic variance: the Ruppert-Polyak averaging technique and stochastic Newton-Raphson (SNR). The former can have extremely bad transient performance, and the latter can be computationally expensive. It is demonstrated here that parameter estimates from the new PolSA algorithm couple with those of the ideal (but more complex) SNR algorithm. The new algorithm is thus a third approach to obtaining optimal asymptotic covariance. These strong results require assumptions on the model: a linearized model is considered, and the noise is assumed to be a martingale difference sequence. Numerical results are obtained in a non-linear setting that is the motivation for this work: in PolSA implementations of Q-learning it is observed that coupling occurs with SNR in this non-ideal setting." @default.
- W2914274927 created "2019-02-21" @default.
- W2914274927 creator A5006003609 @default.
- W2914274927 creator A5047988825 @default.
- W2914274927 creator A5067821332 @default.
- W2914274927 date "2018-09-17" @default.
- W2914274927 modified "2023-09-27" @default.
- W2914274927 title "Optimal Matrix Momentum Stochastic Approximation and Applications to Q-learning" @default.
- W2914274927 cites W1568288633 @default.
- W2914274927 cites W1576452626 @default.
- W2914274927 cites W1985291828 @default.
- W2914274927 cites W1985570873 @default.
- W2914274927 cites W1988720110 @default.
- W2914274927 cites W1994616650 @default.
- W2914274927 cites W2040434095 @default.
- W2914274927 cites W2073384958 @default.
- W2914274927 cites W2082040833 @default.
- W2914274927 cites W2100677568 @default.
- W2914274927 cites W2132351269 @default.
- W2914274927 cites W2135482703 @default.
- W2914274927 cites W2147750403 @default.
- W2914274927 cites W2149166950 @default.
- W2914274927 cites W2156779765 @default.
- W2914274927 cites W2159930037 @default.
- W2914274927 cites W2520472982 @default.
- W2914274927 cites W2608239888 @default.
- W2914274927 cites W2735941225 @default.
- W2914274927 cites W2751801093 @default.
- W2914274927 cites W2923133561 @default.
- W2914274927 cites W2963607709 @default.
- W2914274927 cites W3011120880 @default.
- W2914274927 cites W3123423968 @default.
- W2914274927 cites W594357522 @default.
- W2914274927 hasPublicationYear "2018" @default.
- W2914274927 type Work @default.
- W2914274927 sameAs 2914274927 @default.
- W2914274927 citedByCount "3" @default.
- W2914274927 countsByYear W29142749272019 @default.
- W2914274927 countsByYear W29142749272020 @default.
- W2914274927 countsByYear W29142749272021 @default.
- W2914274927 crossrefType "posted-content" @default.
- W2914274927 hasAuthorship W2914274927A5006003609 @default.
- W2914274927 hasAuthorship W2914274927A5047988825 @default.
- W2914274927 hasAuthorship W2914274927A5067821332 @default.
- W2914274927 hasConcept C106487976 @default.
- W2914274927 hasConcept C11413529 @default.
- W2914274927 hasConcept C126255220 @default.
- W2914274927 hasConcept C151730666 @default.
- W2914274927 hasConcept C154945302 @default.
- W2914274927 hasConcept C159985019 @default.
- W2914274927 hasConcept C162324750 @default.
- W2914274927 hasConcept C176217482 @default.
- W2914274927 hasConcept C192562407 @default.
- W2914274927 hasConcept C194387892 @default.
- W2914274927 hasConcept C21547014 @default.
- W2914274927 hasConcept C26517878 @default.
- W2914274927 hasConcept C2777303404 @default.
- W2914274927 hasConcept C2779343474 @default.
- W2914274927 hasConcept C33923547 @default.
- W2914274927 hasConcept C38652104 @default.
- W2914274927 hasConcept C41008148 @default.
- W2914274927 hasConcept C50522688 @default.
- W2914274927 hasConcept C55479107 @default.
- W2914274927 hasConcept C57869625 @default.
- W2914274927 hasConcept C86803240 @default.
- W2914274927 hasConcept C97541855 @default.
- W2914274927 hasConceptScore W2914274927C106487976 @default.
- W2914274927 hasConceptScore W2914274927C11413529 @default.
- W2914274927 hasConceptScore W2914274927C126255220 @default.
- W2914274927 hasConceptScore W2914274927C151730666 @default.
- W2914274927 hasConceptScore W2914274927C154945302 @default.
- W2914274927 hasConceptScore W2914274927C159985019 @default.
- W2914274927 hasConceptScore W2914274927C162324750 @default.
- W2914274927 hasConceptScore W2914274927C176217482 @default.
- W2914274927 hasConceptScore W2914274927C192562407 @default.
- W2914274927 hasConceptScore W2914274927C194387892 @default.
- W2914274927 hasConceptScore W2914274927C21547014 @default.
- W2914274927 hasConceptScore W2914274927C26517878 @default.
- W2914274927 hasConceptScore W2914274927C2777303404 @default.
- W2914274927 hasConceptScore W2914274927C2779343474 @default.
- W2914274927 hasConceptScore W2914274927C33923547 @default.
- W2914274927 hasConceptScore W2914274927C38652104 @default.
- W2914274927 hasConceptScore W2914274927C41008148 @default.
- W2914274927 hasConceptScore W2914274927C50522688 @default.
- W2914274927 hasConceptScore W2914274927C55479107 @default.
- W2914274927 hasConceptScore W2914274927C57869625 @default.
- W2914274927 hasConceptScore W2914274927C86803240 @default.
- W2914274927 hasConceptScore W2914274927C97541855 @default.
- W2914274927 hasLocation W29142749271 @default.
- W2914274927 hasOpenAccess W2914274927 @default.
- W2914274927 hasPrimaryLocation W29142749271 @default.
- W2914274927 hasRelatedWork W1925698896 @default.
- W2914274927 hasRelatedWork W2063513670 @default.
- W2914274927 hasRelatedWork W2134042548 @default.
- W2914274927 hasRelatedWork W2156779765 @default.
- W2914274927 hasRelatedWork W2570201534 @default.
- W2914274927 hasRelatedWork W2621374146 @default.
- W2914274927 hasRelatedWork W2734851461 @default.
- W2914274927 hasRelatedWork W2783506385 @default.
- W2914274927 hasRelatedWork W2810368869 @default.
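
The abstract above names Ruppert-Polyak averaging as one of the techniques with optimal asymptotic variance. As a rough illustration only, the sketch below runs a scalar stochastic approximation recursion with that averaging step on a toy linear root-finding problem. It is not the paper's PolSA, NeSA, or SNR algorithm; the function name `sa_with_averaging`, the target `theta_star`, the step-size exponent `rho`, and the Gaussian noise model are all assumptions chosen for this example.

```python
import random


def sa_with_averaging(theta_star=2.0, n_steps=20000, rho=0.7, seed=0):
    """Scalar SA with Ruppert-Polyak averaging on a toy linear problem.

    Hypothetical setup: find the root of f_bar(theta) = -(theta - theta_star)
    from noisy observations f_bar(theta) + noise.
    """
    rng = random.Random(seed)
    theta = 0.0  # raw SA iterate
    avg = 0.0    # running average of the iterates (Ruppert-Polyak estimate)
    for n in range(1, n_steps + 1):
        # i.i.d. zero-mean noise is a simple martingale difference sequence
        noise = rng.gauss(0.0, 1.0)
        f_obs = -(theta - theta_star) + noise
        # step size n**(-rho) with 0.5 < rho < 1 decays more slowly than 1/n
        theta += f_obs / n**rho
        # incremental update of the average of theta_1, ..., theta_n
        avg += (theta - avg) / n
    return theta, avg


if __name__ == "__main__":
    last, averaged = sa_with_averaging()
    print("last iterate:    ", last)
    print("averaged iterate:", averaged)
```

In this toy setting the averaged iterate typically sits closer to `theta_star` than the last raw iterate, which is the effect the averaging technique is meant to deliver; the transient and cost trade-offs discussed in the abstract are not modeled here.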