Matches in SemOpenAlex for { <https://semopenalex.org/work/W3212756364> ?p ?o ?g. }
- W3212756364 endingPage "962" @default.
- W3212756364 startingPage "950" @default.
- W3212756364 abstract "Motivated by the recent success of reinforcement learning algorithms, this paper studies a class of biased stochastic approximation (SA) procedures under a mild “ergodicity-like” assumption on the random noise sequence. Building on a multistep Lyapunov function that looks ahead to several future updates to accommodate the stochastic perturbations (thus gaining control over the bias), we prove a general result on the convergence of the SA iterates, and use it to derive non-asymptotic bounds on the mean-square error in the case of constant stepsizes. This novel viewpoint renders finite-time analysis of biased SA algorithms under a family of stochastic perturbations possible. For direct comparison with prior work, we demonstrate these bounds by applying them to TD-learning with linear function approximation, under the Markov chain observation model. The resultant finite-time error bound for TD-learning is the first of its kind, in the sense that it holds i) for the unmodified versions (i.e., without any modification to the updates) using even nonlinear approximators; as well as for Markov chains ii) under sublinear mixing conditions and iii) starting from any initial distribution, at least one of which has to be violated for existing results to be applicable." @default.
- W3212756364 created "2021-11-22" @default.
- W3212756364 creator A5036336837 @default.
- W3212756364 creator A5057208641 @default.
- W3212756364 date "2022-01-01" @default.
- W3212756364 modified "2023-10-15" @default.
- W3212756364 title "Finite-Time Error Bounds of Biased Stochastic Approximation With Application to TD-Learning" @default.
- W3212756364 cites W1646707810 @default.
- W3212756364 cites W1994616650 @default.
- W3212756364 cites W2030019188 @default.
- W3212756364 cites W2139418546 @default.
- W3212756364 cites W2145339207 @default.
- W3212756364 cites W2885208219 @default.
- W3212756364 cites W2885549115 @default.
- W3212756364 cites W2897742256 @default.
- W3212756364 cites W2912171307 @default.
- W3212756364 cites W2963333577 @default.
- W3212756364 cites W2963616027 @default.
- W3212756364 cites W2969218621 @default.
- W3212756364 cites W2991859550 @default.
- W3212756364 cites W3021312661 @default.
- W3212756364 cites W3021959079 @default.
- W3212756364 cites W3024414306 @default.
- W3212756364 cites W3034190797 @default.
- W3212756364 cites W3037764089 @default.
- W3212756364 cites W3041202696 @default.
- W3212756364 cites W3106022061 @default.
- W3212756364 cites W3123661679 @default.
- W3212756364 cites W3153673236 @default.
- W3212756364 cites W3185561982 @default.
- W3212756364 cites W4211192810 @default.
- W3212756364 cites W4224862297 @default.
- W3212756364 cites W4233061323 @default.
- W3212756364 cites W4243772471 @default.
- W3212756364 doi "https://doi.org/10.1109/tsp.2021.3128723" @default.
- W3212756364 hasPublicationYear "2022" @default.
- W3212756364 type Work @default.
- W3212756364 sameAs 3212756364 @default.
- W3212756364 citedByCount "1" @default.
- W3212756364 countsByYear W32127563642023 @default.
- W3212756364 crossrefType "journal-article" @default.
- W3212756364 hasAuthorship W3212756364A5036336837 @default.
- W3212756364 hasAuthorship W3212756364A5057208641 @default.
- W3212756364 hasConcept C105795698 @default.
- W3212756364 hasConcept C117160843 @default.
- W3212756364 hasConcept C118615104 @default.
- W3212756364 hasConcept C121332964 @default.
- W3212756364 hasConcept C126255220 @default.
- W3212756364 hasConcept C134306372 @default.
- W3212756364 hasConcept C140479938 @default.
- W3212756364 hasConcept C154945302 @default.
- W3212756364 hasConcept C158622935 @default.
- W3212756364 hasConcept C201779956 @default.
- W3212756364 hasConcept C26517878 @default.
- W3212756364 hasConcept C28826006 @default.
- W3212756364 hasConcept C33923547 @default.
- W3212756364 hasConcept C38652104 @default.
- W3212756364 hasConcept C41008148 @default.
- W3212756364 hasConcept C50644808 @default.
- W3212756364 hasConcept C55479107 @default.
- W3212756364 hasConcept C60640748 @default.
- W3212756364 hasConcept C62520636 @default.
- W3212756364 hasConcept C91873725 @default.
- W3212756364 hasConcept C97541855 @default.
- W3212756364 hasConcept C98763669 @default.
- W3212756364 hasConceptScore W3212756364C105795698 @default.
- W3212756364 hasConceptScore W3212756364C117160843 @default.
- W3212756364 hasConceptScore W3212756364C118615104 @default.
- W3212756364 hasConceptScore W3212756364C121332964 @default.
- W3212756364 hasConceptScore W3212756364C126255220 @default.
- W3212756364 hasConceptScore W3212756364C134306372 @default.
- W3212756364 hasConceptScore W3212756364C140479938 @default.
- W3212756364 hasConceptScore W3212756364C154945302 @default.
- W3212756364 hasConceptScore W3212756364C158622935 @default.
- W3212756364 hasConceptScore W3212756364C201779956 @default.
- W3212756364 hasConceptScore W3212756364C26517878 @default.
- W3212756364 hasConceptScore W3212756364C28826006 @default.
- W3212756364 hasConceptScore W3212756364C33923547 @default.
- W3212756364 hasConceptScore W3212756364C38652104 @default.
- W3212756364 hasConceptScore W3212756364C41008148 @default.
- W3212756364 hasConceptScore W3212756364C50644808 @default.
- W3212756364 hasConceptScore W3212756364C55479107 @default.
- W3212756364 hasConceptScore W3212756364C60640748 @default.
- W3212756364 hasConceptScore W3212756364C62520636 @default.
- W3212756364 hasConceptScore W3212756364C91873725 @default.
- W3212756364 hasConceptScore W3212756364C97541855 @default.
- W3212756364 hasConceptScore W3212756364C98763669 @default.
- W3212756364 hasFunder F4320321001 @default.
- W3212756364 hasFunder F4320323172 @default.
- W3212756364 hasLocation W32127563641 @default.
- W3212756364 hasOpenAccess W3212756364 @default.
- W3212756364 hasPrimaryLocation W32127563641 @default.
- W3212756364 hasRelatedWork W1914388985 @default.
- W3212756364 hasRelatedWork W2033254849 @default.
- W3212756364 hasRelatedWork W2041721334 @default.
- W3212756364 hasRelatedWork W2113921460 @default.
- W3212756364 hasRelatedWork W2950382421 @default.
- W3212756364 hasRelatedWork W2954827202 @default.