Matches in SemOpenAlex for { <https://semopenalex.org/work/W1557287189> ?p ?o ?g. }
Showing items 1 to 96 of 96 with 100 items per page.
- W1557287189 abstract "Consider the problem of a constrained Markov Decision Process (MDP). Under a parameterization of the control strategies, the problem can be transformed into a non-linear optimization problem with non-linear constraints. Both the cost and the constraints are stationary averages. We assume that the transition probabilities of the underlying Markov chain are unknown: only the values of the control variables are known, as well as the instantaneous values of the cost and the constraints, so no analytical expression for the stationary averages is available. To find the solution to the optimization problem, a stochastic version of a primal/dual method with an augmented Lagrangian is used. The updating scheme uses a measure valued estimator of the gradients that can be interpreted in terms of a finite horizon version of the Perturbation Analysis (PA) method known as the perturbation realization factors. Most finite horizon derivative estimators are consistent as the sample size grows, so it is common to assume that large enough samples can be observed so as to make the bias negligible. This paper deals with the actual implementations of the gradient estimators in finite horizon with small sample sizes, so that the iterates of the stochastic approximation can be performed very often, as would be required for on-line learning. We identify the asymptotic bias of the stochastic approximation for the constrained optimization method, and by so doing we propose several means to correct it. As is very common with these problems, the bias correction introduces a conflict between precision and speed: the smaller the bias, the slower the reaction time. In the sequel, we present the theoretical basis for the study of bias and learning rate. Our experimental results indicate that smoothing at a faster time scale may not be necessary at all, only at a slower time scale. We include results where the algorithms have to track changes in the environment." @default.
- W1557287189 created "2016-06-24" @default.
- W1557287189 creator A5025478222 @default.
- W1557287189 creator A5032264786 @default.
- W1557287189 creator A5068804090 @default.
- W1557287189 date "2004-06-21" @default.
- W1557287189 modified "2023-09-27" @default.
- W1557287189 title "Implementation of gradient estimation to a constrained Markov decision problem" @default.
- W1557287189 cites W1518931405 @default.
- W1557287189 cites W2114757210 @default.
- W1557287189 cites W2118943752 @default.
- W1557287189 cites W2119792915 @default.
- W1557287189 cites W2133626316 @default.
- W1557287189 cites W2161142726 @default.
- W1557287189 cites W2334782222 @default.
- W1557287189 cites W2531891978 @default.
- W1557287189 cites W2798766386 @default.
- W1557287189 doi "https://doi.org/10.1109/cdc.2003.1272362" @default.
- W1557287189 hasPublicationYear "2004" @default.
- W1557287189 type Work @default.
- W1557287189 sameAs 1557287189 @default.
- W1557287189 citedByCount "19" @default.
- W1557287189 countsByYear W15572871892012 @default.
- W1557287189 countsByYear W15572871892016 @default.
- W1557287189 countsByYear W15572871892019 @default.
- W1557287189 countsByYear W15572871892020 @default.
- W1557287189 countsByYear W15572871892021 @default.
- W1557287189 crossrefType "proceedings-article" @default.
- W1557287189 hasAuthorship W1557287189A5025478222 @default.
- W1557287189 hasAuthorship W1557287189A5032264786 @default.
- W1557287189 hasAuthorship W1557287189A5068804090 @default.
- W1557287189 hasConcept C105795698 @default.
- W1557287189 hasConcept C106189395 @default.
- W1557287189 hasConcept C126255220 @default.
- W1557287189 hasConcept C134306372 @default.
- W1557287189 hasConcept C137836250 @default.
- W1557287189 hasConcept C140479938 @default.
- W1557287189 hasConcept C150452318 @default.
- W1557287189 hasConcept C159886148 @default.
- W1557287189 hasConcept C185429906 @default.
- W1557287189 hasConcept C26517878 @default.
- W1557287189 hasConcept C2779880469 @default.
- W1557287189 hasConcept C28826006 @default.
- W1557287189 hasConcept C33923547 @default.
- W1557287189 hasConcept C38652104 @default.
- W1557287189 hasConcept C41008148 @default.
- W1557287189 hasConcept C55479107 @default.
- W1557287189 hasConcept C8272713 @default.
- W1557287189 hasConcept C91765299 @default.
- W1557287189 hasConcept C98763669 @default.
- W1557287189 hasConceptScore W1557287189C105795698 @default.
- W1557287189 hasConceptScore W1557287189C106189395 @default.
- W1557287189 hasConceptScore W1557287189C126255220 @default.
- W1557287189 hasConceptScore W1557287189C134306372 @default.
- W1557287189 hasConceptScore W1557287189C137836250 @default.
- W1557287189 hasConceptScore W1557287189C140479938 @default.
- W1557287189 hasConceptScore W1557287189C150452318 @default.
- W1557287189 hasConceptScore W1557287189C159886148 @default.
- W1557287189 hasConceptScore W1557287189C185429906 @default.
- W1557287189 hasConceptScore W1557287189C26517878 @default.
- W1557287189 hasConceptScore W1557287189C2779880469 @default.
- W1557287189 hasConceptScore W1557287189C28826006 @default.
- W1557287189 hasConceptScore W1557287189C33923547 @default.
- W1557287189 hasConceptScore W1557287189C38652104 @default.
- W1557287189 hasConceptScore W1557287189C41008148 @default.
- W1557287189 hasConceptScore W1557287189C55479107 @default.
- W1557287189 hasConceptScore W1557287189C8272713 @default.
- W1557287189 hasConceptScore W1557287189C91765299 @default.
- W1557287189 hasConceptScore W1557287189C98763669 @default.
- W1557287189 hasLocation W15572871891 @default.
- W1557287189 hasOpenAccess W1557287189 @default.
- W1557287189 hasPrimaryLocation W15572871891 @default.
- W1557287189 hasRelatedWork W1518931405 @default.
- W1557287189 hasRelatedWork W1607150413 @default.
- W1557287189 hasRelatedWork W180325379 @default.
- W1557287189 hasRelatedWork W1997020095 @default.
- W1557287189 hasRelatedWork W2010654234 @default.
- W1557287189 hasRelatedWork W203276351 @default.
- W1557287189 hasRelatedWork W2042401830 @default.
- W1557287189 hasRelatedWork W2042592362 @default.
- W1557287189 hasRelatedWork W2061412229 @default.
- W1557287189 hasRelatedWork W2063059707 @default.
- W1557287189 hasRelatedWork W2084091044 @default.
- W1557287189 hasRelatedWork W2095962350 @default.
- W1557287189 hasRelatedWork W2099775880 @default.
- W1557287189 hasRelatedWork W2121863487 @default.
- W1557287189 hasRelatedWork W2133626316 @default.
- W1557287189 hasRelatedWork W2136762313 @default.
- W1557287189 hasRelatedWork W2162863295 @default.
- W1557287189 hasRelatedWork W2235056388 @default.
- W1557287189 hasRelatedWork W2280955991 @default.
- W1557287189 hasRelatedWork W2597037225 @default.
- W1557287189 isParatext "false" @default.
- W1557287189 isRetracted "false" @default.
- W1557287189 magId "1557287189" @default.
- W1557287189 workType "article" @default.