Matches in SemOpenAlex for { <https://semopenalex.org/work/W2247459084> ?p ?o ?g. }
- W2247459084 abstract "Approximate Newton methods are a standard optimization tool which aim to maintain the benefits of Newton's method, such as a fast rate of convergence, whilst alleviating its drawbacks, such as computationally expensive calculation or estimation of the inverse Hessian. In this work we investigate approximate Newton methods for policy optimization in Markov Decision Processes (MDPs). We first analyse the structure of the Hessian of the objective function for MDPs. We show that, like the gradient, the Hessian exhibits useful structure in the context of MDPs and we use this analysis to motivate two Gauss-Newton Methods for MDPs. Like the Gauss-Newton method for non-linear least squares, these methods involve approximating the Hessian by ignoring certain terms in the Hessian which are difficult to estimate. The approximate Hessians possess desirable properties, such as negative definiteness, and we demonstrate several important performance guarantees including guaranteed ascent directions, invariance to affine transformation of the parameter space, and convergence guarantees. We finally provide a unifying perspective of key policy search algorithms, demonstrating that our second Gauss-Newton algorithm is closely related to both the EM-algorithm and natural gradient ascent applied to MDPs, but performs significantly better in practice on a range of challenging domains." @default.
- W2247459084 created "2016-06-24" @default.
- W2247459084 creator A5031943811 @default.
- W2247459084 creator A5079232487 @default.
- W2247459084 date "2015-07-29" @default.
- W2247459084 modified "2023-09-27" @default.
- W2247459084 title "A Gauss-Newton Method for Markov Decision Processes" @default.
- W2247459084 cites W1491622225 @default.
- W2247459084 cites W1516801383 @default.
- W2247459084 cites W1564755532 @default.
- W2247459084 cites W1586251222 @default.
- W2247459084 cites W1590183771 @default.
- W2247459084 cites W1606011487 @default.
- W2247459084 cites W1625390266 @default.
- W2247459084 cites W1640774615 @default.
- W2247459084 cites W1941248864 @default.
- W2247459084 cites W195033972 @default.
- W2247459084 cites W2009303086 @default.
- W2247459084 cites W2012392077 @default.
- W2247459084 cites W2028145673 @default.
- W2247459084 cites W2041367235 @default.
- W2247459084 cites W2044758663 @default.
- W2247459084 cites W2046765929 @default.
- W2247459084 cites W2049633694 @default.
- W2247459084 cites W2080039641 @default.
- W2247459084 cites W208252816 @default.
- W2247459084 cites W2090963365 @default.
- W2247459084 cites W2096450737 @default.
- W2247459084 cites W2097568041 @default.
- W2247459084 cites W2100785108 @default.
- W2247459084 cites W2111467524 @default.
- W2247459084 cites W2114735315 @default.
- W2247459084 cites W2117341272 @default.
- W2247459084 cites W2119717200 @default.
- W2247459084 cites W2120070743 @default.
- W2247459084 cites W2122759946 @default.
- W2247459084 cites W2123967136 @default.
- W2247459084 cites W2124289529 @default.
- W2247459084 cites W2130005627 @default.
- W2247459084 cites W2130801532 @default.
- W2247459084 cites W2133069808 @default.
- W2247459084 cites W2133435356 @default.
- W2247459084 cites W2136237198 @default.
- W2247459084 cites W2136602922 @default.
- W2247459084 cites W2139053308 @default.
- W2247459084 cites W2140135625 @default.
- W2247459084 cites W2151965738 @default.
- W2247459084 cites W2153678894 @default.
- W2247459084 cites W2155027007 @default.
- W2247459084 cites W2156718681 @default.
- W2247459084 cites W2156737235 @default.
- W2247459084 cites W2169209873 @default.
- W2247459084 cites W2296319761 @default.
- W2247459084 cites W2341171179 @default.
- W2247459084 hasPublicationYear "2015" @default.
- W2247459084 type Work @default.
- W2247459084 sameAs 2247459084 @default.
- W2247459084 citedByCount "0" @default.
- W2247459084 crossrefType "posted-content" @default.
- W2247459084 hasAuthorship W2247459084A5031943811 @default.
- W2247459084 hasAuthorship W2247459084A5079232487 @default.
- W2247459084 hasConcept C105795698 @default.
- W2247459084 hasConcept C106189395 @default.
- W2247459084 hasConcept C121332964 @default.
- W2247459084 hasConcept C126255220 @default.
- W2247459084 hasConcept C151730666 @default.
- W2247459084 hasConcept C158622935 @default.
- W2247459084 hasConcept C159886148 @default.
- W2247459084 hasConcept C162324750 @default.
- W2247459084 hasConcept C203616005 @default.
- W2247459084 hasConcept C26517878 @default.
- W2247459084 hasConcept C2777303404 @default.
- W2247459084 hasConcept C2779343474 @default.
- W2247459084 hasConcept C28826006 @default.
- W2247459084 hasConcept C33923547 @default.
- W2247459084 hasConcept C38652104 @default.
- W2247459084 hasConcept C41008148 @default.
- W2247459084 hasConcept C50522688 @default.
- W2247459084 hasConcept C57869625 @default.
- W2247459084 hasConcept C62520636 @default.
- W2247459084 hasConcept C85189116 @default.
- W2247459084 hasConcept C86803240 @default.
- W2247459084 hasConceptScore W2247459084C105795698 @default.
- W2247459084 hasConceptScore W2247459084C106189395 @default.
- W2247459084 hasConceptScore W2247459084C121332964 @default.
- W2247459084 hasConceptScore W2247459084C126255220 @default.
- W2247459084 hasConceptScore W2247459084C151730666 @default.
- W2247459084 hasConceptScore W2247459084C158622935 @default.
- W2247459084 hasConceptScore W2247459084C159886148 @default.
- W2247459084 hasConceptScore W2247459084C162324750 @default.
- W2247459084 hasConceptScore W2247459084C203616005 @default.
- W2247459084 hasConceptScore W2247459084C26517878 @default.
- W2247459084 hasConceptScore W2247459084C2777303404 @default.
- W2247459084 hasConceptScore W2247459084C2779343474 @default.
- W2247459084 hasConceptScore W2247459084C28826006 @default.
- W2247459084 hasConceptScore W2247459084C33923547 @default.
- W2247459084 hasConceptScore W2247459084C38652104 @default.
- W2247459084 hasConceptScore W2247459084C41008148 @default.
- W2247459084 hasConceptScore W2247459084C50522688 @default.
- W2247459084 hasConceptScore W2247459084C57869625 @default.