Matches in SemOpenAlex for { <https://semopenalex.org/work/W2896070594> ?p ?o ?g. }
- W2896070594 abstract "We present a predictor-corrector framework, called PicCoLO, that can transform a first-order model-free reinforcement or imitation learning algorithm into a new hybrid method that leverages predictive models to accelerate policy learning. The new PicCoLOed algorithm optimizes a policy by recursively repeating two steps: In the Prediction Step, the learner uses a model to predict the unseen future gradient and then applies the predicted estimate to update the policy; in the Correction Step, the learner runs the updated policy in the environment, receives the true gradient, and then corrects the policy using the gradient error. Unlike previous algorithms, PicCoLO corrects for the mistakes of using imperfect predicted gradients and hence does not suffer from model bias. The development of PicCoLO is made possible by a novel reduction from predictable online learning to adversarial online learning, which provides a systematic way to modify existing first-order algorithms to achieve the optimal regret with respect to predictable information. We show, in both theory and simulation, that the convergence rate of several first-order model-free algorithms can be improved by PicCoLO." @default.
- W2896070594 created "2018-10-26" @default.
- W2896070594 creator A5017897684 @default.
- W2896070594 creator A5052552981 @default.
- W2896070594 creator A5062476223 @default.
- W2896070594 creator A5078753151 @default.
- W2896070594 date "2018-10-15" @default.
- W2896070594 modified "2023-09-26" @default.
- W2896070594 title "Predictor-Corrector Policy Optimization" @default.
- W2896070594 cites W112666333 @default.
- W2896070594 cites W1499669280 @default.
- W2896070594 cites W1518461880 @default.
- W2896070594 cites W1519983590 @default.
- W2896070594 cites W1575592356 @default.
- W2896070594 cites W1757796397 @default.
- W2896070594 cites W1771410628 @default.
- W2896070594 cites W1878322007 @default.
- W2896070594 cites W1970789124 @default.
- W2896070594 cites W1980035368 @default.
- W2896070594 cites W2016384870 @default.
- W2896070594 cites W2077008409 @default.
- W2896070594 cites W2089559088 @default.
- W2896070594 cites W2104733512 @default.
- W2896070594 cites W2129160848 @default.
- W2896070594 cites W2130801532 @default.
- W2896070594 cites W2130913800 @default.
- W2896070594 cites W2140135625 @default.
- W2896070594 cites W2143343660 @default.
- W2896070594 cites W2146502635 @default.
- W2896070594 cites W2148825261 @default.
- W2896070594 cites W2155027007 @default.
- W2896070594 cites W2156737235 @default.
- W2896070594 cites W2165150801 @default.
- W2896070594 cites W2167856595 @default.
- W2896070594 cites W2169401877 @default.
- W2896070594 cites W2172968643 @default.
- W2896070594 cites W2186453173 @default.
- W2896070594 cites W2341171179 @default.
- W2896070594 cites W2512432267 @default.
- W2896070594 cites W2513180554 @default.
- W2896070594 cites W2594640072 @default.
- W2896070594 cites W2626639528 @default.
- W2896070594 cites W2627272094 @default.
- W2896070594 cites W2738675347 @default.
- W2896070594 cites W2772709170 @default.
- W2896070594 cites W2785523195 @default.
- W2896070594 cites W2793955514 @default.
- W2896070594 cites W2795756076 @default.
- W2896070594 cites W2803316681 @default.
- W2896070594 cites W2804010078 @default.
- W2896070594 cites W2902907165 @default.
- W2896070594 cites W2913535645 @default.
- W2896070594 cites W2949608212 @default.
- W2896070594 cites W2950230866 @default.
- W2896070594 cites W2950517718 @default.
- W2896070594 cites W2951445759 @default.
- W2896070594 cites W2951948137 @default.
- W2896070594 cites W2962957031 @default.
- W2896070594 cites W2963184621 @default.
- W2896070594 cites W2963184939 @default.
- W2896070594 cites W2963201118 @default.
- W2896070594 cites W2963349913 @default.
- W2896070594 cites W2963414638 @default.
- W2896070594 cites W2963457007 @default.
- W2896070594 cites W2963604043 @default.
- W2896070594 cites W2963641140 @default.
- W2896070594 cites W2963642149 @default.
- W2896070594 cites W2963851840 @default.
- W2896070594 cites W2964025922 @default.
- W2896070594 cites W2964121744 @default.
- W2896070594 cites W2964121860 @default.
- W2896070594 cites W2964134150 @default.
- W2896070594 cites W2964188247 @default.
- W2896070594 cites W2964220198 @default.
- W2896070594 cites W2964323557 @default.
- W2896070594 cites W3103209356 @default.
- W2896070594 cites W3141595720 @default.
- W2896070594 cites W658381347 @default.
- W2896070594 cites W6908809 @default.
- W2896070594 hasPublicationYear "2018" @default.
- W2896070594 type Work @default.
- W2896070594 sameAs 2896070594 @default.
- W2896070594 citedByCount "0" @default.
- W2896070594 crossrefType "posted-content" @default.
- W2896070594 hasAuthorship W2896070594A5017897684 @default.
- W2896070594 hasAuthorship W2896070594A5052552981 @default.
- W2896070594 hasAuthorship W2896070594A5062476223 @default.
- W2896070594 hasAuthorship W2896070594A5078753151 @default.
- W2896070594 hasConcept C10138342 @default.
- W2896070594 hasConcept C11413529 @default.
- W2896070594 hasConcept C119857082 @default.
- W2896070594 hasConcept C126255220 @default.
- W2896070594 hasConcept C138885662 @default.
- W2896070594 hasConcept C154945302 @default.
- W2896070594 hasConcept C162324750 @default.
- W2896070594 hasConcept C182306322 @default.
- W2896070594 hasConcept C2777303404 @default.
- W2896070594 hasConcept C2780310539 @default.
- W2896070594 hasConcept C33923547 @default.
- W2896070594 hasConcept C41008148 @default.
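
The abstract in this record describes a loop of two steps: a Prediction Step that updates the policy with a predicted gradient, and a Correction Step that corrects it with the gradient error. The following is a minimal sketch of that loop, not the authors' implementation. It assumes plain gradient descent as the base first-order algorithm, a toy quadratic objective in place of environment rollouts, and a hypothetical predictive model that simply reuses the last observed gradient. The names `true_gradient` and `predictor_corrector`, the target values, and the step size are illustrative assumptions.

```python
def true_gradient(theta):
    # Assumption: toy stand-in for a gradient obtained by running the policy.
    # This is the gradient of 0.5 * ||theta - target||^2.
    target = [1.0, -2.0]
    return [t - c for t, c in zip(theta, target)]


def predictor_corrector(theta0, steps=200, lr=0.1):
    theta = list(theta0)
    # Assumption: the "model" predicts the next gradient as the last true one.
    predicted = [0.0] * len(theta)
    for _ in range(steps):
        # Prediction Step: update the parameters with the predicted gradient.
        theta_hat = [t - lr * p for t, p in zip(theta, predicted)]
        # Correction Step: evaluate the true gradient at the updated point,
        # then correct using the gradient error (true minus predicted).
        grad = true_gradient(theta_hat)
        theta = [t - lr * (g - p) for t, g, p in zip(theta_hat, grad, predicted)]
        predicted = grad
    return theta


if __name__ == "__main__":
    print(predictor_corrector([0.0, 0.0]))
```

In this sketch the two updates combine into a single step along the true gradient evaluated at the predicted point, so an inaccurate prediction changes where the gradient is measured but is not left uncorrected in the parameters.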