Matches in SemOpenAlex for { <https://semopenalex.org/work/W2804860445> ?p ?o ?g. }
- W2804860445 abstract "Policy gradient methods have been successfully applied to a variety of reinforcement learning tasks. However, while learning in a simulator, these methods do not utilise the opportunity to improve learning by adjusting certain environment variables: unobservable state features that are randomly determined by the environment in a physical setting, but that are controllable in a simulator. This can lead to slow learning, or convergence to highly suboptimal policies. In this paper, we present contextual policy optimisation (CPO). The central idea is to use Bayesian optimisation to actively select the distribution of the environment variable that maximises the improvement generated by each iteration of the policy gradient method. To make this Bayesian optimisation practical, we contribute two easy-to-compute low-dimensional fingerprints of the current policy. We apply CPO to a number of continuous control tasks of varying difficulty and show that CPO can efficiently learn policies that are robust to significant rare events, which are unlikely to be observable under random sampling but are key to learning good policies." @default.
- W2804860445 created "2018-06-01" @default.
- W2804860445 creator A5002007044 @default.
- W2804860445 creator A5030103572 @default.
- W2804860445 creator A5056879203 @default.
- W2804860445 date "2018-05-27" @default.
- W2804860445 modified "2023-09-27" @default.
- W2804860445 title "Contextual Policy Optimisation." @default.
- W2804860445 cites W1191599655 @default.
- W2804860445 cites W1771410628 @default.
- W2804860445 cites W1871676304 @default.
- W2804860445 cites W2046765929 @default.
- W2804860445 cites W2077902449 @default.
- W2804860445 cites W2098152875 @default.
- W2804860445 cites W2111241577 @default.
- W2804860445 cites W2119717200 @default.
- W2804860445 cites W2121863487 @default.
- W2804860445 cites W2127107099 @default.
- W2804860445 cites W2130801532 @default.
- W2804860445 cites W2131627640 @default.
- W2804860445 cites W2186601217 @default.
- W2804860445 cites W2529477964 @default.
- W2804860445 cites W2555374257 @default.
- W2804860445 cites W2565313327 @default.
- W2804860445 cites W2602963933 @default.
- W2804860445 cites W2604763608 @default.
- W2804860445 cites W2624086852 @default.
- W2804860445 cites W2949201811 @default.
- W2804860445 cites W2963775850 @default.
- W2804860445 cites W2964061343 @default.
- W2804860445 cites W2964327384 @default.
- W2804860445 hasPublicationYear "2018" @default.
- W2804860445 type Work @default.
- W2804860445 sameAs 2804860445 @default.
- W2804860445 citedByCount "2" @default.
- W2804860445 countsByYear W28048604452019 @default.
- W2804860445 countsByYear W28048604452020 @default.
- W2804860445 crossrefType "posted-content" @default.
- W2804860445 hasAuthorship W2804860445A5002007044 @default.
- W2804860445 hasAuthorship W2804860445A5030103572 @default.
- W2804860445 hasAuthorship W2804860445A5056879203 @default.
- W2804860445 hasConcept C107673813 @default.
- W2804860445 hasConcept C119857082 @default.
- W2804860445 hasConcept C134306372 @default.
- W2804860445 hasConcept C136197465 @default.
- W2804860445 hasConcept C149782125 @default.
- W2804860445 hasConcept C154945302 @default.
- W2804860445 hasConcept C162324750 @default.
- W2804860445 hasConcept C182365436 @default.
- W2804860445 hasConcept C26517878 @default.
- W2804860445 hasConcept C2775924081 @default.
- W2804860445 hasConcept C2777303404 @default.
- W2804860445 hasConcept C2780695315 @default.
- W2804860445 hasConcept C33923547 @default.
- W2804860445 hasConcept C38652104 @default.
- W2804860445 hasConcept C41008148 @default.
- W2804860445 hasConcept C50522688 @default.
- W2804860445 hasConcept C97541855 @default.
- W2804860445 hasConceptScore W2804860445C107673813 @default.
- W2804860445 hasConceptScore W2804860445C119857082 @default.
- W2804860445 hasConceptScore W2804860445C134306372 @default.
- W2804860445 hasConceptScore W2804860445C136197465 @default.
- W2804860445 hasConceptScore W2804860445C149782125 @default.
- W2804860445 hasConceptScore W2804860445C154945302 @default.
- W2804860445 hasConceptScore W2804860445C162324750 @default.
- W2804860445 hasConceptScore W2804860445C182365436 @default.
- W2804860445 hasConceptScore W2804860445C26517878 @default.
- W2804860445 hasConceptScore W2804860445C2775924081 @default.
- W2804860445 hasConceptScore W2804860445C2777303404 @default.
- W2804860445 hasConceptScore W2804860445C2780695315 @default.
- W2804860445 hasConceptScore W2804860445C33923547 @default.
- W2804860445 hasConceptScore W2804860445C38652104 @default.
- W2804860445 hasConceptScore W2804860445C41008148 @default.
- W2804860445 hasConceptScore W2804860445C50522688 @default.
- W2804860445 hasConceptScore W2804860445C97541855 @default.
- W2804860445 hasLocation W28048604451 @default.
- W2804860445 hasOpenAccess W2804860445 @default.
- W2804860445 hasPrimaryLocation W28048604451 @default.
- W2804860445 hasRelatedWork W1607817625 @default.
- W2804860445 hasRelatedWork W2128786740 @default.
- W2804860445 hasRelatedWork W2145183542 @default.
- W2804860445 hasRelatedWork W2147032798 @default.
- W2804860445 hasRelatedWork W2476216644 @default.
- W2804860445 hasRelatedWork W2782446262 @default.
- W2804860445 hasRelatedWork W2895958971 @default.
- W2804860445 hasRelatedWork W2900831917 @default.
- W2804860445 hasRelatedWork W2907704766 @default.
- W2804860445 hasRelatedWork W2909588145 @default.
- W2804860445 hasRelatedWork W2968021416 @default.
- W2804860445 hasRelatedWork W2997343068 @default.
- W2804860445 hasRelatedWork W3007369745 @default.
- W2804860445 hasRelatedWork W3008076766 @default.
- W2804860445 hasRelatedWork W3026304494 @default.
- W2804860445 hasRelatedWork W3035074656 @default.
- W2804860445 hasRelatedWork W3151079898 @default.
- W2804860445 hasRelatedWork W3173031723 @default.
- W2804860445 hasRelatedWork W3185541071 @default.
- W2804860445 hasRelatedWork W778742492 @default.
- W2804860445 isParatext "false" @default.
- W2804860445 isRetracted "false" @default.