Matches in SemOpenAlex for { <https://semopenalex.org/work/W2949966881> ?p ?o ?g. }
- W2949966881 abstract "A long-lived autonomous agent should be able to respond online to novel instances of tasks from a familiar domain. Acting online requires 'fast' responses, in terms of rapid convergence, especially when the task instance has a short duration, such as in applications involving interactions with humans. These requirements can be problematic for many established methods for learning to act. In domains where the agent knows that the task instance is drawn from a family of related tasks, albeit without access to the label of any given instance, it can choose to act through a process of policy reuse from a library, rather than policy learning from scratch. In policy reuse, the agent has prior knowledge of the class of tasks in the form of a library of policies that were learnt from sample task instances during an offline training phase. We formalise the problem of policy reuse, and present an algorithm for efficiently responding to a novel task instance by reusing a policy from the library of existing policies, where the choice is based on observed 'signals' which correlate to policy performance. We achieve this by posing the problem as a Bayesian choice problem with a corresponding notion of an optimal response, but the computation of that response is in many cases intractable. Therefore, to reduce the computation cost of the posterior, we follow a Bayesian optimisation approach and define a set of policy selection functions, which balance exploration in the policy library against exploitation of previously tried policies, together with a model of expected performance of the policy library on their corresponding task instances. We validate our method in several simulated domains of interactive, short-duration episodic tasks, showing rapid convergence in unknown task variations." @default.
- W2949966881 created "2019-06-27" @default.
- W2949966881 creator A5021151663 @default.
- W2949966881 creator A5068297734 @default.
- W2949966881 creator A5071122608 @default.
- W2949966881 date "2015-05-01" @default.
- W2949966881 modified "2023-09-27" @default.
- W2949966881 title "Bayesian Policy Reuse" @default.
- W2949966881 cites W1496855202 @default.
- W2949966881 cites W1505168701 @default.
- W2949966881 cites W1964505956 @default.
- W2949966881 cites W1988360637 @default.
- W2949966881 cites W2016223069 @default.
- W2949966881 cites W2031727428 @default.
- W2949966881 cites W2041377611 @default.
- W2949966881 cites W2096195880 @default.
- W2949966881 cites W2097381042 @default.
- W2949966881 cites W2099201756 @default.
- W2949966881 cites W2100785108 @default.
- W2949966881 cites W2115293355 @default.
- W2949966881 cites W2119850747 @default.
- W2949966881 cites W2148224612 @default.
- W2949966881 cites W2148434045 @default.
- W2949966881 cites W2152414052 @default.
- W2949966881 cites W2158807713 @default.
- W2949966881 cites W2168405694 @default.
- W2949966881 cites W2169743339 @default.
- W2949966881 cites W2245825236 @default.
- W2949966881 cites W2295214655 @default.
- W2949966881 cites W2490159962 @default.
- W2949966881 cites W2808217720 @default.
- W2949966881 cites W2952448454 @default.
- W2949966881 cites W3036998056 @default.
- W2949966881 cites W3124229194 @default.
- W2949966881 hasPublicationYear "2015" @default.
- W2949966881 type Work @default.
- W2949966881 sameAs 2949966881 @default.
- W2949966881 citedByCount "0" @default.
- W2949966881 crossrefType "posted-content" @default.
- W2949966881 hasAuthorship W2949966881A5021151663 @default.
- W2949966881 hasAuthorship W2949966881A5068297734 @default.
- W2949966881 hasAuthorship W2949966881A5071122608 @default.
- W2949966881 hasConcept C107457646 @default.
- W2949966881 hasConcept C107673813 @default.
- W2949966881 hasConcept C119857082 @default.
- W2949966881 hasConcept C127413603 @default.
- W2949966881 hasConcept C134306372 @default.
- W2949966881 hasConcept C154945302 @default.
- W2949966881 hasConcept C162324750 @default.
- W2949966881 hasConcept C177264268 @default.
- W2949966881 hasConcept C187736073 @default.
- W2949966881 hasConcept C199360897 @default.
- W2949966881 hasConcept C206588197 @default.
- W2949966881 hasConcept C2777212361 @default.
- W2949966881 hasConcept C2780451532 @default.
- W2949966881 hasConcept C33923547 @default.
- W2949966881 hasConcept C36503486 @default.
- W2949966881 hasConcept C41008148 @default.
- W2949966881 hasConcept C548081761 @default.
- W2949966881 hasConcept C97541855 @default.
- W2949966881 hasConcept C98045186 @default.
- W2949966881 hasConceptScore W2949966881C107457646 @default.
- W2949966881 hasConceptScore W2949966881C107673813 @default.
- W2949966881 hasConceptScore W2949966881C119857082 @default.
- W2949966881 hasConceptScore W2949966881C127413603 @default.
- W2949966881 hasConceptScore W2949966881C134306372 @default.
- W2949966881 hasConceptScore W2949966881C154945302 @default.
- W2949966881 hasConceptScore W2949966881C162324750 @default.
- W2949966881 hasConceptScore W2949966881C177264268 @default.
- W2949966881 hasConceptScore W2949966881C187736073 @default.
- W2949966881 hasConceptScore W2949966881C199360897 @default.
- W2949966881 hasConceptScore W2949966881C206588197 @default.
- W2949966881 hasConceptScore W2949966881C2777212361 @default.
- W2949966881 hasConceptScore W2949966881C2780451532 @default.
- W2949966881 hasConceptScore W2949966881C33923547 @default.
- W2949966881 hasConceptScore W2949966881C36503486 @default.
- W2949966881 hasConceptScore W2949966881C41008148 @default.
- W2949966881 hasConceptScore W2949966881C548081761 @default.
- W2949966881 hasConceptScore W2949966881C97541855 @default.
- W2949966881 hasConceptScore W2949966881C98045186 @default.
- W2949966881 hasLocation W29499668811 @default.
- W2949966881 hasOpenAccess W2949966881 @default.
- W2949966881 hasPrimaryLocation W29499668811 @default.
- W2949966881 hasRelatedWork W1744884320 @default.
- W2949966881 hasRelatedWork W203338875 @default.
- W2949966881 hasRelatedWork W2158641818 @default.
- W2949966881 hasRelatedWork W2261891975 @default.
- W2949966881 hasRelatedWork W2503646666 @default.
- W2949966881 hasRelatedWork W2551277100 @default.
- W2949966881 hasRelatedWork W2613603362 @default.
- W2949966881 hasRelatedWork W2768663371 @default.
- W2949966881 hasRelatedWork W2907704766 @default.
- W2949966881 hasRelatedWork W2909588145 @default.
- W2949966881 hasRelatedWork W2950197980 @default.
- W2949966881 hasRelatedWork W2991156573 @default.
- W2949966881 hasRelatedWork W3022124161 @default.
- W2949966881 hasRelatedWork W3035608172 @default.
- W2949966881 hasRelatedWork W3080901109 @default.
- W2949966881 hasRelatedWork W3170823761 @default.
- W2949966881 hasRelatedWork W3174070947 @default.
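
The abstract above describes choosing a policy from a library using observed signals that correlate with policy performance, while maintaining a Bayesian belief over which task the agent is facing. The following is a minimal sketch of that loop, not the authors' implementation. It assumes a finite set of task types, a Gaussian performance model with known means and a shared standard deviation, and a signal equal to the observed episode utility. The selection rule is a plain greedy expected-utility choice, a simplification standing in for the exploration-aware selection functions the abstract mentions. All names (`MEAN_UTILITY`, `STD`, `select_policy`, `update_belief`, `run_demo`) and all numbers are hypothetical.

```python
import math

# Hypothetical performance model: MEAN_UTILITY[t][p] is the assumed mean
# utility of library policy p on task type t (made-up numbers).
MEAN_UTILITY = [
    [1.0, 0.2, 0.1],
    [0.2, 0.9, 0.2],
    [0.1, 0.3, 0.8],
]
STD = 0.1  # assumed shared standard deviation of the utility signal


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def select_policy(belief, mean_utility):
    """Greedy choice: policy with the highest expected utility under belief.

    This is a simplification; it does not explore the library.
    """
    n_policies = len(mean_utility[0])
    expected = [
        sum(belief[t] * mean_utility[t][p] for t in range(len(belief)))
        for p in range(n_policies)
    ]
    return max(range(n_policies), key=lambda p: expected[p])


def update_belief(belief, policy, signal, mean_utility, std):
    """Bayes rule over task types after observing a signal for `policy`."""
    unnormalised = [
        belief[t] * gaussian_pdf(signal, mean_utility[t][policy], std)
        for t in range(len(belief))
    ]
    total = sum(unnormalised)
    if total == 0.0:
        # Signal is implausible under every task type: keep the old belief.
        return list(belief)
    return [u / total for u in unnormalised]


def run_demo(true_task, episodes):
    """Run a few episodes against a noise-free stand-in for the environment."""
    n_tasks = len(MEAN_UTILITY)
    belief = [1.0 / n_tasks] * n_tasks  # uniform prior over task types
    chosen = []
    for _ in range(episodes):
        policy = select_policy(belief, MEAN_UTILITY)
        # Noise-free signal so the demo is deterministic (an assumption).
        signal = MEAN_UTILITY[true_task][policy]
        belief = update_belief(belief, policy, signal, MEAN_UTILITY, STD)
        chosen.append(policy)
    return belief, chosen


if __name__ == "__main__":
    final_belief, chosen_policies = run_demo(true_task=2, episodes=3)
    print(chosen_policies, [round(b, 3) for b in final_belief])
```

In this toy setting the belief concentrates on the true task type within a couple of episodes, which mirrors the short-duration, rapid-convergence regime the abstract targets. A real library would need a performance model estimated from the offline training phase rather than hand-written means.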