Matches in SemOpenAlex for { <https://semopenalex.org/work/W3168182227> ?p ?o ?g. }
- W3168182227 abstract "Biological agents have meaningful interactions with their environment despite the absence of immediate reward signals. In such instances, the agent can learn preferred modes of behaviour that lead to predictable states -- necessary for survival. In this paper, we pursue the notion that this learnt behaviour can be a consequence of reward-free preference learning that ensures an appropriate trade-off between exploration and preference satisfaction. For this, we introduce a model-based Bayesian agent equipped with a preference learning mechanism (Pepper) using conjugate priors. These conjugate priors are used to augment the expected free energy planner for learning preferences over states (or outcomes) across time. Importantly, our approach enables the agent to learn preferences that encourage adaptive behaviour at test time. We illustrate this in the OpenAI Gym FrozenLake and the 3D mini-world environments -- with and without volatility. Given a constant environment, these agents learn confident (i.e., precise) preferences and act to satisfy them. Conversely, in a volatile setting, perpetual preference uncertainty maintains exploratory behaviour. Our experiments suggest that learnable (reward-free) preferences entail a trade-off between exploration and preference satisfaction. Pepper offers a straightforward framework suitable for designing adaptive agents when reward functions cannot be predefined, as in real environments." @default.
- W3168182227 created "2021-06-22" @default.
- W3168182227 creator A5066982052 @default.
- W3168182227 creator A5071633664 @default.
- W3168182227 creator A5072080630 @default.
- W3168182227 creator A5077250049 @default.
- W3168182227 creator A5080525121 @default.
- W3168182227 date "2021-06-08" @default.
- W3168182227 modified "2023-09-27" @default.
- W3168182227 title "Exploration and preference satisfaction trade-off in reward-free learning" @default.
- W3168182227 cites W1486707268 @default.
- W3168182227 cites W1509073079 @default.
- W3168182227 cites W1522301498 @default.
- W3168182227 cites W1524895715 @default.
- W3168182227 cites W1591713425 @default.
- W3168182227 cites W172298727 @default.
- W3168182227 cites W1763311249 @default.
- W3168182227 cites W2000514530 @default.
- W3168182227 cites W2020920737 @default.
- W3168182227 cites W2034806191 @default.
- W3168182227 cites W2055789905 @default.
- W3168182227 cites W2056769976 @default.
- W3168182227 cites W2089381994 @default.
- W3168182227 cites W2099505143 @default.
- W3168182227 cites W2105232223 @default.
- W3168182227 cites W2111030512 @default.
- W3168182227 cites W2121863487 @default.
- W3168182227 cites W2138621090 @default.
- W3168182227 cites W2139612737 @default.
- W3168182227 cites W2148764920 @default.
- W3168182227 cites W2154022540 @default.
- W3168182227 cites W2157331557 @default.
- W3168182227 cites W2164424353 @default.
- W3168182227 cites W2170899200 @default.
- W3168182227 cites W2181068523 @default.
- W3168182227 cites W2188721763 @default.
- W3168182227 cites W2417786368 @default.
- W3168182227 cites W2495467174 @default.
- W3168182227 cites W2552810632 @default.
- W3168182227 cites W2584057084 @default.
- W3168182227 cites W2781668895 @default.
- W3168182227 cites W2781726626 @default.
- W3168182227 cites W2795495912 @default.
- W3168182227 cites W2892266804 @default.
- W3168182227 cites W2899205164 @default.
- W3168182227 cites W2900152462 @default.
- W3168182227 cites W2907537824 @default.
- W3168182227 cites W2950872548 @default.
- W3168182227 cites W2963238274 @default.
- W3168182227 cites W2963523627 @default.
- W3168182227 cites W2963820385 @default.
- W3168182227 cites W2976973594 @default.
- W3168182227 cites W2995298643 @default.
- W3168182227 cites W3015971707 @default.
- W3168182227 cites W3029413790 @default.
- W3168182227 cites W3033115783 @default.
- W3168182227 cites W3034781633 @default.
- W3168182227 cites W3034973310 @default.
- W3168182227 cites W3035599863 @default.
- W3168182227 cites W3036498527 @default.
- W3168182227 cites W3095485376 @default.
- W3168182227 cites W3113994363 @default.
- W3168182227 cites W3118769118 @default.
- W3168182227 cites W3122690883 @default.
- W3168182227 cites W779494576 @default.
- W3168182227 hasPublicationYear "2021" @default.
- W3168182227 type Work @default.
- W3168182227 sameAs 3168182227 @default.
- W3168182227 citedByCount "0" @default.
- W3168182227 crossrefType "posted-content" @default.
- W3168182227 hasAuthorship W3168182227A5066982052 @default.
- W3168182227 hasAuthorship W3168182227A5071633664 @default.
- W3168182227 hasAuthorship W3168182227A5072080630 @default.
- W3168182227 hasAuthorship W3168182227A5077250049 @default.
- W3168182227 hasAuthorship W3168182227A5080525121 @default.
- W3168182227 hasConcept C107673813 @default.
- W3168182227 hasConcept C154945302 @default.
- W3168182227 hasConcept C162324750 @default.
- W3168182227 hasConcept C175444787 @default.
- W3168182227 hasConcept C177769412 @default.
- W3168182227 hasConcept C26004113 @default.
- W3168182227 hasConcept C2776999362 @default.
- W3168182227 hasConcept C2781249084 @default.
- W3168182227 hasConcept C41008148 @default.
- W3168182227 hasConcept C97541855 @default.
- W3168182227 hasConceptScore W3168182227C107673813 @default.
- W3168182227 hasConceptScore W3168182227C154945302 @default.
- W3168182227 hasConceptScore W3168182227C162324750 @default.
- W3168182227 hasConceptScore W3168182227C175444787 @default.
- W3168182227 hasConceptScore W3168182227C177769412 @default.
- W3168182227 hasConceptScore W3168182227C26004113 @default.
- W3168182227 hasConceptScore W3168182227C2776999362 @default.
- W3168182227 hasConceptScore W3168182227C2781249084 @default.
- W3168182227 hasConceptScore W3168182227C41008148 @default.
- W3168182227 hasConceptScore W3168182227C97541855 @default.
- W3168182227 hasLocation W31681822271 @default.
- W3168182227 hasOpenAccess W3168182227 @default.
- W3168182227 hasPrimaryLocation W31681822271 @default.
- W3168182227 hasRelatedWork W165425597 @default.
- W3168182227 hasRelatedWork W2040513777 @default.