Matches in SemOpenAlex for { <https://semopenalex.org/work/W3200996868> ?p ?o ?g. }
Showing items 1 to 92 of 92 with 100 items per page.
- W3200996868 abstract "There is a strong link between the general concept of intelligence and the ability to collect and use information. The theory of Bayes-adaptive exploration offers an attractive optimality framework for training machines to perform complex information gathering tasks. However, the computational complexity of the resulting optimal control problem has limited the diffusion of the theory to mainstream deep AI research. In this paper we exploit the inherent mathematical structure of Bayes-adaptive problems in order to dramatically simplify the problem by making the reward structure denser while simultaneously decoupling the learning of exploitation and exploration policies. The key to this simplification comes from the novel concept of cross-value (i.e. the value of being in an environment while acting optimally according to another), which we use to quantify the value of currently available information. This results in a new denser reward structure that cashes in all future rewards that can be predicted from the current information state. In a set of experiments we show that the approach makes it possible to learn challenging information gathering tasks without the use of shaping and heuristic bonuses in situations where the standard RL algorithms fail." @default.
- W3200996868 created "2021-09-27" @default.
- W3200996868 creator A5039391126 @default.
- W3200996868 date "2021-09-17" @default.
- W3200996868 modified "2023-09-27" @default.
- W3200996868 title "Knowledge is reward: Learning optimal exploration by predictive reward cashing" @default.
- W3200996868 cites W1579979603 @default.
- W3200996868 cites W172298727 @default.
- W3200996868 cites W2020920737 @default.
- W3200996868 cites W2028169578 @default.
- W3200996868 cites W2101524054 @default.
- W3200996868 cites W2144558232 @default.
- W3200996868 cites W2144794447 @default.
- W3200996868 cites W2149586740 @default.
- W3200996868 cites W2157477959 @default.
- W3200996868 cites W2268119109 @default.
- W3200996868 cites W22928002 @default.
- W3200996868 cites W2327833721 @default.
- W3200996868 cites W2463330366 @default.
- W3200996868 cites W2550182557 @default.
- W3200996868 cites W2561776174 @default.
- W3200996868 cites W2572702041 @default.
- W3200996868 cites W2578206533 @default.
- W3200996868 cites W2796447411 @default.
- W3200996868 cites W2905606790 @default.
- W3200996868 cites W2949931500 @default.
- W3200996868 cites W2962851448 @default.
- W3200996868 cites W2963067607 @default.
- W3200996868 cites W2963160877 @default.
- W3200996868 cites W2963176272 @default.
- W3200996868 cites W2963276097 @default.
- W3200996868 cites W2963797557 @default.
- W3200996868 cites W2963938771 @default.
- W3200996868 cites W2964067469 @default.
- W3200996868 cites W2964174623 @default.
- W3200996868 cites W2996148148 @default.
- W3200996868 cites W3092688116 @default.
- W3200996868 cites W3123298421 @default.
- W3200996868 cites W3153792329 @default.
- W3200996868 cites W3166906164 @default.
- W3200996868 cites W3169864268 @default.
- W3200996868 cites W779494576 @default.
- W3200996868 cites W88520345 @default.
- W3200996868 hasPublicationYear "2021" @default.
- W3200996868 type Work @default.
- W3200996868 sameAs 3200996868 @default.
- W3200996868 citedByCount "0" @default.
- W3200996868 crossrefType "posted-content" @default.
- W3200996868 hasAuthorship W3200996868A5039391126 @default.
- W3200996868 hasConcept C119857082 @default.
- W3200996868 hasConcept C154945302 @default.
- W3200996868 hasConcept C165696696 @default.
- W3200996868 hasConcept C173801870 @default.
- W3200996868 hasConcept C177264268 @default.
- W3200996868 hasConcept C199360897 @default.
- W3200996868 hasConcept C38652104 @default.
- W3200996868 hasConcept C41008148 @default.
- W3200996868 hasConceptScore W3200996868C119857082 @default.
- W3200996868 hasConceptScore W3200996868C154945302 @default.
- W3200996868 hasConceptScore W3200996868C165696696 @default.
- W3200996868 hasConceptScore W3200996868C173801870 @default.
- W3200996868 hasConceptScore W3200996868C177264268 @default.
- W3200996868 hasConceptScore W3200996868C199360897 @default.
- W3200996868 hasConceptScore W3200996868C38652104 @default.
- W3200996868 hasConceptScore W3200996868C41008148 @default.
- W3200996868 hasLocation W32009968681 @default.
- W3200996868 hasOpenAccess W3200996868 @default.
- W3200996868 hasPrimaryLocation W32009968681 @default.
- W3200996868 hasRelatedWork W2183087363 @default.
- W3200996868 hasRelatedWork W2795908317 @default.
- W3200996868 hasRelatedWork W2806124528 @default.
- W3200996868 hasRelatedWork W2899041500 @default.
- W3200996868 hasRelatedWork W2949682451 @default.
- W3200996868 hasRelatedWork W2950722223 @default.
- W3200996868 hasRelatedWork W2978242174 @default.
- W3200996868 hasRelatedWork W2998241503 @default.
- W3200996868 hasRelatedWork W3005607450 @default.
- W3200996868 hasRelatedWork W3085832734 @default.
- W3200996868 hasRelatedWork W3090196474 @default.
- W3200996868 hasRelatedWork W3159757479 @default.
- W3200996868 hasRelatedWork W3168815054 @default.
- W3200996868 hasRelatedWork W3183316534 @default.
- W3200996868 hasRelatedWork W3200054052 @default.
- W3200996868 hasRelatedWork W3203207428 @default.
- W3200996868 hasRelatedWork W3210095716 @default.
- W3200996868 hasRelatedWork W76760840 @default.
- W3200996868 hasRelatedWork W3095548673 @default.
- W3200996868 hasRelatedWork W3166580013 @default.
- W3200996868 isParatext "false" @default.
- W3200996868 isRetracted "false" @default.
- W3200996868 magId "3200996868" @default.
- W3200996868 workType "article" @default.
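
The abstract above defines the cross-value as the value of being in one environment while acting optimally according to another. As an illustration only, here is a minimal sketch of that definition under assumptions the record does not state: small tabular MDPs, a discount factor `gamma`, and plain value iteration. The function names and the list-based MDP encoding are hypothetical; this is not the paper's implementation.

```python
# Assumed encoding (hypothetical): an environment is a pair (P, R) where
# P[s][a] is a list of next-state probabilities and R[s][a] is the
# expected immediate reward for taking action a in state s.

def q_value(P, R, V, s, a, gamma):
    """One-step lookahead value of action a in state s given values V."""
    return R[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))

def optimal_policy(P, R, gamma=0.9, iters=500):
    """Greedy policy after a fixed number of value-iteration sweeps."""
    n = len(P)
    V = [0.0] * n
    for _ in range(iters):
        V = [max(q_value(P, R, V, s, a, gamma) for a in range(len(P[s])))
             for s in range(n)]
    return [max(range(len(P[s])), key=lambda a: q_value(P, R, V, s, a, gamma))
            for s in range(n)]

def policy_value(P, R, policy, gamma=0.9, iters=500):
    """Iterative evaluation of a fixed deterministic policy."""
    n = len(P)
    V = [0.0] * n
    for _ in range(iters):
        V = [q_value(P, R, V, s, policy[s], gamma) for s in range(n)]
    return V

def cross_value(env_i, env_j, gamma=0.9):
    """Value, in env_i, of following the policy that is optimal for env_j."""
    P_i, R_i = env_i
    P_j, R_j = env_j
    return policy_value(P_i, R_i, optimal_policy(P_j, R_j, gamma), gamma)
```

With two one-state environments that reward opposite actions, the cross-value of an environment with itself is its ordinary optimal value, while the cross-value with the other environment is lower. That gap is the kind of quantity the abstract describes as measuring the value of the currently available information.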