SemOpenAlex

Matches in SemOpenAlex for { <https://semopenalex.org/work/W2617291496> ?p ?o ?g. }

W2617291496 abstract "Reinforcement learning is a general and powerful framework with which to study and implement artificial intelligence. Recent advances in deep learning have enabled RL algorithms to achieve impressive performance in restricted domains such as playing Atari video games (Mnih et al., 2015) and, recently, the board game Go (Silver et al., 2016). However, we are still far from constructing a generally intelligent agent. Many of the obstacles and open questions are conceptual: What does it mean to be intelligent? How does one explore and learn optimally in general, unknown environments? What, in fact, does it mean to be optimal in the general sense? The universal Bayesian agent AIXI (Hutter, 2005) is a model of a maximally intelligent agent, and plays a central role in the sub-field of general reinforcement learning (GRL). Recently, AIXI has been shown to be flawed in important ways; it doesn't explore enough to be asymptotically optimal (Orseau, 2010), and it can perform poorly with certain priors (Leike and Hutter, 2015). Several variants of AIXI have been proposed to attempt to address these shortfalls: among them are entropy-seeking agents (Orseau, 2011), knowledge-seeking agents (Orseau et al., 2013), Bayes with bursts of exploration (Lattimore, 2013), MDL agents (Leike, 2016a), Thompson sampling (Leike et al., 2016), and optimism (Sunehag and Hutter, 2015). We present AIXIjs, a JavaScript implementation of these GRL agents. This implementation is accompanied by a framework for running experiments against various environments, similar to OpenAI Gym (Brockman et al., 2016), and a suite of interactive demos that explore different properties of the agents, similar to REINFORCEjs (Karpathy, 2015). We use AIXIjs to present numerous experiments illustrating fundamental properties of, and differences between, these agents." @default.
W2617291496 created "2017-06-05" @default.
W2617291496 creator A5029807267 @default.
W2617291496 date "2017-05-22" @default.
W2617291496 modified "2023-09-27" @default.
W2617291496 title "AIXIjs: A Software Demo for General Reinforcement Learning." @default.
W2617291496 cites W1577509784 @default.
W2617291496 cites W1582436621 @default.
W2617291496 cites W1876044947 @default.
W2617291496 cites W2039522160 @default.
W2617291496 cites W2076063813 @default.
W2617291496 cites W2155027007 @default.
W2617291496 cites W2171084228 @default.
W2617291496 cites W2915054436 @default.
W2617291496 cites W2963080043 @default.
W2617291496 cites W3172851577 @default.
W2617291496 cites W65193931 @default.
W2617291496 hasPublicationYear "2017" @default.
W2617291496 type Work @default.
W2617291496 sameAs 2617291496 @default.
W2617291496 citedByCount "6" @default.
W2617291496 countsByYear W26172914962017 @default.
W2617291496 countsByYear W26172914962019 @default.
W2617291496 countsByYear W26172914962020 @default.
W2617291496 countsByYear W26172914962021 @default.
W2617291496 crossrefType "posted-content" @default.
W2617291496 hasAuthorship W2617291496A5029807267 @default.
W2617291496 hasConcept C107673813 @default.
W2617291496 hasConcept C119857082 @default.
W2617291496 hasConcept C154945302 @default.
W2617291496 hasConcept C166957645 @default.
W2617291496 hasConcept C41008148 @default.
W2617291496 hasConcept C74072328 @default.
W2617291496 hasConcept C79581498 @default.
W2617291496 hasConcept C95457728 @default.
W2617291496 hasConcept C97541855 @default.
W2617291496 hasConceptScore W2617291496C107673813 @default.
W2617291496 hasConceptScore W2617291496C119857082 @default.
W2617291496 hasConceptScore W2617291496C154945302 @default.
W2617291496 hasConceptScore W2617291496C166957645 @default.
W2617291496 hasConceptScore W2617291496C41008148 @default.
W2617291496 hasConceptScore W2617291496C74072328 @default.
W2617291496 hasConceptScore W2617291496C79581498 @default.
W2617291496 hasConceptScore W2617291496C95457728 @default.
W2617291496 hasConceptScore W2617291496C97541855 @default.
W2617291496 hasLocation W26172914961 @default.
W2617291496 hasOpenAccess W2617291496 @default.
W2617291496 hasPrimaryLocation W26172914961 @default.
W2617291496 hasRelatedWork W1425067287 @default.
W2617291496 hasRelatedWork W1492402092 @default.
W2617291496 hasRelatedWork W1553200036 @default.
W2617291496 hasRelatedWork W1953389919 @default.
W2617291496 hasRelatedWork W2013391942 @default.
W2617291496 hasRelatedWork W2063722002 @default.
W2617291496 hasRelatedWork W2091532789 @default.
W2617291496 hasRelatedWork W2118075434 @default.
W2617291496 hasRelatedWork W2395532075 @default.
W2617291496 hasRelatedWork W2793662597 @default.
W2617291496 hasRelatedWork W2805516822 @default.
W2617291496 hasRelatedWork W3011590047 @default.
W2617291496 hasRelatedWork W3037034483 @default.
W2617291496 hasRelatedWork W3091373078 @default.
W2617291496 hasRelatedWork W3104851248 @default.
W2617291496 hasRelatedWork W3121562452 @default.
W2617291496 hasRelatedWork W3134924952 @default.
W2617291496 hasRelatedWork W3174514644 @default.
W2617291496 hasRelatedWork W3183850949 @default.
W2617291496 hasRelatedWork W51055497 @default.
W2617291496 isParatext "false" @default.
W2617291496 isRetracted "false" @default.
W2617291496 magId "2617291496" @default.
W2617291496 workType "article" @default.