Matches in SemOpenAlex for { <https://semopenalex.org/work/W50830905> ?p ?o ?g. }
- W50830905 abstract "This dissertation addresses the problem of learning to act in an unknown and uncertain world. This is a difficult problem. Even if a world model is available, an assumption not made here, it is known to be intractable to learn an optimal policy for controlling behaviour (Littman 1996). Assuming no world model is known leads to two approaches: model-free learning, which attempts to learn to act without a model of the environment, and model learning, which attempts to learn a model of the environment from interactions with the world. Most earlier approaches make a priori assumptions about the complexity of the model or policy required, the upshot of which is that a fixed amount of memory is available to the agent. It is well known that in a noisy environment, the type assumed within, an environment-specific amount of memory is required to act optimally. Fixing the capacity of memory before any interactions have occurred is thus a limiting assumption. The theme of this dissertation is that representing multiple policies or environment models of varying size enables us to address this problem. Both model-free learning and model learning are investigated. For the former, I present a policy search method (usable with a wide range of algorithms) that maintains a population of policies of varying size. By sharing information between policies I show that it can learn near-optimal policies for a variety of challenging problems, and that performance is significantly improved over using the same amount of computation without information sharing. I investigate two approaches to model learning. The first is a variational Bayesian method for learning POMDPs. I show that it achieves superior results to the Bayes-adaptive algorithm (Ross, Chaib-draa and Pineau 2007) using their experimental setup. However, this experimental setup makes strong assumptions about prior information, and I show that weakening these assumptions leads to poor performance.
I then address model learning for a simpler model, a topological map. I develop a novel non-parametric Bayesian map that sets no limit on the model size, and show experimentally that maps can be learned from robot data with weak prior knowledge." @default.
- W50830905 created "2016-06-24" @default.
- W50830905 creator A5055382092 @default.
- W50830905 date "2011-07-01" @default.
- W50830905 modified "2023-09-27" @default.
- W50830905 title "Learning and acting in unknown and uncertain worlds" @default.
- W50830905 cites W1176136657 @default.
- W50830905 cites W1486341833 @default.
- W50830905 cites W1489119587 @default.
- W50830905 cites W1489258026 @default.
- W50830905 cites W1496855202 @default.
- W50830905 cites W1515891729 @default.
- W50830905 cites W1517266559 @default.
- W50830905 cites W1526654727 @default.
- W50830905 cites W1529851927 @default.
- W50830905 cites W1535392723 @default.
- W50830905 cites W1539054658 @default.
- W50830905 cites W1540337045 @default.
- W50830905 cites W1541084404 @default.
- W50830905 cites W1557073320 @default.
- W50830905 cites W1564755532 @default.
- W50830905 cites W1568770747 @default.
- W50830905 cites W1570690983 @default.
- W50830905 cites W1572411218 @default.
- W50830905 cites W158205031 @default.
- W50830905 cites W1583380718 @default.
- W50830905 cites W1594297126 @default.
- W50830905 cites W1594783240 @default.
- W50830905 cites W1599296339 @default.
- W50830905 cites W1617610651 @default.
- W50830905 cites W1640774615 @default.
- W50830905 cites W1657542410 @default.
- W50830905 cites W1657674574 @default.
- W50830905 cites W1677409904 @default.
- W50830905 cites W1683541793 @default.
- W50830905 cites W1687873425 @default.
- W50830905 cites W1712653483 @default.
- W50830905 cites W1766015196 @default.
- W50830905 cites W180325379 @default.
- W50830905 cites W181022050 @default.
- W50830905 cites W1850488217 @default.
- W50830905 cites W1914583973 @default.
- W50830905 cites W195465510 @default.
- W50830905 cites W1961810468 @default.
- W50830905 cites W1967687583 @default.
- W50830905 cites W1970789124 @default.
- W50830905 cites W1984205520 @default.
- W50830905 cites W1987532879 @default.
- W50830905 cites W1994920552 @default.
- W50830905 cites W2011671528 @default.
- W50830905 cites W2013391942 @default.
- W50830905 cites W2018122853 @default.
- W50830905 cites W2020999234 @default.
- W50830905 cites W2024060531 @default.
- W50830905 cites W2031594824 @default.
- W50830905 cites W2034725503 @default.
- W50830905 cites W2049633694 @default.
- W50830905 cites W2050234982 @default.
- W50830905 cites W2052014837 @default.
- W50830905 cites W2052627427 @default.
- W50830905 cites W2064675550 @default.
- W50830905 cites W2069429561 @default.
- W50830905 cites W2079987671 @default.
- W50830905 cites W2081547991 @default.
- W50830905 cites W2083875149 @default.
- W50830905 cites W2089484716 @default.
- W50830905 cites W2096533821 @default.
- W50830905 cites W2097936260 @default.
- W50830905 cites W2099291701 @default.
- W50830905 cites W2102379184 @default.
- W50830905 cites W2103581399 @default.
- W50830905 cites W2104209160 @default.
- W50830905 cites W2104368525 @default.
- W50830905 cites W2106008679 @default.
- W50830905 cites W2107726111 @default.
- W50830905 cites W2108519123 @default.
- W50830905 cites W2113958614 @default.
- W50830905 cites W2115979064 @default.
- W50830905 cites W2119567691 @default.
- W50830905 cites W2122410182 @default.
- W50830905 cites W2123372395 @default.
- W50830905 cites W2125523964 @default.
- W50830905 cites W2125792273 @default.
- W50830905 cites W2125838338 @default.
- W50830905 cites W2128002512 @default.
- W50830905 cites W2128775537 @default.
- W50830905 cites W2130801532 @default.
- W50830905 cites W2134802714 @default.
- W50830905 cites W2135194391 @default.
- W50830905 cites W2139302369 @default.
- W50830905 cites W2140881794 @default.
- W50830905 cites W2141690016 @default.
- W50830905 cites W2144794447 @default.
- W50830905 cites W2144824356 @default.
- W50830905 cites W2144913588 @default.
- W50830905 cites W2147000823 @default.
- W50830905 cites W2147249804 @default.
- W50830905 cites W2148067905 @default.
- W50830905 cites W2151040408 @default.
- W50830905 cites W2151103935 @default.