Matches in SemOpenAlex for { <https://semopenalex.org/work/W215298514> ?p ?o ?g. }
- W215298514 endingPage "598" @default.
- W215298514 startingPage "590" @default.
- W215298514 abstract "We consider modelling policies for MDPs in (vector-valued) reproducing kernel Hilbert function spaces (RKHS). This enables us to work “non-parametrically” in a rich function class, and provides the ability to learn complex policies. We present a framework for performing gradient-based policy optimization in the RKHS, deriving the functional gradient of the return for our policy, which has a simple form and can be estimated efficiently. The policy representation naturally focuses on the relevant region of state space defined by the policy trajectories, and does not rely on a priori defined basis points; this can be an advantage in high dimensions where suitable basis points may be difficult to define a priori. The method is adaptive in the sense that the policy representation will naturally adapt to the complexity of the policy being modelled, which is achieved with standard efficient sparsification tools in an RKHS. We argue that finding a good kernel on states can be easier than remetrizing a high-dimensional feature space. We demonstrate the approach on benchmark domains and a simulated quadrocopter navigation task." @default.
- W215298514 created "2016-06-24" @default.
- W215298514 creator A5031943811 @default.
- W215298514 creator A5077227507 @default.
- W215298514 date "2015-02-21" @default.
- W215298514 modified "2023-09-26" @default.
- W215298514 title "Modelling Policies in MDPs in Reproducing Kernel Hilbert Space" @default.
- W215298514 cites W1510073064 @default.
- W215298514 cites W2012392077 @default.
- W215298514 cites W2103581399 @default.
- W215298514 cites W2113913482 @default.
- W215298514 cites W2115003579 @default.
- W215298514 cites W2129809168 @default.
- W215298514 cites W2130801532 @default.
- W215298514 cites W2139053308 @default.
- W215298514 cites W2140135625 @default.
- W215298514 cites W2151693816 @default.
- W215298514 cites W2155027007 @default.
- W215298514 cites W2165150801 @default.
- W215298514 cites W2182304831 @default.
- W215298514 cites W2578657880 @default.
- W215298514 cites W60181528 @default.
- W215298514 hasPublicationYear "2015" @default.
- W215298514 type Work @default.
- W215298514 sameAs 215298514 @default.
- W215298514 citedByCount "12" @default.
- W215298514 countsByYear W2152985142016 @default.
- W215298514 countsByYear W2152985142018 @default.
- W215298514 countsByYear W2152985142019 @default.
- W215298514 countsByYear W2152985142020 @default.
- W215298514 countsByYear W2152985142021 @default.
- W215298514 crossrefType "proceedings-article" @default.
- W215298514 hasAuthorship W215298514A5031943811 @default.
- W215298514 hasAuthorship W215298514A5077227507 @default.
- W215298514 hasConcept C111472728 @default.
- W215298514 hasConcept C118615104 @default.
- W215298514 hasConcept C126255220 @default.
- W215298514 hasConcept C13280743 @default.
- W215298514 hasConcept C134306372 @default.
- W215298514 hasConcept C138885662 @default.
- W215298514 hasConcept C14036430 @default.
- W215298514 hasConcept C142730499 @default.
- W215298514 hasConcept C154945302 @default.
- W215298514 hasConcept C17744445 @default.
- W215298514 hasConcept C185798385 @default.
- W215298514 hasConcept C199539241 @default.
- W215298514 hasConcept C205649164 @default.
- W215298514 hasConcept C2776359362 @default.
- W215298514 hasConcept C33923547 @default.
- W215298514 hasConcept C41008148 @default.
- W215298514 hasConcept C5917680 @default.
- W215298514 hasConcept C62799726 @default.
- W215298514 hasConcept C74193536 @default.
- W215298514 hasConcept C75553542 @default.
- W215298514 hasConcept C78458016 @default.
- W215298514 hasConcept C80884492 @default.
- W215298514 hasConcept C86803240 @default.
- W215298514 hasConcept C94625758 @default.
- W215298514 hasConceptScore W215298514C111472728 @default.
- W215298514 hasConceptScore W215298514C118615104 @default.
- W215298514 hasConceptScore W215298514C126255220 @default.
- W215298514 hasConceptScore W215298514C13280743 @default.
- W215298514 hasConceptScore W215298514C134306372 @default.
- W215298514 hasConceptScore W215298514C138885662 @default.
- W215298514 hasConceptScore W215298514C14036430 @default.
- W215298514 hasConceptScore W215298514C142730499 @default.
- W215298514 hasConceptScore W215298514C154945302 @default.
- W215298514 hasConceptScore W215298514C17744445 @default.
- W215298514 hasConceptScore W215298514C185798385 @default.
- W215298514 hasConceptScore W215298514C199539241 @default.
- W215298514 hasConceptScore W215298514C205649164 @default.
- W215298514 hasConceptScore W215298514C2776359362 @default.
- W215298514 hasConceptScore W215298514C33923547 @default.
- W215298514 hasConceptScore W215298514C41008148 @default.
- W215298514 hasConceptScore W215298514C5917680 @default.
- W215298514 hasConceptScore W215298514C62799726 @default.
- W215298514 hasConceptScore W215298514C74193536 @default.
- W215298514 hasConceptScore W215298514C75553542 @default.
- W215298514 hasConceptScore W215298514C78458016 @default.
- W215298514 hasConceptScore W215298514C80884492 @default.
- W215298514 hasConceptScore W215298514C86803240 @default.
- W215298514 hasConceptScore W215298514C94625758 @default.
- W215298514 hasLocation W2152985141 @default.
- W215298514 hasOpenAccess W215298514 @default.
- W215298514 hasPrimaryLocation W2152985141 @default.
- W215298514 hasRelatedWork W1560724230 @default.
- W215298514 hasRelatedWork W1746819321 @default.
- W215298514 hasRelatedWork W1898511152 @default.
- W215298514 hasRelatedWork W1977655452 @default.
- W215298514 hasRelatedWork W2012587148 @default.
- W215298514 hasRelatedWork W2053559248 @default.
- W215298514 hasRelatedWork W2103568863 @default.
- W215298514 hasRelatedWork W2115003579 @default.
- W215298514 hasRelatedWork W2119717200 @default.
- W215298514 hasRelatedWork W2121863487 @default.
- W215298514 hasRelatedWork W2129809168 @default.
- W215298514 hasRelatedWork W2155027007 @default.
- W215298514 hasRelatedWork W2286609364 @default.