Matches in SemOpenAlex for { <https://semopenalex.org/work/W172667874> ?p ?o ?g. }
Showing items 1 to 88 of 88 with 100 items per page.
- W172667874 endingPage "226" @default.
- W172667874 startingPage "211" @default.
- W172667874 abstract "This paper investigates a reinforcement learning method that combines learning a model of the environment with least-squares policy iteration (LSPI). The LSPI algorithm learns a linear approximation of the optimal state-action value function; the idea studied here is to let this value function depend on a learned estimate of the expected next state instead of directly on the current state and action. This approach makes it easier to define useful basis functions, and hence to learn a useful linear approximation of the value function. Experiments show that the new algorithm, called NSPI for next-state policy iteration, performs well on two standard benchmarks, the well-known mountain car and inverted pendulum swing-up tasks. More importantly, the NSPI algorithm performs well, and better than a specialized recent method, on a resource management task known as the day-ahead wind commitment problem. This latter task has action and state spaces that are high-dimensional and continuous." @default.
- W172667874 created "2016-06-24" @default.
- W172667874 creator A5010127861 @default.
- W172667874 creator A5038246567 @default.
- W172667874 date "2012-01-01" @default.
- W172667874 modified "2023-10-14" @default.
- W172667874 title "Policy Iteration Based on a Learned Transition Model" @default.
- W172667874 cites W1998172110 @default.
- W172667874 cites W2026503511 @default.
- W172667874 cites W2041684917 @default.
- W172667874 cites W2068052921 @default.
- W172667874 cites W2108281899 @default.
- W172667874 cites W2123979492 @default.
- W172667874 cites W2131831090 @default.
- W172667874 cites W2158462832 @default.
- W172667874 cites W2171890636 @default.
- W172667874 cites W4214717370 @default.
- W172667874 cites W4235282977 @default.
- W172667874 cites W965082120 @default.
- W172667874 doi "https://doi.org/10.1007/978-3-642-33486-3_14" @default.
- W172667874 hasPublicationYear "2012" @default.
- W172667874 type Work @default.
- W172667874 sameAs 172667874 @default.
- W172667874 citedByCount "4" @default.
- W172667874 countsByYear W1726678742013 @default.
- W172667874 countsByYear W1726678742015 @default.
- W172667874 countsByYear W1726678742021 @default.
- W172667874 crossrefType "book-chapter" @default.
- W172667874 hasAuthorship W172667874A5010127861 @default.
- W172667874 hasAuthorship W172667874A5038246567 @default.
- W172667874 hasBestOaLocation W1726678741 @default.
- W172667874 hasConcept C11413529 @default.
- W172667874 hasConcept C121332964 @default.
- W172667874 hasConcept C126255220 @default.
- W172667874 hasConcept C14036430 @default.
- W172667874 hasConcept C14646407 @default.
- W172667874 hasConcept C154945302 @default.
- W172667874 hasConcept C162324750 @default.
- W172667874 hasConcept C187736073 @default.
- W172667874 hasConcept C2780451532 @default.
- W172667874 hasConcept C2780791683 @default.
- W172667874 hasConcept C33923547 @default.
- W172667874 hasConcept C41008148 @default.
- W172667874 hasConcept C48103436 @default.
- W172667874 hasConcept C50644808 @default.
- W172667874 hasConcept C62520636 @default.
- W172667874 hasConcept C78458016 @default.
- W172667874 hasConcept C86803240 @default.
- W172667874 hasConcept C91873725 @default.
- W172667874 hasConcept C97541855 @default.
- W172667874 hasConceptScore W172667874C11413529 @default.
- W172667874 hasConceptScore W172667874C121332964 @default.
- W172667874 hasConceptScore W172667874C126255220 @default.
- W172667874 hasConceptScore W172667874C14036430 @default.
- W172667874 hasConceptScore W172667874C14646407 @default.
- W172667874 hasConceptScore W172667874C154945302 @default.
- W172667874 hasConceptScore W172667874C162324750 @default.
- W172667874 hasConceptScore W172667874C187736073 @default.
- W172667874 hasConceptScore W172667874C2780451532 @default.
- W172667874 hasConceptScore W172667874C2780791683 @default.
- W172667874 hasConceptScore W172667874C33923547 @default.
- W172667874 hasConceptScore W172667874C41008148 @default.
- W172667874 hasConceptScore W172667874C48103436 @default.
- W172667874 hasConceptScore W172667874C50644808 @default.
- W172667874 hasConceptScore W172667874C62520636 @default.
- W172667874 hasConceptScore W172667874C78458016 @default.
- W172667874 hasConceptScore W172667874C86803240 @default.
- W172667874 hasConceptScore W172667874C91873725 @default.
- W172667874 hasConceptScore W172667874C97541855 @default.
- W172667874 hasLocation W1726678741 @default.
- W172667874 hasOpenAccess W172667874 @default.
- W172667874 hasPrimaryLocation W1726678741 @default.
- W172667874 hasRelatedWork W1624593201 @default.
- W172667874 hasRelatedWork W2025663273 @default.
- W172667874 hasRelatedWork W2072697031 @default.
- W172667874 hasRelatedWork W2155027007 @default.
- W172667874 hasRelatedWork W2416943787 @default.
- W172667874 hasRelatedWork W2734912394 @default.
- W172667874 hasRelatedWork W3124157877 @default.
- W172667874 hasRelatedWork W4240668504 @default.
- W172667874 hasRelatedWork W4285484150 @default.
- W172667874 hasRelatedWork W4289355352 @default.
- W172667874 isParatext "false" @default.
- W172667874 isRetracted "false" @default.
- W172667874 magId "172667874" @default.
- W172667874 workType "book-chapter" @default.