Matches in SemOpenAlex for { <https://semopenalex.org/work/W1980241579> ?p ?o ?g. }
- W1980241579 abstract "Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. An important RL subtopic is to approximate this function when the system is too large for an exact representation. This survey reviews and unifies state of the art methods for parametric value function approximation by grouping them into three main categories: bootstrapping, residuals and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific way to minimize it, almost always a stochastic gradient descent or a recursive least-squares approach." @default.
- W1980241579 created "2016-06-24" @default.
- W1980241579 creator A5004267040 @default.
- W1980241579 creator A5065100569 @default.
- W1980241579 date "2011-04-01" @default.
- W1980241579 modified "2023-10-13" @default.
- W1980241579 title "Parametric value function approximation: A unified view" @default.
- W1980241579 cites W1507222174 @default.
- W1980241579 cites W1547105496 @default.
- W1980241579 cites W1646707810 @default.
- W1980241579 cites W2007240904 @default.
- W1980241579 cites W2007655795 @default.
- W1980241579 cites W2012547817 @default.
- W1980241579 cites W2019172585 @default.
- W1980241579 cites W2027648864 @default.
- W1980241579 cites W2044271447 @default.
- W1980241579 cites W2062541405 @default.
- W1980241579 cites W2072931156 @default.
- W1980241579 cites W2075268401 @default.
- W1980241579 cites W2082997771 @default.
- W1980241579 cites W2104753538 @default.
- W1980241579 cites W2105934661 @default.
- W1980241579 cites W2109504867 @default.
- W1980241579 cites W2112264645 @default.
- W1980241579 cites W2118556122 @default.
- W1980241579 cites W2127530277 @default.
- W1980241579 cites W2132351269 @default.
- W1980241579 cites W2139418546 @default.
- W1980241579 cites W2141562047 @default.
- W1980241579 cites W2147839652 @default.
- W1980241579 cites W2153290280 @default.
- W1980241579 cites W2156974606 @default.
- W1980241579 cites W2158984235 @default.
- W1980241579 cites W2334782222 @default.
- W1980241579 cites W2787259794 @default.
- W1980241579 cites W3198350258 @default.
- W1980241579 cites W4245296547 @default.
- W1980241579 cites W4362203700 @default.
- W1980241579 doi "https://doi.org/10.1109/adprl.2011.5967355" @default.
- W1980241579 hasPublicationYear "2011" @default.
- W1980241579 type Work @default.
- W1980241579 sameAs 1980241579 @default.
- W1980241579 citedByCount "22" @default.
- W1980241579 countsByYear W19802415792012 @default.
- W1980241579 countsByYear W19802415792013 @default.
- W1980241579 countsByYear W19802415792014 @default.
- W1980241579 countsByYear W19802415792015 @default.
- W1980241579 countsByYear W19802415792017 @default.
- W1980241579 countsByYear W19802415792019 @default.
- W1980241579 countsByYear W19802415792020 @default.
- W1980241579 countsByYear W19802415792023 @default.
- W1980241579 crossrefType "proceedings-article" @default.
- W1980241579 hasAuthorship W1980241579A5004267040 @default.
- W1980241579 hasAuthorship W1980241579A5065100569 @default.
- W1980241579 hasBestOaLocation W19802415792 @default.
- W1980241579 hasConcept C105795698 @default.
- W1980241579 hasConcept C117251300 @default.
- W1980241579 hasConcept C119857082 @default.
- W1980241579 hasConcept C126255220 @default.
- W1980241579 hasConcept C14036430 @default.
- W1980241579 hasConcept C14646407 @default.
- W1980241579 hasConcept C149782125 @default.
- W1980241579 hasConcept C154945302 @default.
- W1980241579 hasConcept C17744445 @default.
- W1980241579 hasConcept C199539241 @default.
- W1980241579 hasConcept C206688291 @default.
- W1980241579 hasConcept C207609745 @default.
- W1980241579 hasConcept C26517878 @default.
- W1980241579 hasConcept C2776291640 @default.
- W1980241579 hasConcept C2776359362 @default.
- W1980241579 hasConcept C33923547 @default.
- W1980241579 hasConcept C38652104 @default.
- W1980241579 hasConcept C41008148 @default.
- W1980241579 hasConcept C50644808 @default.
- W1980241579 hasConcept C55479107 @default.
- W1980241579 hasConcept C78458016 @default.
- W1980241579 hasConcept C86803240 @default.
- W1980241579 hasConcept C91873725 @default.
- W1980241579 hasConcept C94625758 @default.
- W1980241579 hasConcept C97541855 @default.
- W1980241579 hasConceptScore W1980241579C105795698 @default.
- W1980241579 hasConceptScore W1980241579C117251300 @default.
- W1980241579 hasConceptScore W1980241579C119857082 @default.
- W1980241579 hasConceptScore W1980241579C126255220 @default.
- W1980241579 hasConceptScore W1980241579C14036430 @default.
- W1980241579 hasConceptScore W1980241579C14646407 @default.
- W1980241579 hasConceptScore W1980241579C149782125 @default.
- W1980241579 hasConceptScore W1980241579C154945302 @default.
- W1980241579 hasConceptScore W1980241579C17744445 @default.
- W1980241579 hasConceptScore W1980241579C199539241 @default.
- W1980241579 hasConceptScore W1980241579C206688291 @default.
- W1980241579 hasConceptScore W1980241579C207609745 @default.
- W1980241579 hasConceptScore W1980241579C26517878 @default.
- W1980241579 hasConceptScore W1980241579C2776291640 @default.
- W1980241579 hasConceptScore W1980241579C2776359362 @default.
- W1980241579 hasConceptScore W1980241579C33923547 @default.
- W1980241579 hasConceptScore W1980241579C38652104 @default.
- W1980241579 hasConceptScore W1980241579C41008148 @default.
- W1980241579 hasConceptScore W1980241579C50644808 @default.
- W1980241579 hasConceptScore W1980241579C55479107 @default.