Matches in SemOpenAlex for { <https://semopenalex.org/work/W2397607997> ?p ?o ?g. }
Showing items 1 to 84 of 84 with 100 items per page.
- W2397607997 abstract "We introduce a variant of the classification-based approach to policy iteration which uses a cost-sensitive loss function weighting each classification mistake by its actual regret, that is, the difference between the action-value of the greedy action and of the action chosen by the classifier. For this algorithm, we provide a full finite-sample analysis. Our results state a performance bound in terms of the number of policy improvement steps, the number of rollouts used in each iteration, the capacity of the considered policy space (classifier), and a capacity measure which indicates how well the policy space can approximate policies that are greedy with respect to any of its members. The analysis reveals a tradeoff between the estimation and approximation errors in this classification-based policy iteration setting. Furthermore it confirms the intuition that classification-based policy iteration algorithms could be favorably compared to value-based approaches when the policies can be approximated more easily than their corresponding value functions. We also study the consistency of the algorithm when there exists a sequence of policy spaces with increasing capacity." @default.
- W2397607997 created "2016-06-24" @default.
- W2397607997 creator A5006533777 @default.
- W2397607997 creator A5013843778 @default.
- W2397607997 creator A5014791481 @default.
- W2397607997 date "2016-01-01" @default.
- W2397607997 modified "2023-10-17" @default.
- W2397607997 title "Analysis of Classification-based Policy Iteration Algorithms" @default.
- W2397607997 cites W107583932 @default.
- W2397607997 cites W1484867920 @default.
- W2397607997 cites W1526641569 @default.
- W2397607997 cites W1564947197 @default.
- W2397607997 cites W1573195610 @default.
- W2397607997 cites W1575592356 @default.
- W2397607997 cites W1600830292 @default.
- W2397607997 cites W1878054055 @default.
- W2397607997 cites W2003603259 @default.
- W2397607997 cites W2010029425 @default.
- W2397607997 cites W2012547817 @default.
- W2397607997 cites W2025263811 @default.
- W2397607997 cites W2028145673 @default.
- W2397607997 cites W2037199950 @default.
- W2397607997 cites W2058239479 @default.
- W2397607997 cites W2072931156 @default.
- W2397607997 cites W2104753538 @default.
- W2397607997 cites W2106191340 @default.
- W2397607997 cites W2117355432 @default.
- W2397607997 cites W2123917165 @default.
- W2397607997 cites W2128547596 @default.
- W2397607997 cites W2128812357 @default.
- W2397607997 cites W2130005627 @default.
- W2397607997 cites W2130599357 @default.
- W2397607997 cites W2133435356 @default.
- W2397607997 cites W2134289401 @default.
- W2397607997 cites W2138663952 @default.
- W2397607997 cites W2142261479 @default.
- W2397607997 cites W2148603752 @default.
- W2397607997 cites W2150821861 @default.
- W2397607997 cites W2165060096 @default.
- W2397607997 cites W2165421048 @default.
- W2397607997 cites W2181692883 @default.
- W2397607997 cites W2952963672 @default.
- W2397607997 cites W3099235411 @default.
- W2397607997 cites W55145144 @default.
- W2397607997 cites W55778226 @default.
- W2397607997 cites W56894658 @default.
- W2397607997 hasPublicationYear "2016" @default.
- W2397607997 type Work @default.
- W2397607997 sameAs 2397607997 @default.
- W2397607997 citedByCount "15" @default.
- W2397607997 countsByYear W23976079972016 @default.
- W2397607997 countsByYear W23976079972017 @default.
- W2397607997 countsByYear W23976079972019 @default.
- W2397607997 countsByYear W23976079972020 @default.
- W2397607997 countsByYear W23976079972021 @default.
- W2397607997 countsByYear W23976079972022 @default.
- W2397607997 crossrefType "journal-article" @default.
- W2397607997 hasAuthorship W2397607997A5006533777 @default.
- W2397607997 hasAuthorship W2397607997A5013843778 @default.
- W2397607997 hasAuthorship W2397607997A5014791481 @default.
- W2397607997 hasBestOaLocation W23976079971 @default.
- W2397607997 hasConcept C11413529 @default.
- W2397607997 hasConcept C41008148 @default.
- W2397607997 hasConceptScore W2397607997C11413529 @default.
- W2397607997 hasConceptScore W2397607997C41008148 @default.
- W2397607997 hasLocation W23976079971 @default.
- W2397607997 hasLocation W23976079972 @default.
- W2397607997 hasLocation W23976079973 @default.
- W2397607997 hasOpenAccess W2397607997 @default.
- W2397607997 hasPrimaryLocation W23976079971 @default.
- W2397607997 hasRelatedWork W2051487156 @default.
- W2397607997 hasRelatedWork W2073681303 @default.
- W2397607997 hasRelatedWork W2317200988 @default.
- W2397607997 hasRelatedWork W2350741829 @default.
- W2397607997 hasRelatedWork W2358668433 @default.
- W2397607997 hasRelatedWork W2376932109 @default.
- W2397607997 hasRelatedWork W2382290278 @default.
- W2397607997 hasRelatedWork W2390279801 @default.
- W2397607997 hasRelatedWork W2748952813 @default.
- W2397607997 hasRelatedWork W2899084033 @default.
- W2397607997 isParatext "false" @default.
- W2397607997 isRetracted "false" @default.
- W2397607997 magId "2397607997" @default.
- W2397607997 workType "article" @default.
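
The abstract above describes a cost-sensitive loss that weights each classification mistake by its regret, that is, the gap between the action-value of the greedy action and that of the action the classifier chose. A minimal sketch of that idea, compared against a plain 0/1 loss, might look like the following. The function names, the array layout, and the use of NumPy are assumptions made for illustration and are not taken from the paper; the action-values are assumed to be given (for example, as rollout estimates).

```python
import numpy as np


def regret_weighted_loss(q_values, chosen_actions):
    """Mean regret of the classifier's actions (hypothetical helper).

    q_values: array of shape (n_states, n_actions) with estimated
        action-values for each sampled state (assumed to be given).
    chosen_actions: array of shape (n_states,) with the action index
        the classifier picked in each state.
    """
    q_values = np.asarray(q_values, dtype=float)
    chosen_actions = np.asarray(chosen_actions, dtype=int)
    # Value of the greedy action in each state.
    greedy_values = q_values.max(axis=1)
    # Value of the action the classifier actually chose in each state.
    chosen_values = q_values[np.arange(len(chosen_actions)), chosen_actions]
    # Regret is zero when the classifier agrees with the greedy action.
    return float(np.mean(greedy_values - chosen_values))


def zero_one_loss(q_values, chosen_actions):
    """Fraction of states where the classifier misses the greedy action."""
    q_values = np.asarray(q_values, dtype=float)
    chosen_actions = np.asarray(chosen_actions, dtype=int)
    greedy_actions = q_values.argmax(axis=1)
    return float(np.mean(greedy_actions != chosen_actions))


# Two states, two actions; the classifier is wrong in both states,
# but the first mistake costs 0.1 while the second costs 2.0.
q = [[1.0, 0.9],
     [0.0, 2.0]]
chosen = [1, 0]
print(regret_weighted_loss(q, chosen))
print(zero_one_loss(q, chosen))
```

In this toy example the 0/1 loss treats both mistakes alike, while the regret-weighted loss is dominated by the costly mistake in the second state, which is the distinction the abstract draws.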