Matches in SemOpenAlex for { <https://semopenalex.org/work/W2949677831> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W2949677831 abstract "In some reinforcement learning problems an agent may be provided with a set of input policies, perhaps learned from prior experience or provided by advisors. We present a reinforcement learning with policy advice (RLPA) algorithm which leverages this input set and learns to use the best policy in the set for the reinforcement learning task at hand. We prove that RLPA has a sub-linear regret of Õ(√T) relative to the best input policy, and that both this regret and its computational complexity are independent of the size of the state and action space. Our empirical simulations support our theoretical analysis. This suggests RLPA may offer significant advantages in large domains where some prior good policies are provided." @default.
- W2949677831 created "2019-06-27" @default.
- W2949677831 creator A5014791481 @default.
- W2949677831 creator A5043355670 @default.
- W2949677831 creator A5084989076 @default.
- W2949677831 date "2013-09-01" @default.
- W2949677831 modified "2023-09-27" @default.
- W2949677831 title "Regret Bounds for Reinforcement Learning with Policy Advice" @default.
- W2949677831 cites W1279312 @default.
- W2949677831 cites W1662803991 @default.
- W2949677831 cites W167970998 @default.
- W2949677831 cites W1850488217 @default.
- W2949677831 cites W1979675141 @default.
- W2949677831 cites W2031727428 @default.
- W2949677831 cites W203338875 @default.
- W2949677831 cites W2112899086 @default.
- W2949677831 cites W2116459397 @default.
- W2949677831 cites W2119567691 @default.
- W2949677831 cites W2121863487 @default.
- W2949677831 cites W2137125320 @default.
- W2949677831 cites W2952099252 @default.
- W2949677831 cites W3102381603 @default.
- W2949677831 hasPublicationYear "2013" @default.
- W2949677831 type Work @default.
- W2949677831 sameAs 2949677831 @default.
- W2949677831 citedByCount "15" @default.
- W2949677831 countsByYear W29496778312013 @default.
- W2949677831 countsByYear W29496778312014 @default.
- W2949677831 countsByYear W29496778312015 @default.
- W2949677831 countsByYear W29496778312016 @default.
- W2949677831 countsByYear W29496778312017 @default.
- W2949677831 countsByYear W29496778312018 @default.
- W2949677831 crossrefType "proceedings-article" @default.
- W2949677831 hasAuthorship W2949677831A5014791481 @default.
- W2949677831 hasAuthorship W2949677831A5043355670 @default.
- W2949677831 hasAuthorship W2949677831A5084989076 @default.
- W2949677831 hasBestOaLocation W29496778311 @default.
- W2949677831 hasConcept C119857082 @default.
- W2949677831 hasConcept C154945302 @default.
- W2949677831 hasConcept C15744967 @default.
- W2949677831 hasConcept C199360897 @default.
- W2949677831 hasConcept C2779955035 @default.
- W2949677831 hasConcept C41008148 @default.
- W2949677831 hasConcept C50817715 @default.
- W2949677831 hasConcept C67203356 @default.
- W2949677831 hasConcept C77805123 @default.
- W2949677831 hasConcept C97541855 @default.
- W2949677831 hasConceptScore W2949677831C119857082 @default.
- W2949677831 hasConceptScore W2949677831C154945302 @default.
- W2949677831 hasConceptScore W2949677831C15744967 @default.
- W2949677831 hasConceptScore W2949677831C199360897 @default.
- W2949677831 hasConceptScore W2949677831C2779955035 @default.
- W2949677831 hasConceptScore W2949677831C41008148 @default.
- W2949677831 hasConceptScore W2949677831C50817715 @default.
- W2949677831 hasConceptScore W2949677831C67203356 @default.
- W2949677831 hasConceptScore W2949677831C77805123 @default.
- W2949677831 hasConceptScore W2949677831C97541855 @default.
- W2949677831 hasLocation W29496778311 @default.
- W2949677831 hasLocation W29496778312 @default.
- W2949677831 hasLocation W29496778313 @default.
- W2949677831 hasLocation W29496778314 @default.
- W2949677831 hasOpenAccess W2949677831 @default.
- W2949677831 hasPrimaryLocation W29496778311 @default.
- W2949677831 hasRelatedWork W1788769502 @default.
- W2949677831 hasRelatedWork W1925875298 @default.
- W2949677831 hasRelatedWork W2002805310 @default.
- W2949677831 hasRelatedWork W2132908009 @default.
- W2949677831 hasRelatedWork W2558906668 @default.
- W2949677831 hasRelatedWork W2945119207 @default.
- W2949677831 hasRelatedWork W4284890489 @default.
- W2949677831 hasRelatedWork W4285324069 @default.
- W2949677831 hasRelatedWork W4292701710 @default.
- W2949677831 hasRelatedWork W4296078469 @default.
- W2949677831 isParatext "false" @default.
- W2949677831 isRetracted "false" @default.
- W2949677831 magId "2949677831" @default.
- W2949677831 workType "article" @default.