Matches in SemOpenAlex for { <https://semopenalex.org/work/W1517383877> ?p ?o ?g. }
- W1517383877 abstract "Transferring knowledge across a sequence of reinforcement-learning tasks is challenging, and has a number of important applications. Though there is encouraging empirical evidence that transfer can improve performance in subsequent reinforcement-learning tasks, there has been very little theoretical analysis. In this paper, we introduce a new multi-task algorithm for a sequence of reinforcement-learning tasks when each task is sampled independently from an (unknown) distribution over a finite set of Markov decision processes whose parameters are initially unknown. For this setting, we prove under certain assumptions that the per-task sample complexity of exploration is reduced significantly due to transfer compared to standard single-task algorithms. Our multi-task algorithm also has the desired characteristic that it is guaranteed not to exhibit negative transfer: in the worst case its per-task sample complexity is comparable to the corresponding single-task algorithm." @default.
- W1517383877 created "2016-06-24" @default.
- W1517383877 creator A5054850777 @default.
- W1517383877 creator A5084989076 @default.
- W1517383877 date "2013-09-26" @default.
- W1517383877 modified "2023-09-26" @default.
- W1517383877 title "Sample Complexity of Multi-task Reinforcement Learning" @default.
- W1517383877 cites W107583932 @default.
- W1517383877 cites W1279312 @default.
- W1517383877 cites W1505937442 @default.
- W1517383877 cites W1511400694 @default.
- W1517383877 cites W1515851193 @default.
- W1517383877 cites W1526654727 @default.
- W1517383877 cites W1587317356 @default.
- W1517383877 cites W167970998 @default.
- W1517383877 cites W1850488217 @default.
- W1517383877 cites W1988526405 @default.
- W1517383877 cites W2020294948 @default.
- W1517383877 cites W2097381042 @default.
- W1517383877 cites W2112899086 @default.
- W1517383877 cites W2116459397 @default.
- W1517383877 cites W2119567691 @default.
- W1517383877 cites W2121863487 @default.
- W1517383877 cites W2123447947 @default.
- W1517383877 cites W2129670787 @default.
- W1517383877 cites W2132057084 @default.
- W1517383877 cites W2142502798 @default.
- W1517383877 cites W2169743339 @default.
- W1517383877 cites W2403765497 @default.
- W1517383877 cites W24272225 @default.
- W1517383877 cites W2488247662 @default.
- W1517383877 cites W2489939061 @default.
- W1517383877 cites W2952973061 @default.
- W1517383877 doi "https://doi.org/10.48550/arxiv.1309.6821" @default.
- W1517383877 hasPublicationYear "2013" @default.
- W1517383877 type Work @default.
- W1517383877 sameAs 1517383877 @default.
- W1517383877 citedByCount "20" @default.
- W1517383877 countsByYear W15173838772014 @default.
- W1517383877 countsByYear W15173838772015 @default.
- W1517383877 countsByYear W15173838772019 @default.
- W1517383877 countsByYear W15173838772020 @default.
- W1517383877 countsByYear W15173838772021 @default.
- W1517383877 crossrefType "posted-content" @default.
- W1517383877 hasAuthorship W1517383877A5054850777 @default.
- W1517383877 hasAuthorship W1517383877A5084989076 @default.
- W1517383877 hasBestOaLocation W15173838771 @default.
- W1517383877 hasConcept C105795698 @default.
- W1517383877 hasConcept C106189395 @default.
- W1517383877 hasConcept C119857082 @default.
- W1517383877 hasConcept C127413603 @default.
- W1517383877 hasConcept C150899416 @default.
- W1517383877 hasConcept C154945302 @default.
- W1517383877 hasConcept C15744967 @default.
- W1517383877 hasConcept C159886148 @default.
- W1517383877 hasConcept C177264268 @default.
- W1517383877 hasConcept C185592680 @default.
- W1517383877 hasConcept C198531522 @default.
- W1517383877 hasConcept C199360897 @default.
- W1517383877 hasConcept C201995342 @default.
- W1517383877 hasConcept C2778112365 @default.
- W1517383877 hasConcept C2778445095 @default.
- W1517383877 hasConcept C2780451532 @default.
- W1517383877 hasConcept C28006648 @default.
- W1517383877 hasConcept C33923547 @default.
- W1517383877 hasConcept C41008148 @default.
- W1517383877 hasConcept C43617362 @default.
- W1517383877 hasConcept C54355233 @default.
- W1517383877 hasConcept C67203356 @default.
- W1517383877 hasConcept C77805123 @default.
- W1517383877 hasConcept C86803240 @default.
- W1517383877 hasConcept C97541855 @default.
- W1517383877 hasConcept C98763669 @default.
- W1517383877 hasConceptScore W1517383877C105795698 @default.
- W1517383877 hasConceptScore W1517383877C106189395 @default.
- W1517383877 hasConceptScore W1517383877C119857082 @default.
- W1517383877 hasConceptScore W1517383877C127413603 @default.
- W1517383877 hasConceptScore W1517383877C150899416 @default.
- W1517383877 hasConceptScore W1517383877C154945302 @default.
- W1517383877 hasConceptScore W1517383877C15744967 @default.
- W1517383877 hasConceptScore W1517383877C159886148 @default.
- W1517383877 hasConceptScore W1517383877C177264268 @default.
- W1517383877 hasConceptScore W1517383877C185592680 @default.
- W1517383877 hasConceptScore W1517383877C198531522 @default.
- W1517383877 hasConceptScore W1517383877C199360897 @default.
- W1517383877 hasConceptScore W1517383877C201995342 @default.
- W1517383877 hasConceptScore W1517383877C2778112365 @default.
- W1517383877 hasConceptScore W1517383877C2778445095 @default.
- W1517383877 hasConceptScore W1517383877C2780451532 @default.
- W1517383877 hasConceptScore W1517383877C28006648 @default.
- W1517383877 hasConceptScore W1517383877C33923547 @default.
- W1517383877 hasConceptScore W1517383877C41008148 @default.
- W1517383877 hasConceptScore W1517383877C43617362 @default.
- W1517383877 hasConceptScore W1517383877C54355233 @default.
- W1517383877 hasConceptScore W1517383877C67203356 @default.
- W1517383877 hasConceptScore W1517383877C77805123 @default.
- W1517383877 hasConceptScore W1517383877C86803240 @default.
- W1517383877 hasConceptScore W1517383877C97541855 @default.
- W1517383877 hasConceptScore W1517383877C98763669 @default.
- W1517383877 hasLocation W15173838771 @default.