Matches in SemOpenAlex for { <https://semopenalex.org/work/W2947232011> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W2947232011 abstract "We take initial steps in studying PAC-MDP algorithms with limited adaptivity, that is, algorithms that change their exploration policy as infrequently as possible during regret minimization. This is motivated by the difficulty of running fully adaptive algorithms in real-world applications (such as medical domains), and we propose to quantify adaptivity using the notion of local switching cost. Our main contribution, Q-Learning with UCB2 exploration, is a model-free algorithm for H-step episodic MDP that achieves sublinear regret whose local switching cost in K episodes is $O(H^3 SA \log K)$, and we provide a lower bound of $\Omega(HSA)$ on the local switching cost for any no-regret algorithm. Our algorithm can be naturally adapted to the concurrent setting, which yields nontrivial results that improve upon prior work in certain aspects." @default.
- W2947232011 created "2019-06-07" @default.
- W2947232011 creator A5008181744 @default.
- W2947232011 creator A5048050496 @default.
- W2947232011 creator A5051934623 @default.
- W2947232011 creator A5061481788 @default.
- W2947232011 date "2019-05-30" @default.
- W2947232011 modified "2023-09-27" @default.
- W2947232011 title "Provably Efficient Q-Learning with Low Switching Cost." @default.
- W2947232011 cites W1958090791 @default.
- W2947232011 cites W2019384544 @default.
- W2947232011 cites W2168405694 @default.
- W2947232011 cites W2214971211 @default.
- W2947232011 cites W2275802500 @default.
- W2947232011 cites W2347129741 @default.
- W2947232011 cites W2489939061 @default.
- W2947232011 cites W2769648743 @default.
- W2947232011 cites W2803543472 @default.
- W2947232011 cites W2885801653 @default.
- W2947232011 cites W2899637793 @default.
- W2947232011 cites W2928820257 @default.
- W2947232011 cites W2944009172 @default.
- W2947232011 cites W2950791045 @default.
- W2947232011 cites W2952926545 @default.
- W2947232011 cites W2970927156 @default.
- W2947232011 hasPublicationYear "2019" @default.
- W2947232011 type Work @default.
- W2947232011 sameAs 2947232011 @default.
- W2947232011 citedByCount "3" @default.
- W2947232011 countsByYear W29472320112019 @default.
- W2947232011 countsByYear W29472320112021 @default.
- W2947232011 crossrefType "posted-content" @default.
- W2947232011 hasAuthorship W2947232011A5008181744 @default.
- W2947232011 hasAuthorship W2947232011A5048050496 @default.
- W2947232011 hasAuthorship W2947232011A5051934623 @default.
- W2947232011 hasAuthorship W2947232011A5061481788 @default.
- W2947232011 hasConcept C11413529 @default.
- W2947232011 hasConcept C117160843 @default.
- W2947232011 hasConcept C118615104 @default.
- W2947232011 hasConcept C119857082 @default.
- W2947232011 hasConcept C126255220 @default.
- W2947232011 hasConcept C134306372 @default.
- W2947232011 hasConcept C33923547 @default.
- W2947232011 hasConcept C41008148 @default.
- W2947232011 hasConcept C50817715 @default.
- W2947232011 hasConcept C77553402 @default.
- W2947232011 hasConceptScore W2947232011C11413529 @default.
- W2947232011 hasConceptScore W2947232011C117160843 @default.
- W2947232011 hasConceptScore W2947232011C118615104 @default.
- W2947232011 hasConceptScore W2947232011C119857082 @default.
- W2947232011 hasConceptScore W2947232011C126255220 @default.
- W2947232011 hasConceptScore W2947232011C134306372 @default.
- W2947232011 hasConceptScore W2947232011C33923547 @default.
- W2947232011 hasConceptScore W2947232011C41008148 @default.
- W2947232011 hasConceptScore W2947232011C50817715 @default.
- W2947232011 hasConceptScore W2947232011C77553402 @default.
- W2947232011 hasLocation W29472320111 @default.
- W2947232011 hasOpenAccess W2947232011 @default.
- W2947232011 hasPrimaryLocation W29472320111 @default.
- W2947232011 hasRelatedWork W2203325308 @default.
- W2947232011 hasRelatedWork W2850875954 @default.
- W2947232011 hasRelatedWork W2915158376 @default.
- W2947232011 hasRelatedWork W2928649327 @default.
- W2947232011 hasRelatedWork W2954525165 @default.
- W2947232011 hasRelatedWork W2963094735 @default.
- W2947232011 hasRelatedWork W2963475649 @default.
- W2947232011 hasRelatedWork W2970073609 @default.
- W2947232011 hasRelatedWork W2982589023 @default.
- W2947232011 hasRelatedWork W2985590422 @default.
- W2947232011 hasRelatedWork W3034707348 @default.
- W2947232011 hasRelatedWork W3104156937 @default.
- W2947232011 hasRelatedWork W3110181803 @default.
- W2947232011 hasRelatedWork W3131973708 @default.
- W2947232011 hasRelatedWork W3173673246 @default.
- W2947232011 hasRelatedWork W3175599512 @default.
- W2947232011 hasRelatedWork W3187326931 @default.
- W2947232011 hasRelatedWork W3206149081 @default.
- W2947232011 hasRelatedWork W3210044325 @default.
- W2947232011 hasRelatedWork W2161853425 @default.
- W2947232011 isParatext "false" @default.
- W2947232011 isRetracted "false" @default.
- W2947232011 magId "2947232011" @default.
- W2947232011 workType "article" @default.
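
The abstract above quantifies adaptivity through a "local switching cost" but does not define it in this record. As a rough illustration only, the sketch below assumes the cost counts, over consecutive episodes, the number of (step, state) pairs at which a deterministic policy changes its action. The `Policy` type, the function name `local_switching_cost`, and the toy policies are hypothetical and are not taken from the listing.

```python
from typing import Dict, List, Tuple

# Assumption: a deterministic policy maps (step h, state s) -> action a.
Policy = Dict[Tuple[int, int], int]


def local_switching_cost(policies: List[Policy]) -> int:
    """Count (h, s) pairs whose action differs between consecutive episodes."""
    cost = 0
    for prev, curr in zip(policies, policies[1:]):
        # Compare over the union of keys so a missing entry counts as a change.
        keys = prev.keys() | curr.keys()
        cost += sum(1 for key in keys if prev.get(key) != curr.get(key))
    return cost


# Hypothetical toy instance: H = 2 steps, S = 2 states, K = 3 episodes.
pi_1 = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
pi_2 = {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 0}  # one entry differs from pi_1
pi_3 = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 0}  # two entries differ from pi_2
toy_cost = local_switching_cost([pi_1, pi_2, pi_3])
```

Under this assumed definition, an algorithm that updates its policy rarely keeps the count small, which is the property the abstract's $O(H^3 SA \log K)$ bound describes.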