Matches in SemOpenAlex for { <https://semopenalex.org/work/W3048815067> ?p ?o ?g. }
Showing items 1 to 90 of 90 with 100 items per page.
- W3048815067 abstract "As a powerful mathematical framework that allows intelligent agents to gradually learn their optimal strategies in unknown dynamic environments, reinforcement learning (RL) has found success in many important applications. Nonetheless, a common stumbling block of RL algorithms is their low learning speed. Although different methods have been developed in the literature to enhance the learning speed when special structure or prior learning experience is available, expediting RL in general settings remains a challenge. Zap Q-learning is a recent breakthrough in this direction, which is shown to be an order of magnitude faster than conventional Q-learning and its cutting-edge variants. Inspired by this exciting result, a novel algorithm, termed Glide and Zap Q-learning (G-Zap Q-learning), is proposed in this work by incorporating a novel gliding step into the learning process. The proposed algorithm is provably convergent to the optimal strategy and can further increase the learning speed of the original Zap Q-learning severalfold. In addition, it is applicable to general Markov decision processes (MDPs) and hence has wide applications. Simulations over both randomly generated MDPs and an exemplary application of privacy-aware task offloading in mobile-edge computing are conducted to validate the effectiveness of the proposed algorithm." @default.
- W3048815067 created "2020-08-18" @default.
- W3048815067 creator A5027155270 @default.
- W3048815067 creator A5070646892 @default.
- W3048815067 creator A5077587039 @default.
- W3048815067 date "2020-07-01" @default.
- W3048815067 modified "2023-09-24" @default.
- W3048815067 title "Glide and Zap Q-Learning" @default.
- W3048815067 cites W2071983464 @default.
- W3048815067 cites W2122311135 @default.
- W3048815067 cites W2130424419 @default.
- W3048815067 cites W2145339207 @default.
- W3048815067 cites W2156288624 @default.
- W3048815067 cites W2164114810 @default.
- W3048815067 cites W2344423009 @default.
- W3048815067 cites W2624989916 @default.
- W3048815067 cites W2783988488 @default.
- W3048815067 cites W2895879715 @default.
- W3048815067 cites W2898035736 @default.
- W3048815067 cites W2962459360 @default.
- W3048815067 cites W2962890638 @default.
- W3048815067 cites W32403112 @default.
- W3048815067 cites W4233696721 @default.
- W3048815067 doi "https://doi.org/10.1109/infocomwkshps50562.2020.9162650" @default.
- W3048815067 hasPublicationYear "2020" @default.
- W3048815067 type Work @default.
- W3048815067 sameAs 3048815067 @default.
- W3048815067 citedByCount "1" @default.
- W3048815067 countsByYear W30488150672023 @default.
- W3048815067 crossrefType "proceedings-article" @default.
- W3048815067 hasAuthorship W3048815067A5027155270 @default.
- W3048815067 hasAuthorship W3048815067A5070646892 @default.
- W3048815067 hasAuthorship W3048815067A5077587039 @default.
- W3048815067 hasConcept C105795698 @default.
- W3048815067 hasConcept C106189395 @default.
- W3048815067 hasConcept C111919701 @default.
- W3048815067 hasConcept C11413529 @default.
- W3048815067 hasConcept C119857082 @default.
- W3048815067 hasConcept C127413603 @default.
- W3048815067 hasConcept C134448949 @default.
- W3048815067 hasConcept C154945302 @default.
- W3048815067 hasConcept C159886148 @default.
- W3048815067 hasConcept C162324750 @default.
- W3048815067 hasConcept C187736073 @default.
- W3048815067 hasConcept C188116033 @default.
- W3048815067 hasConcept C201995342 @default.
- W3048815067 hasConcept C2524010 @default.
- W3048815067 hasConcept C2777210771 @default.
- W3048815067 hasConcept C2780451532 @default.
- W3048815067 hasConcept C33923547 @default.
- W3048815067 hasConcept C41008148 @default.
- W3048815067 hasConcept C97541855 @default.
- W3048815067 hasConcept C98045186 @default.
- W3048815067 hasConceptScore W3048815067C105795698 @default.
- W3048815067 hasConceptScore W3048815067C106189395 @default.
- W3048815067 hasConceptScore W3048815067C111919701 @default.
- W3048815067 hasConceptScore W3048815067C11413529 @default.
- W3048815067 hasConceptScore W3048815067C119857082 @default.
- W3048815067 hasConceptScore W3048815067C127413603 @default.
- W3048815067 hasConceptScore W3048815067C134448949 @default.
- W3048815067 hasConceptScore W3048815067C154945302 @default.
- W3048815067 hasConceptScore W3048815067C159886148 @default.
- W3048815067 hasConceptScore W3048815067C162324750 @default.
- W3048815067 hasConceptScore W3048815067C187736073 @default.
- W3048815067 hasConceptScore W3048815067C188116033 @default.
- W3048815067 hasConceptScore W3048815067C201995342 @default.
- W3048815067 hasConceptScore W3048815067C2524010 @default.
- W3048815067 hasConceptScore W3048815067C2777210771 @default.
- W3048815067 hasConceptScore W3048815067C2780451532 @default.
- W3048815067 hasConceptScore W3048815067C33923547 @default.
- W3048815067 hasConceptScore W3048815067C41008148 @default.
- W3048815067 hasConceptScore W3048815067C97541855 @default.
- W3048815067 hasConceptScore W3048815067C98045186 @default.
- W3048815067 hasLocation W30488150671 @default.
- W3048815067 hasOpenAccess W3048815067 @default.
- W3048815067 hasPrimaryLocation W30488150671 @default.
- W3048815067 hasRelatedWork W1511927616 @default.
- W3048815067 hasRelatedWork W1556532828 @default.
- W3048815067 hasRelatedWork W2146763310 @default.
- W3048815067 hasRelatedWork W2182304831 @default.
- W3048815067 hasRelatedWork W2808418668 @default.
- W3048815067 hasRelatedWork W2937181779 @default.
- W3048815067 hasRelatedWork W3048815067 @default.
- W3048815067 hasRelatedWork W3096874164 @default.
- W3048815067 hasRelatedWork W3167472281 @default.
- W3048815067 hasRelatedWork W4297095626 @default.
- W3048815067 isParatext "false" @default.
- W3048815067 isRetracted "false" @default.
- W3048815067 magId "3048815067" @default.
- W3048815067 workType "article" @default.