Matches in SemOpenAlex for { <https://semopenalex.org/work/W2953183007> ?p ?o ?g. }
- W2953183007 abstract "To overcome the curse of dimensionality and curse of modeling in Dynamic Programming (DP) methods for solving classical Markov Decision Process (MDP) problems, Reinforcement Learning (RL) algorithms are popular. In this paper, we consider an infinite-horizon average reward MDP problem and prove the optimality of the threshold policy under certain conditions. Traditional RL techniques do not exploit the threshold nature of the optimal policy while learning. In this paper, we propose a new RL algorithm which utilizes the known threshold structure of the optimal policy while learning by reducing the feasible policy space. We establish that the proposed algorithm converges to the optimal policy. It provides a significant improvement in convergence speed and computational and storage complexity over traditional RL algorithms. The proposed technique can be applied to a wide variety of optimization problems that include energy efficient data transmission and management of queues. We exhibit the improvement in convergence speed of the proposed algorithm over other RL algorithms through simulations." @default.
- W2953183007 created "2019-06-27" @default.
- W2953183007 creator A5018541798 @default.
- W2953183007 creator A5018587336 @default.
- W2953183007 creator A5018946124 @default.
- W2953183007 creator A5034853149 @default.
- W2953183007 date "2018-11-28" @default.
- W2953183007 modified "2023-09-27" @default.
- W2953183007 title "A Structure-aware Online Learning Algorithm for Markov Decision Processes" @default.
- W2953183007 cites W1500945877 @default.
- W2953183007 cites W1601081659 @default.
- W2953183007 cites W1796544916 @default.
- W2953183007 cites W1980922866 @default.
- W2953183007 cites W2021441076 @default.
- W2953183007 cites W2070570138 @default.
- W2953183007 cites W2071983464 @default.
- W2953183007 cites W2082261506 @default.
- W2953183007 cites W2102195169 @default.
- W2953183007 cites W2119567691 @default.
- W2953183007 cites W2120465407 @default.
- W2953183007 cites W2124715093 @default.
- W2953183007 cites W2138410336 @default.
- W2953183007 cites W2138717292 @default.
- W2953183007 cites W2154204727 @default.
- W2953183007 cites W2155027007 @default.
- W2953183007 cites W2167641136 @default.
- W2953183007 cites W2312609093 @default.
- W2953183007 cites W2791825310 @default.
- W2953183007 cites W2964273152 @default.
- W2953183007 cites W594357522 @default.
- W2953183007 hasPublicationYear "2018" @default.
- W2953183007 type Work @default.
- W2953183007 sameAs 2953183007 @default.
- W2953183007 citedByCount "0" @default.
- W2953183007 crossrefType "posted-content" @default.
- W2953183007 hasAuthorship W2953183007A5018541798 @default.
- W2953183007 hasAuthorship W2953183007A5018587336 @default.
- W2953183007 hasAuthorship W2953183007A5018946124 @default.
- W2953183007 hasAuthorship W2953183007A5034853149 @default.
- W2953183007 hasConcept C105795698 @default.
- W2953183007 hasConcept C106189395 @default.
- W2953183007 hasConcept C111030470 @default.
- W2953183007 hasConcept C11413529 @default.
- W2953183007 hasConcept C119857082 @default.
- W2953183007 hasConcept C126255220 @default.
- W2953183007 hasConcept C154945302 @default.
- W2953183007 hasConcept C159886148 @default.
- W2953183007 hasConcept C162324750 @default.
- W2953183007 hasConcept C2777303404 @default.
- W2953183007 hasConcept C33923547 @default.
- W2953183007 hasConcept C37404715 @default.
- W2953183007 hasConcept C41008148 @default.
- W2953183007 hasConcept C50522688 @default.
- W2953183007 hasConcept C97541855 @default.
- W2953183007 hasConcept C98763669 @default.
- W2953183007 hasConceptScore W2953183007C105795698 @default.
- W2953183007 hasConceptScore W2953183007C106189395 @default.
- W2953183007 hasConceptScore W2953183007C111030470 @default.
- W2953183007 hasConceptScore W2953183007C11413529 @default.
- W2953183007 hasConceptScore W2953183007C119857082 @default.
- W2953183007 hasConceptScore W2953183007C126255220 @default.
- W2953183007 hasConceptScore W2953183007C154945302 @default.
- W2953183007 hasConceptScore W2953183007C159886148 @default.
- W2953183007 hasConceptScore W2953183007C162324750 @default.
- W2953183007 hasConceptScore W2953183007C2777303404 @default.
- W2953183007 hasConceptScore W2953183007C33923547 @default.
- W2953183007 hasConceptScore W2953183007C37404715 @default.
- W2953183007 hasConceptScore W2953183007C41008148 @default.
- W2953183007 hasConceptScore W2953183007C50522688 @default.
- W2953183007 hasConceptScore W2953183007C97541855 @default.
- W2953183007 hasConceptScore W2953183007C98763669 @default.
- W2953183007 hasLocation W29531830071 @default.
- W2953183007 hasOpenAccess W2953183007 @default.
- W2953183007 hasPrimaryLocation W29531830071 @default.
- W2953183007 hasRelatedWork W1851714595 @default.
- W2953183007 hasRelatedWork W1990538571 @default.
- W2953183007 hasRelatedWork W1999784270 @default.
- W2953183007 hasRelatedWork W2116459397 @default.
- W2953183007 hasRelatedWork W2121816508 @default.
- W2953183007 hasRelatedWork W2137497393 @default.
- W2953183007 hasRelatedWork W2150339816 @default.
- W2953183007 hasRelatedWork W2160067530 @default.
- W2953183007 hasRelatedWork W2276878381 @default.
- W2953183007 hasRelatedWork W2512014291 @default.
- W2953183007 hasRelatedWork W2759520088 @default.
- W2953183007 hasRelatedWork W2902806543 @default.
- W2953183007 hasRelatedWork W2970870329 @default.
- W2953183007 hasRelatedWork W2993527038 @default.
- W2953183007 hasRelatedWork W2996383434 @default.
- W2953183007 hasRelatedWork W3091799185 @default.
- W2953183007 hasRelatedWork W3110191413 @default.
- W2953183007 hasRelatedWork W3154040352 @default.
- W2953183007 hasRelatedWork W3155179702 @default.
- W2953183007 hasRelatedWork W307767029 @default.
- W2953183007 isParatext "false" @default.
- W2953183007 isRetracted "false" @default.
- W2953183007 magId "2953183007" @default.
- W2953183007 workType "article" @default.