Matches in SemOpenAlex for { <https://semopenalex.org/work/W3172841285> ?p ?o ?g. }
Showing items 1 to 91 of 91 with 100 items per page.
- W3172841285 abstract "We introduce a mapping between Maximum Entropy Reinforcement Learning (MaxEnt RL) and Markovian processes conditioned on rare events. In the long time limit, this mapping allows us to derive analytical expressions for the optimal policy, dynamics and initial state distributions for the general case of stochastic dynamics in MaxEnt RL. We find that soft-$\mathcal{Q}$ functions in MaxEnt RL can be obtained from the Perron-Frobenius eigenvalue and the corresponding left eigenvector of a regular, non-negative matrix derived from the underlying Markov Decision Process (MDP). The results derived lead to novel algorithms for model-based and model-free MaxEnt RL, which we validate by numerical simulations. The mapping established in this work opens further avenues for the application of novel analytical and computational approaches to problems in MaxEnt RL. We make our code available at: this https URL" @default.
- W3172841285 created "2021-06-22" @default.
- W3172841285 creator A5046771851 @default.
- W3172841285 creator A5048266570 @default.
- W3172841285 creator A5054837175 @default.
- W3172841285 date "2021-06-07" @default.
- W3172841285 modified "2023-09-27" @default.
- W3172841285 title "Closed-Form Analytical Results for Maximum Entropy Reinforcement Learning." @default.
- W3172841285 cites W1615577543 @default.
- W3172841285 cites W2145060720 @default.
- W3172841285 hasPublicationYear "2021" @default.
- W3172841285 type Work @default.
- W3172841285 sameAs 3172841285 @default.
- W3172841285 citedByCount "0" @default.
- W3172841285 crossrefType "posted-content" @default.
- W3172841285 hasAuthorship W3172841285A5046771851 @default.
- W3172841285 hasAuthorship W3172841285A5048266570 @default.
- W3172841285 hasAuthorship W3172841285A5054837175 @default.
- W3172841285 hasConcept C105795698 @default.
- W3172841285 hasConcept C106189395 @default.
- W3172841285 hasConcept C106301342 @default.
- W3172841285 hasConcept C11413529 @default.
- W3172841285 hasConcept C119857082 @default.
- W3172841285 hasConcept C121332964 @default.
- W3172841285 hasConcept C121864883 @default.
- W3172841285 hasConcept C126255220 @default.
- W3172841285 hasConcept C134306372 @default.
- W3172841285 hasConcept C151201525 @default.
- W3172841285 hasConcept C154945302 @default.
- W3172841285 hasConcept C158693339 @default.
- W3172841285 hasConcept C159886148 @default.
- W3172841285 hasConcept C177264268 @default.
- W3172841285 hasConcept C199360897 @default.
- W3172841285 hasConcept C2776760102 @default.
- W3172841285 hasConcept C28826006 @default.
- W3172841285 hasConcept C33923547 @default.
- W3172841285 hasConcept C41008148 @default.
- W3172841285 hasConcept C62520636 @default.
- W3172841285 hasConcept C9679016 @default.
- W3172841285 hasConcept C97541855 @default.
- W3172841285 hasConcept C98763669 @default.
- W3172841285 hasConceptScore W3172841285C105795698 @default.
- W3172841285 hasConceptScore W3172841285C106189395 @default.
- W3172841285 hasConceptScore W3172841285C106301342 @default.
- W3172841285 hasConceptScore W3172841285C11413529 @default.
- W3172841285 hasConceptScore W3172841285C119857082 @default.
- W3172841285 hasConceptScore W3172841285C121332964 @default.
- W3172841285 hasConceptScore W3172841285C121864883 @default.
- W3172841285 hasConceptScore W3172841285C126255220 @default.
- W3172841285 hasConceptScore W3172841285C134306372 @default.
- W3172841285 hasConceptScore W3172841285C151201525 @default.
- W3172841285 hasConceptScore W3172841285C154945302 @default.
- W3172841285 hasConceptScore W3172841285C158693339 @default.
- W3172841285 hasConceptScore W3172841285C159886148 @default.
- W3172841285 hasConceptScore W3172841285C177264268 @default.
- W3172841285 hasConceptScore W3172841285C199360897 @default.
- W3172841285 hasConceptScore W3172841285C2776760102 @default.
- W3172841285 hasConceptScore W3172841285C28826006 @default.
- W3172841285 hasConceptScore W3172841285C33923547 @default.
- W3172841285 hasConceptScore W3172841285C41008148 @default.
- W3172841285 hasConceptScore W3172841285C62520636 @default.
- W3172841285 hasConceptScore W3172841285C9679016 @default.
- W3172841285 hasConceptScore W3172841285C97541855 @default.
- W3172841285 hasConceptScore W3172841285C98763669 @default.
- W3172841285 hasLocation W31728412851 @default.
- W3172841285 hasOpenAccess W3172841285 @default.
- W3172841285 hasPrimaryLocation W31728412851 @default.
- W3172841285 hasRelatedWork W176758818 @default.
- W3172841285 hasRelatedWork W2093124207 @default.
- W3172841285 hasRelatedWork W2117804315 @default.
- W3172841285 hasRelatedWork W2142162192 @default.
- W3172841285 hasRelatedWork W2154798771 @default.
- W3172841285 hasRelatedWork W2259458763 @default.
- W3172841285 hasRelatedWork W2462392705 @default.
- W3172841285 hasRelatedWork W2468208319 @default.
- W3172841285 hasRelatedWork W2735595928 @default.
- W3172841285 hasRelatedWork W2755453370 @default.
- W3172841285 hasRelatedWork W2788076844 @default.
- W3172841285 hasRelatedWork W2940817393 @default.
- W3172841285 hasRelatedWork W2963451883 @default.
- W3172841285 hasRelatedWork W2963776266 @default.
- W3172841285 hasRelatedWork W2970068706 @default.
- W3172841285 hasRelatedWork W3012988968 @default.
- W3172841285 hasRelatedWork W3022854535 @default.
- W3172841285 hasRelatedWork W3132578471 @default.
- W3172841285 hasRelatedWork W3133569444 @default.
- W3172841285 hasRelatedWork W3206976204 @default.
- W3172841285 isParatext "false" @default.
- W3172841285 isRetracted "false" @default.
- W3172841285 magId "3172841285" @default.
- W3172841285 workType "article" @default.