Matches in SemOpenAlex for { <https://semopenalex.org/work/W2890926617> ?p ?o ?g. }
Showing items 1 to 90 of 90 with 100 items per page.
- W2890926617 abstract "The objective is to study an on-line Hidden Markov model (HMM) estimation-based Q-learning algorithm for partially observable Markov decision process (POMDP) on finite state and action sets. When the full state observation is available, Q-learning finds the optimal action-value function given the current action (Q function). However, Q-learning can perform poorly when the full state observation is not available. In this paper, we formulate the POMDP estimation as an HMM estimation problem and propose a recursive algorithm to estimate both the POMDP parameter and Q function concurrently. Also, we show that the POMDP estimation converges to a set of stationary points for the maximum likelihood estimate, and the Q function estimation converges to a fixed point that satisfies the Bellman optimality equation weighted on the invariant distribution of the state belief determined by the HMM estimation process." @default.
- W2890926617 created "2018-09-27" @default.
- W2890926617 creator A5027028606 @default.
- W2890926617 creator A5029998748 @default.
- W2890926617 creator A5080473137 @default.
- W2890926617 date "2018-09-17" @default.
- W2890926617 modified "2023-10-18" @default.
- W2890926617 title "Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process" @default.
- W2890926617 cites W1499021337 @default.
- W2890926617 cites W1539216098 @default.
- W2890926617 cites W1541084404 @default.
- W2890926617 cites W1657674574 @default.
- W2890926617 cites W1983016559 @default.
- W2890926617 cites W2065087844 @default.
- W2890926617 cites W2124458906 @default.
- W2890926617 cites W2144794447 @default.
- W2890926617 cites W2145339207 @default.
- W2890926617 cites W2291973609 @default.
- W2890926617 cites W2411359688 @default.
- W2890926617 cites W2798766386 @default.
- W2890926617 cites W2962893898 @default.
- W2890926617 cites W2962938178 @default.
- W2890926617 cites W2962956149 @default.
- W2890926617 hasPublicationYear "2018" @default.
- W2890926617 type Work @default.
- W2890926617 sameAs 2890926617 @default.
- W2890926617 citedByCount "0" @default.
- W2890926617 crossrefType "posted-content" @default.
- W2890926617 hasAuthorship W2890926617A5027028606 @default.
- W2890926617 hasAuthorship W2890926617A5029998748 @default.
- W2890926617 hasAuthorship W2890926617A5080473137 @default.
- W2890926617 hasConcept C105795698 @default.
- W2890926617 hasConcept C106189395 @default.
- W2890926617 hasConcept C119857082 @default.
- W2890926617 hasConcept C121332964 @default.
- W2890926617 hasConcept C126255220 @default.
- W2890926617 hasConcept C154945302 @default.
- W2890926617 hasConcept C159886148 @default.
- W2890926617 hasConcept C163540672 @default.
- W2890926617 hasConcept C163836022 @default.
- W2890926617 hasConcept C17098449 @default.
- W2890926617 hasConcept C23224414 @default.
- W2890926617 hasConcept C32848918 @default.
- W2890926617 hasConcept C33923547 @default.
- W2890926617 hasConcept C41008148 @default.
- W2890926617 hasConcept C54907487 @default.
- W2890926617 hasConcept C62520636 @default.
- W2890926617 hasConcept C98763669 @default.
- W2890926617 hasConceptScore W2890926617C105795698 @default.
- W2890926617 hasConceptScore W2890926617C106189395 @default.
- W2890926617 hasConceptScore W2890926617C119857082 @default.
- W2890926617 hasConceptScore W2890926617C121332964 @default.
- W2890926617 hasConceptScore W2890926617C126255220 @default.
- W2890926617 hasConceptScore W2890926617C154945302 @default.
- W2890926617 hasConceptScore W2890926617C159886148 @default.
- W2890926617 hasConceptScore W2890926617C163540672 @default.
- W2890926617 hasConceptScore W2890926617C163836022 @default.
- W2890926617 hasConceptScore W2890926617C17098449 @default.
- W2890926617 hasConceptScore W2890926617C23224414 @default.
- W2890926617 hasConceptScore W2890926617C32848918 @default.
- W2890926617 hasConceptScore W2890926617C33923547 @default.
- W2890926617 hasConceptScore W2890926617C41008148 @default.
- W2890926617 hasConceptScore W2890926617C54907487 @default.
- W2890926617 hasConceptScore W2890926617C62520636 @default.
- W2890926617 hasConceptScore W2890926617C98763669 @default.
- W2890926617 hasOpenAccess W2890926617 @default.
- W2890926617 hasRelatedWork W1413305739 @default.
- W2890926617 hasRelatedWork W1607059475 @default.
- W2890926617 hasRelatedWork W1675167203 @default.
- W2890926617 hasRelatedWork W1965564958 @default.
- W2890926617 hasRelatedWork W1989197122 @default.
- W2890926617 hasRelatedWork W1999862258 @default.
- W2890926617 hasRelatedWork W2020246581 @default.
- W2890926617 hasRelatedWork W2027976015 @default.
- W2890926617 hasRelatedWork W2091359055 @default.
- W2890926617 hasRelatedWork W2101789897 @default.
- W2890926617 hasRelatedWork W2108779650 @default.
- W2890926617 hasRelatedWork W2115795564 @default.
- W2890926617 hasRelatedWork W2139302369 @default.
- W2890926617 hasRelatedWork W2152431402 @default.
- W2890926617 hasRelatedWork W2154625153 @default.
- W2890926617 hasRelatedWork W2160019187 @default.
- W2890926617 hasRelatedWork W2529059936 @default.
- W2890926617 hasRelatedWork W2593146132 @default.
- W2890926617 hasRelatedWork W2791752131 @default.
- W2890926617 hasRelatedWork W2973214749 @default.
- W2890926617 isParatext "false" @default.
- W2890926617 isRetracted "false" @default.
- W2890926617 magId "2890926617" @default.
- W2890926617 workType "article" @default.
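The listing above follows a regular "- subject predicate object @graph." shape, so it can be loaded into a small lookup table for inspection. The following is a minimal sketch under that assumption only; the names `LINE_RE`, `parse_line`, and `group_by_predicate` are hypothetical helpers introduced for illustration and are not part of SemOpenAlex or any library, and the regular expression is inferred from the lines shown here rather than from a published grammar.

```python
import re
from collections import defaultdict

# Assumed shape of one listing line: "- <subject> <predicate> <object> @<graph>."
# The object may be a quoted literal (possibly with spaces) or a bare identifier.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Strip the surrounding quotes from literals such as "2018-09-17".
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect the objects of every parsed line under their predicate name."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W2890926617 date "2018-09-17" @default.',
    '- W2890926617 cites W1499021337 @default.',
    '- W2890926617 cites W1539216098 @default.',
    '- W2890926617 type Work @default.',
]
grouped = group_by_predicate(sample)
print(grouped["cites"])
```

Lines that do not fit the assumed shape, such as the pagination header, are skipped rather than raising an error, which keeps the sketch tolerant of page residue.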