Matches in SemOpenAlex for { <https://semopenalex.org/work/W3100821630> ?p ?o ?g. }
- W3100821630 abstract "Dynamic programming principle (DPP) is fundamental for control and optimization, including Markov decision problems (MDPs), reinforcement learning (RL), and more recently mean-field controls (MFCs). However, in the learning framework of MFCs, DPP has not been rigorously established, despite its critical importance for algorithm designs. In this paper, we first present a simple example in MFCs with learning where DPP fails with a mis-specified Q function; and then propose the correct form of Q function in an appropriate space for MFCs with learning. This particular form of Q function is different from the classical one and is called the IQ function. In the special case when the transition probability and the reward are independent of the mean-field information, it integrates the classical Q function for single-agent RL over the state-action distribution. In other words, MFCs with learning can be viewed as lifting the classical RLs by replacing the state-action space with its probability distribution space. This identification of the IQ function enables us to establish precisely the DPP in the learning framework of MFCs. Finally, we illustrate through numerical experiments the time consistency of this IQ function." @default.
- W3100821630 created "2020-11-23" @default.
- W3100821630 creator A5007953835 @default.
- W3100821630 creator A5047850847 @default.
- W3100821630 creator A5068576165 @default.
- W3100821630 creator A5085674806 @default.
- W3100821630 date "2019-11-17" @default.
- W3100821630 modified "2023-10-17" @default.
- W3100821630 title "Dynamic Programming Principles for Mean-Field Controls with Learning" @default.
- W3100821630 cites W2017903822 @default.
- W3100821630 cites W2037152246 @default.
- W3100821630 cites W2038398071 @default.
- W3100821630 cites W2045400766 @default.
- W3100821630 cites W2076337359 @default.
- W3100821630 cites W2088595989 @default.
- W3100821630 cites W2113501460 @default.
- W3100821630 cites W2118686230 @default.
- W3100821630 cites W2145339207 @default.
- W3100821630 cites W2156060009 @default.
- W3100821630 cites W2156737235 @default.
- W3100821630 cites W2168342951 @default.
- W3100821630 cites W2173248099 @default.
- W3100821630 cites W2707993456 @default.
- W3100821630 cites W2788125442 @default.
- W3100821630 cites W2791784110 @default.
- W3100821630 cites W2914154006 @default.
- W3100821630 cites W2915917668 @default.
- W3100821630 cites W2963164374 @default.
- W3100821630 cites W2963617451 @default.
- W3100821630 cites W2964078161 @default.
- W3100821630 cites W2970875146 @default.
- W3100821630 cites W2979330446 @default.
- W3100821630 cites W2982138249 @default.
- W3100821630 cites W2990667616 @default.
- W3100821630 cites W2995725774 @default.
- W3100821630 cites W3011120880 @default.
- W3100821630 cites W3093413390 @default.
- W3100821630 cites W586490843 @default.
- W3100821630 doi "https://doi.org/10.48550/arxiv.1911.07314" @default.
- W3100821630 hasPublicationYear "2019" @default.
- W3100821630 type Work @default.
- W3100821630 sameAs 3100821630 @default.
- W3100821630 citedByCount "10" @default.
- W3100821630 countsByYear W31008216302019 @default.
- W3100821630 countsByYear W31008216302020 @default.
- W3100821630 countsByYear W31008216302021 @default.
- W3100821630 countsByYear W31008216302023 @default.
- W3100821630 crossrefType "posted-content" @default.
- W3100821630 hasAuthorship W3100821630A5007953835 @default.
- W3100821630 hasAuthorship W3100821630A5047850847 @default.
- W3100821630 hasAuthorship W3100821630A5068576165 @default.
- W3100821630 hasAuthorship W3100821630A5085674806 @default.
- W3100821630 hasBestOaLocation W31008216301 @default.
- W3100821630 hasConcept C103784038 @default.
- W3100821630 hasConcept C105795698 @default.
- W3100821630 hasConcept C106189395 @default.
- W3100821630 hasConcept C111919701 @default.
- W3100821630 hasConcept C11413529 @default.
- W3100821630 hasConcept C121332964 @default.
- W3100821630 hasConcept C126255220 @default.
- W3100821630 hasConcept C14036430 @default.
- W3100821630 hasConcept C14646407 @default.
- W3100821630 hasConcept C149441793 @default.
- W3100821630 hasConcept C154945302 @default.
- W3100821630 hasConcept C159886148 @default.
- W3100821630 hasConcept C197055811 @default.
- W3100821630 hasConcept C202444582 @default.
- W3100821630 hasConcept C2776436953 @default.
- W3100821630 hasConcept C2778572836 @default.
- W3100821630 hasConcept C2780791683 @default.
- W3100821630 hasConcept C33923547 @default.
- W3100821630 hasConcept C37404715 @default.
- W3100821630 hasConcept C41008148 @default.
- W3100821630 hasConcept C48103436 @default.
- W3100821630 hasConcept C50644808 @default.
- W3100821630 hasConcept C55842286 @default.
- W3100821630 hasConcept C62520636 @default.
- W3100821630 hasConcept C72434380 @default.
- W3100821630 hasConcept C78458016 @default.
- W3100821630 hasConcept C86803240 @default.
- W3100821630 hasConcept C91873725 @default.
- W3100821630 hasConcept C9652623 @default.
- W3100821630 hasConcept C97541855 @default.
- W3100821630 hasConceptScore W3100821630C103784038 @default.
- W3100821630 hasConceptScore W3100821630C105795698 @default.
- W3100821630 hasConceptScore W3100821630C106189395 @default.
- W3100821630 hasConceptScore W3100821630C111919701 @default.
- W3100821630 hasConceptScore W3100821630C11413529 @default.
- W3100821630 hasConceptScore W3100821630C121332964 @default.
- W3100821630 hasConceptScore W3100821630C126255220 @default.
- W3100821630 hasConceptScore W3100821630C14036430 @default.
- W3100821630 hasConceptScore W3100821630C14646407 @default.
- W3100821630 hasConceptScore W3100821630C149441793 @default.
- W3100821630 hasConceptScore W3100821630C154945302 @default.
- W3100821630 hasConceptScore W3100821630C159886148 @default.
- W3100821630 hasConceptScore W3100821630C197055811 @default.
- W3100821630 hasConceptScore W3100821630C202444582 @default.
- W3100821630 hasConceptScore W3100821630C2776436953 @default.
- W3100821630 hasConceptScore W3100821630C2778572836 @default.
- W3100821630 hasConceptScore W3100821630C2780791683 @default.