Matches in SemOpenAlex for { <https://semopenalex.org/work/W3035337758> ?p ?o ?g. }
- W3035337758 abstract "Model-based reinforcement learning (MBRL) has shown advantages in sample efficiency over model-free reinforcement learning (MFRL). Despite its impressive results, it still faces a trade-off between the ease of data generation and model bias. In this paper, we propose a simple and elegant model-embedding model-based reinforcement learning (MEMB) algorithm in the framework of probabilistic reinforcement learning. To balance sample efficiency and model bias, we exploit both real and imaginary data during training. In particular, we embed the model in the policy update and learn the $Q$ and $V$ functions from the real data set. We provide a theoretical analysis of MEMB under a Lipschitz continuity assumption on the model and policy. Finally, we evaluate MEMB on several benchmarks and demonstrate that our algorithm can achieve state-of-the-art performance." @default.
- W3035337758 created "2020-06-19" @default.
- W3035337758 creator A5021671895 @default.
- W3035337758 creator A5025995347 @default.
- W3035337758 creator A5031646561 @default.
- W3035337758 creator A5083934553 @default.
- W3035337758 date "2020-06-16" @default.
- W3035337758 modified "2023-09-27" @default.
- W3035337758 title "Model Embedding Model-Based Reinforcement Learning." @default.
- W3035337758 cites W1491843047 @default.
- W3035337758 cites W1570233100 @default.
- W3035337758 cites W1771410628 @default.
- W3035337758 cites W2087617385 @default.
- W3035337758 cites W2098774185 @default.
- W3035337758 cites W2104733512 @default.
- W3035337758 cites W2121103318 @default.
- W3035337758 cites W2140135625 @default.
- W3035337758 cites W2145339207 @default.
- W3035337758 cites W2173248099 @default.
- W3035337758 cites W2594103415 @default.
- W3035337758 cites W2774354230 @default.
- W3035337758 cites W2781726626 @default.
- W3035337758 cites W2785389871 @default.
- W3035337758 cites W2787938642 @default.
- W3035337758 cites W2799151646 @default.
- W3035337758 cites W2859967432 @default.
- W3035337758 cites W2892230114 @default.
- W3035337758 cites W2949608212 @default.
- W3035337758 cites W2953708620 @default.
- W3035337758 cites W2962872206 @default.
- W3035337758 cites W2962879692 @default.
- W3035337758 cites W2963267001 @default.
- W3035337758 cites W2963590100 @default.
- W3035337758 cites W2963960193 @default.
- W3035337758 cites W2964006217 @default.
- W3035337758 cites W2970277495 @default.
- W3035337758 cites W2996347495 @default.
- W3035337758 cites W385466589 @default.
- W3035337758 hasPublicationYear "2020" @default.
- W3035337758 type Work @default.
- W3035337758 sameAs 3035337758 @default.
- W3035337758 citedByCount "0" @default.
- W3035337758 crossrefType "posted-content" @default.
- W3035337758 hasAuthorship W3035337758A5021671895 @default.
- W3035337758 hasAuthorship W3035337758A5025995347 @default.
- W3035337758 hasAuthorship W3035337758A5031646561 @default.
- W3035337758 hasAuthorship W3035337758A5083934553 @default.
- W3035337758 hasConcept C119857082 @default.
- W3035337758 hasConcept C134306372 @default.
- W3035337758 hasConcept C154945302 @default.
- W3035337758 hasConcept C165696696 @default.
- W3035337758 hasConcept C177264268 @default.
- W3035337758 hasConcept C185592680 @default.
- W3035337758 hasConcept C198531522 @default.
- W3035337758 hasConcept C199360897 @default.
- W3035337758 hasConcept C22324862 @default.
- W3035337758 hasConcept C33923547 @default.
- W3035337758 hasConcept C38652104 @default.
- W3035337758 hasConcept C41008148 @default.
- W3035337758 hasConcept C41608201 @default.
- W3035337758 hasConcept C43617362 @default.
- W3035337758 hasConcept C49937458 @default.
- W3035337758 hasConcept C97541855 @default.
- W3035337758 hasConceptScore W3035337758C119857082 @default.
- W3035337758 hasConceptScore W3035337758C134306372 @default.
- W3035337758 hasConceptScore W3035337758C154945302 @default.
- W3035337758 hasConceptScore W3035337758C165696696 @default.
- W3035337758 hasConceptScore W3035337758C177264268 @default.
- W3035337758 hasConceptScore W3035337758C185592680 @default.
- W3035337758 hasConceptScore W3035337758C198531522 @default.
- W3035337758 hasConceptScore W3035337758C199360897 @default.
- W3035337758 hasConceptScore W3035337758C22324862 @default.
- W3035337758 hasConceptScore W3035337758C33923547 @default.
- W3035337758 hasConceptScore W3035337758C38652104 @default.
- W3035337758 hasConceptScore W3035337758C41008148 @default.
- W3035337758 hasConceptScore W3035337758C41608201 @default.
- W3035337758 hasConceptScore W3035337758C43617362 @default.
- W3035337758 hasConceptScore W3035337758C49937458 @default.
- W3035337758 hasConceptScore W3035337758C97541855 @default.
- W3035337758 hasLocation W30353377581 @default.
- W3035337758 hasOpenAccess W3035337758 @default.
- W3035337758 hasPrimaryLocation W30353377581 @default.
- W3035337758 hasRelatedWork W1552148478 @default.
- W3035337758 hasRelatedWork W1976800061 @default.
- W3035337758 hasRelatedWork W2073242129 @default.
- W3035337758 hasRelatedWork W2112899086 @default.
- W3035337758 hasRelatedWork W2144655553 @default.
- W3035337758 hasRelatedWork W2491675558 @default.
- W3035337758 hasRelatedWork W2507457504 @default.
- W3035337758 hasRelatedWork W2787719459 @default.
- W3035337758 hasRelatedWork W2789824229 @default.
- W3035337758 hasRelatedWork W2899685447 @default.
- W3035337758 hasRelatedWork W2917322258 @default.
- W3035337758 hasRelatedWork W2947612938 @default.
- W3035337758 hasRelatedWork W2950471160 @default.
- W3035337758 hasRelatedWork W2963170229 @default.
- W3035337758 hasRelatedWork W3031182738 @default.
- W3035337758 hasRelatedWork W3038032959 @default.
- W3035337758 hasRelatedWork W3096980216 @default.
- W3035337758 hasRelatedWork W3126916000 @default.