Matches in SemOpenAlex for { <https://semopenalex.org/work/W3005871422> ?p ?o ?g. }
- W3005871422 abstract "Model-based reinforcement learning has been empirically demonstrated as a successful strategy to improve sample efficiency. In particular, Dyna is an elegant model-based architecture integrating learning and planning that provides huge flexibility in using a model. One of the most important components in Dyna is called search-control, which refers to the process of generating states or state-action pairs from which we query the model to acquire simulated experiences. Search-control is critical in improving learning efficiency. In this work, we propose a simple and novel search-control strategy by searching high frequency regions of the value function. Our main intuition is built on the Shannon sampling theorem from signal processing, which indicates that a high frequency signal requires more samples to reconstruct. We empirically show that a high frequency function is more difficult to approximate. This suggests a search-control strategy: we should use states from high frequency regions of the value function to query the model to acquire more samples. We develop a simple strategy to locally measure the frequency of a function by gradient and Hessian norms, and provide theoretical justification for this approach. We then apply our strategy to search-control in Dyna, and conduct experiments to show its properties and effectiveness on benchmark domains." @default.
- W3005871422 created "2020-02-24" @default.
- W3005871422 creator A5014823249 @default.
- W3005871422 creator A5051536192 @default.
- W3005871422 creator A5084325716 @default.
- W3005871422 date "2020-02-13" @default.
- W3005871422 modified "2023-09-26" @default.
- W3005871422 title "Frequency-based Search-control in Dyna" @default.
- W3005871422 cites W1533861849 @default.
- W3005871422 cites W1572121205 @default.
- W3005871422 cites W1758031947 @default.
- W3005871422 cites W1870822514 @default.
- W3005871422 cites W1898511152 @default.
- W3005871422 cites W1924935520 @default.
- W3005871422 cites W195033972 @default.
- W3005871422 cites W1980035368 @default.
- W3005871422 cites W1982262386 @default.
- W3005871422 cites W2000021956 @default.
- W3005871422 cites W2002428251 @default.
- W3005871422 cites W2027057364 @default.
- W3005871422 cites W2048226872 @default.
- W3005871422 cites W2056138823 @default.
- W3005871422 cites W2073384958 @default.
- W3005871422 cites W2087617385 @default.
- W3005871422 cites W2114384389 @default.
- W3005871422 cites W2121863487 @default.
- W3005871422 cites W2140135625 @default.
- W3005871422 cites W2141559645 @default.
- W3005871422 cites W2145339207 @default.
- W3005871422 cites W2155027007 @default.
- W3005871422 cites W2158782408 @default.
- W3005871422 cites W2290354866 @default.
- W3005871422 cites W2404689820 @default.
- W3005871422 cites W2626747984 @default.
- W3005871422 cites W2890208753 @default.
- W3005871422 cites W2903158431 @default.
- W3005871422 cites W2910195212 @default.
- W3005871422 cites W2912453235 @default.
- W3005871422 cites W2962951703 @default.
- W3005871422 cites W2963477884 @default.
- W3005871422 cites W2963582482 @default.
- W3005871422 cites W2963604043 @default.
- W3005871422 cites W2963661429 @default.
- W3005871422 cites W2964121744 @default.
- W3005871422 cites W2964220198 @default.
- W3005871422 cites W2966234803 @default.
- W3005871422 cites W2994714051 @default.
- W3005871422 doi "https://doi.org/10.48550/arxiv.2002.05822" @default.
- W3005871422 hasPublicationYear "2020" @default.
- W3005871422 type Work @default.
- W3005871422 sameAs 3005871422 @default.
- W3005871422 citedByCount "4" @default.
- W3005871422 countsByYear W30058714222020 @default.
- W3005871422 countsByYear W30058714222021 @default.
- W3005871422 crossrefType "posted-content" @default.
- W3005871422 hasAuthorship W3005871422A5014823249 @default.
- W3005871422 hasAuthorship W3005871422A5051536192 @default.
- W3005871422 hasAuthorship W3005871422A5084325716 @default.
- W3005871422 hasBestOaLocation W30058714221 @default.
- W3005871422 hasConcept C119857082 @default.
- W3005871422 hasConcept C126255220 @default.
- W3005871422 hasConcept C13280743 @default.
- W3005871422 hasConcept C14036430 @default.
- W3005871422 hasConcept C14646407 @default.
- W3005871422 hasConcept C154945302 @default.
- W3005871422 hasConcept C185798385 @default.
- W3005871422 hasConcept C203616005 @default.
- W3005871422 hasConcept C205649164 @default.
- W3005871422 hasConcept C28826006 @default.
- W3005871422 hasConcept C33923547 @default.
- W3005871422 hasConcept C41008148 @default.
- W3005871422 hasConcept C78458016 @default.
- W3005871422 hasConcept C86803240 @default.
- W3005871422 hasConcept C97541855 @default.
- W3005871422 hasConceptScore W3005871422C119857082 @default.
- W3005871422 hasConceptScore W3005871422C126255220 @default.
- W3005871422 hasConceptScore W3005871422C13280743 @default.
- W3005871422 hasConceptScore W3005871422C14036430 @default.
- W3005871422 hasConceptScore W3005871422C14646407 @default.
- W3005871422 hasConceptScore W3005871422C154945302 @default.
- W3005871422 hasConceptScore W3005871422C185798385 @default.
- W3005871422 hasConceptScore W3005871422C203616005 @default.
- W3005871422 hasConceptScore W3005871422C205649164 @default.
- W3005871422 hasConceptScore W3005871422C28826006 @default.
- W3005871422 hasConceptScore W3005871422C33923547 @default.
- W3005871422 hasConceptScore W3005871422C41008148 @default.
- W3005871422 hasConceptScore W3005871422C78458016 @default.
- W3005871422 hasConceptScore W3005871422C86803240 @default.
- W3005871422 hasConceptScore W3005871422C97541855 @default.
- W3005871422 hasLocation W30058714221 @default.
- W3005871422 hasOpenAccess W3005871422 @default.
- W3005871422 hasPrimaryLocation W30058714221 @default.
- W3005871422 hasRelatedWork W2151702863 @default.
- W3005871422 hasRelatedWork W2729602312 @default.
- W3005871422 hasRelatedWork W2775408020 @default.
- W3005871422 hasRelatedWork W3022038857 @default.
- W3005871422 hasRelatedWork W3040891685 @default.
- W3005871422 hasRelatedWork W3115682199 @default.
- W3005871422 hasRelatedWork W3141495010 @default.
- W3005871422 hasRelatedWork W3208245767 @default.