Matches in SemOpenAlex for { <https://semopenalex.org/work/W3137629537> ?p ?o ?g. }
- W3137629537 endingPage "6704" @default.
- W3137629537 startingPage "6696" @default.
- W3137629537 abstract "A core operation in reinforcement learning (RL) is finding an action that is optimal with respect to a learned value function. This operation is often challenging when the learned value function takes continuous actions as input. We introduce deep radial-basis value functions (RBVFs): value functions learned using a deep network with a radial-basis function (RBF) output layer. We show that the maximum action-value with respect to a deep RBVF can be approximated easily and accurately. Moreover, deep RBVFs can represent any true value function owing to their support for universal function approximation. We extend the standard DQN algorithm to continuous control by endowing the agent with a deep RBVF. We show that the resultant agent, called RBF-DQN, significantly outperforms value-function-only baselines, and is competitive with state-of-the-art actor-critic algorithms." @default.
- W3137629537 created "2021-03-29" @default.
- W3137629537 creator A5009722403 @default.
- W3137629537 creator A5021682235 @default.
- W3137629537 creator A5037667167 @default.
- W3137629537 creator A5046930792 @default.
- W3137629537 creator A5078124517 @default.
- W3137629537 date "2021-05-18" @default.
- W3137629537 modified "2023-10-01" @default.
- W3137629537 title "Deep Radial-Basis Value Functions for Continuous Control" @default.
- W3137629537 cites W1524100745 @default.
- W3137629537 cites W1967005434 @default.
- W3137629537 cites W1994833873 @default.
- W3137629537 cites W2056653303 @default.
- W3137629537 cites W2058755553 @default.
- W3137629537 cites W2091565802 @default.
- W3137629537 cites W2096053111 @default.
- W3137629537 cites W2106261932 @default.
- W3137629537 cites W2115519224 @default.
- W3137629537 cites W2119567691 @default.
- W3137629537 cites W2119821739 @default.
- W3137629537 cites W2121863487 @default.
- W3137629537 cites W2124175081 @default.
- W3137629537 cites W2145339207 @default.
- W3137629537 cites W2165150801 @default.
- W3137629537 cites W2170141744 @default.
- W3137629537 cites W2171277043 @default.
- W3137629537 cites W2183243664 @default.
- W3137629537 cites W2290354866 @default.
- W3137629537 cites W2296319761 @default.
- W3137629537 cites W2557283755 @default.
- W3137629537 cites W2896412913 @default.
- W3137629537 cites W2945159000 @default.
- W3137629537 cites W2962902376 @default.
- W3137629537 cites W2963169817 @default.
- W3137629537 cites W2963266548 @default.
- W3137629537 cites W2963864421 @default.
- W3137629537 cites W2963910368 @default.
- W3137629537 cites W2963923407 @default.
- W3137629537 cites W2963938771 @default.
- W3137629537 cites W2964158321 @default.
- W3137629537 cites W2964291307 @default.
- W3137629537 cites W2995801821 @default.
- W3137629537 cites W3009498344 @default.
- W3137629537 cites W3011120880 @default.
- W3137629537 cites W3015662311 @default.
- W3137629537 cites W3034442282 @default.
- W3137629537 cites W3139377883 @default.
- W3137629537 cites W3146803896 @default.
- W3137629537 cites W94523489 @default.
- W3137629537 doi "https://doi.org/10.1609/aaai.v35i8.16828" @default.
- W3137629537 hasPublicationYear "2021" @default.
- W3137629537 type Work @default.
- W3137629537 sameAs 3137629537 @default.
- W3137629537 citedByCount "2" @default.
- W3137629537 countsByYear W31376295372022 @default.
- W3137629537 countsByYear W31376295372023 @default.
- W3137629537 crossrefType "journal-article" @default.
- W3137629537 hasAuthorship W3137629537A5009722403 @default.
- W3137629537 hasAuthorship W3137629537A5021682235 @default.
- W3137629537 hasAuthorship W3137629537A5037667167 @default.
- W3137629537 hasAuthorship W3137629537A5046930792 @default.
- W3137629537 hasAuthorship W3137629537A5078124517 @default.
- W3137629537 hasBestOaLocation W31376295371 @default.
- W3137629537 hasConcept C108583219 @default.
- W3137629537 hasConcept C11413529 @default.
- W3137629537 hasConcept C119857082 @default.
- W3137629537 hasConcept C121332964 @default.
- W3137629537 hasConcept C12426560 @default.
- W3137629537 hasConcept C126255220 @default.
- W3137629537 hasConcept C132917294 @default.
- W3137629537 hasConcept C134306372 @default.
- W3137629537 hasConcept C14036430 @default.
- W3137629537 hasConcept C14646407 @default.
- W3137629537 hasConcept C154945302 @default.
- W3137629537 hasConcept C2164484 @default.
- W3137629537 hasConcept C2524010 @default.
- W3137629537 hasConcept C2776291640 @default.
- W3137629537 hasConcept C2778199601 @default.
- W3137629537 hasConcept C2780791683 @default.
- W3137629537 hasConcept C33923547 @default.
- W3137629537 hasConcept C41008148 @default.
- W3137629537 hasConcept C50644808 @default.
- W3137629537 hasConcept C5917680 @default.
- W3137629537 hasConcept C62520636 @default.
- W3137629537 hasConcept C76155785 @default.
- W3137629537 hasConcept C78458016 @default.
- W3137629537 hasConcept C86803240 @default.
- W3137629537 hasConcept C97541855 @default.
- W3137629537 hasConcept C98856871 @default.
- W3137629537 hasConceptScore W3137629537C108583219 @default.
- W3137629537 hasConceptScore W3137629537C11413529 @default.
- W3137629537 hasConceptScore W3137629537C119857082 @default.
- W3137629537 hasConceptScore W3137629537C121332964 @default.
- W3137629537 hasConceptScore W3137629537C12426560 @default.
- W3137629537 hasConceptScore W3137629537C126255220 @default.
- W3137629537 hasConceptScore W3137629537C132917294 @default.
- W3137629537 hasConceptScore W3137629537C134306372 @default.