Matches in SemOpenAlex for { <https://semopenalex.org/work/W2173564293> ?p ?o ?g. }
- W2173564293 endingPage "2003" @default.
- W2173564293 startingPage "1995" @default.
- W2173564293 abstract "In recent years there have been many successes of using deep representations in reinforcement learning. Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs, or auto-encoders. In this paper, we present a new neural network architecture for model-free reinforcement learning. Our dueling network represents two separate estimators: one for the state value function and one for the state-dependent action advantage function. The main benefit of this factoring is to generalize learning across actions without imposing any change to the underlying reinforcement learning algorithm. Our results show that this architecture leads to better policy evaluation in the presence of many similar-valued actions. Moreover, the dueling architecture enables our RL agent to outperform the state-of-the-art on the Atari 2600 domain." @default.
- W2173564293 created "2016-06-24" @default.
- W2173564293 creator A5033135596 @default.
- W2173564293 creator A5036908874 @default.
- W2173564293 creator A5049659586 @default.
- W2173564293 creator A5064627383 @default.
- W2173564293 creator A5081322018 @default.
- W2173564293 creator A5082304130 @default.
- W2173564293 date "2016-06-19" @default.
- W2173564293 modified "2023-10-02" @default.
- W2173564293 title "Dueling network architectures for deep reinforcement learning" @default.
- W2173564293 cites W1515851193 @default.
- W2173564293 cites W1595483645 @default.
- W2173564293 cites W1658008008 @default.
- W2173564293 cites W1947291763 @default.
- W2173564293 cites W2010315761 @default.
- W2173564293 cites W2100752967 @default.
- W2173564293 cites W2108563286 @default.
- W2173564293 cites W2124215603 @default.
- W2173564293 cites W2145339207 @default.
- W2173564293 cites W2151210636 @default.
- W2173564293 cites W2155007355 @default.
- W2173564293 cites W2155027007 @default.
- W2173564293 cites W2155968351 @default.
- W2173564293 cites W2169393322 @default.
- W2173564293 cites W2257979135 @default.
- W2173564293 cites W2919115771 @default.
- W2173564293 cites W2962847657 @default.
- W2173564293 cites W2963477884 @default.
- W2173564293 cites W2964036520 @default.
- W2173564293 cites W779494576 @default.
- W2173564293 cites W834081922 @default.
- W2173564293 cites W1599347336 @default.
- W2173564293 hasPublicationYear "2016" @default.
- W2173564293 type Work @default.
- W2173564293 sameAs 2173564293 @default.
- W2173564293 citedByCount "703" @default.
- W2173564293 countsByYear W21735642932016 @default.
- W2173564293 countsByYear W21735642932017 @default.
- W2173564293 countsByYear W21735642932018 @default.
- W2173564293 countsByYear W21735642932019 @default.
- W2173564293 countsByYear W21735642932020 @default.
- W2173564293 countsByYear W21735642932021 @default.
- W2173564293 countsByYear W21735642932022 @default.
- W2173564293 countsByYear W21735642932023 @default.
- W2173564293 crossrefType "proceedings-article" @default.
- W2173564293 hasAuthorship W2173564293A5033135596 @default.
- W2173564293 hasAuthorship W2173564293A5036908874 @default.
- W2173564293 hasAuthorship W2173564293A5049659586 @default.
- W2173564293 hasAuthorship W2173564293A5064627383 @default.
- W2173564293 hasAuthorship W2173564293A5081322018 @default.
- W2173564293 hasAuthorship W2173564293A5082304130 @default.
- W2173564293 hasConcept C10138342 @default.
- W2173564293 hasConcept C108583219 @default.
- W2173564293 hasConcept C119857082 @default.
- W2173564293 hasConcept C123657996 @default.
- W2173564293 hasConcept C134306372 @default.
- W2173564293 hasConcept C14036430 @default.
- W2173564293 hasConcept C142362112 @default.
- W2173564293 hasConcept C144133560 @default.
- W2173564293 hasConcept C153349607 @default.
- W2173564293 hasConcept C154945302 @default.
- W2173564293 hasConcept C162324750 @default.
- W2173564293 hasConcept C162853370 @default.
- W2173564293 hasConcept C177225278 @default.
- W2173564293 hasConcept C193415008 @default.
- W2173564293 hasConcept C199360897 @default.
- W2173564293 hasConcept C33923547 @default.
- W2173564293 hasConcept C36503486 @default.
- W2173564293 hasConcept C38652104 @default.
- W2173564293 hasConcept C41008148 @default.
- W2173564293 hasConcept C4216890 @default.
- W2173564293 hasConcept C48103436 @default.
- W2173564293 hasConcept C78458016 @default.
- W2173564293 hasConcept C86803240 @default.
- W2173564293 hasConcept C89249532 @default.
- W2173564293 hasConcept C97541855 @default.
- W2173564293 hasConceptScore W2173564293C10138342 @default.
- W2173564293 hasConceptScore W2173564293C108583219 @default.
- W2173564293 hasConceptScore W2173564293C119857082 @default.
- W2173564293 hasConceptScore W2173564293C123657996 @default.
- W2173564293 hasConceptScore W2173564293C134306372 @default.
- W2173564293 hasConceptScore W2173564293C14036430 @default.
- W2173564293 hasConceptScore W2173564293C142362112 @default.
- W2173564293 hasConceptScore W2173564293C144133560 @default.
- W2173564293 hasConceptScore W2173564293C153349607 @default.
- W2173564293 hasConceptScore W2173564293C154945302 @default.
- W2173564293 hasConceptScore W2173564293C162324750 @default.
- W2173564293 hasConceptScore W2173564293C162853370 @default.
- W2173564293 hasConceptScore W2173564293C177225278 @default.
- W2173564293 hasConceptScore W2173564293C193415008 @default.
- W2173564293 hasConceptScore W2173564293C199360897 @default.
- W2173564293 hasConceptScore W2173564293C33923547 @default.
- W2173564293 hasConceptScore W2173564293C36503486 @default.
- W2173564293 hasConceptScore W2173564293C38652104 @default.
- W2173564293 hasConceptScore W2173564293C41008148 @default.
- W2173564293 hasConceptScore W2173564293C4216890 @default.
- W2173564293 hasConceptScore W2173564293C48103436 @default.