Matches in SemOpenAlex for { <https://semopenalex.org/work/W2912279568> ?p ?o ?g. }
- W2912279568 abstract "Deep reinforcement learning (DRL) has gained a lot of attention in recent years, and has been proven to be able to play Atari games and Go at or above human levels. However, those games are assumed to have a small fixed number of actions and could be trained with a simple CNN network. In this paper, we study a special class of popular Asian card games called Dou Di Zhu, in which two adversarial groups of agents must consider numerous card combinations at each time step, leading to a huge number of actions. We propose a novel method to handle combinatorial actions, which we call combinational Q-learning (CQL). We employ a two-stage network to reduce the action space and also leverage order-invariant max-pooling operations to extract relationships between primitive actions. Results show that our method prevails over state-of-the-art methods like naive Q-learning and A3C. We develop an easy-to-use card game environment and train all agents adversarially from scratch, with only knowledge of the game rules, and verify that our agents are comparable to humans. Our code to reproduce all reported results will be available online." @default.
- W2912279568 created "2019-02-21" @default.
- W2912279568 creator A5009705417 @default.
- W2912279568 creator A5010726528 @default.
- W2912279568 creator A5015961049 @default.
- W2912279568 creator A5029277113 @default.
- W2912279568 creator A5084283534 @default.
- W2912279568 date "2019-01-24" @default.
- W2912279568 modified "2023-09-23" @default.
- W2912279568 title "Combinational Q-Learning for Dou Di Zhu" @default.
- W2912279568 cites W1641379095 @default.
- W2912279568 cites W1757796397 @default.
- W2912279568 cites W2120745256 @default.
- W2912279568 cites W2121863487 @default.
- W2912279568 cites W2145339207 @default.
- W2912279568 cites W2150738382 @default.
- W2912279568 cites W2155968351 @default.
- W2912279568 cites W2201581102 @default.
- W2912279568 cites W2215378786 @default.
- W2912279568 cites W2257979135 @default.
- W2912279568 cites W2467604901 @default.
- W2912279568 cites W2560609797 @default.
- W2912279568 cites W2574978968 @default.
- W2912279568 cites W2766447205 @default.
- W2912279568 cites W2785315072 @default.
- W2912279568 cites W2914156981 @default.
- W2912279568 cites W2951004968 @default.
- W2912279568 cites W2962694783 @default.
- W2912279568 cites W2964043796 @default.
- W2912279568 hasPublicationYear "2019" @default.
- W2912279568 type Work @default.
- W2912279568 sameAs 2912279568 @default.
- W2912279568 citedByCount "1" @default.
- W2912279568 countsByYear W29122795682021 @default.
- W2912279568 crossrefType "posted-content" @default.
- W2912279568 hasAuthorship W2912279568A5009705417 @default.
- W2912279568 hasAuthorship W2912279568A5010726528 @default.
- W2912279568 hasAuthorship W2912279568A5015961049 @default.
- W2912279568 hasAuthorship W2912279568A5029277113 @default.
- W2912279568 hasAuthorship W2912279568A5084283534 @default.
- W2912279568 hasConcept C102234262 @default.
- W2912279568 hasConcept C111472728 @default.
- W2912279568 hasConcept C119857082 @default.
- W2912279568 hasConcept C121332964 @default.
- W2912279568 hasConcept C138885662 @default.
- W2912279568 hasConcept C144237770 @default.
- W2912279568 hasConcept C153083717 @default.
- W2912279568 hasConcept C154945302 @default.
- W2912279568 hasConcept C177142836 @default.
- W2912279568 hasConcept C177264268 @default.
- W2912279568 hasConcept C190470478 @default.
- W2912279568 hasConcept C199360897 @default.
- W2912279568 hasConcept C2776760102 @default.
- W2912279568 hasConcept C2780586882 @default.
- W2912279568 hasConcept C2780791683 @default.
- W2912279568 hasConcept C33923547 @default.
- W2912279568 hasConcept C37736160 @default.
- W2912279568 hasConcept C37914503 @default.
- W2912279568 hasConcept C41008148 @default.
- W2912279568 hasConcept C62520636 @default.
- W2912279568 hasConcept C70437156 @default.
- W2912279568 hasConcept C73795354 @default.
- W2912279568 hasConcept C80444323 @default.
- W2912279568 hasConcept C97541855 @default.
- W2912279568 hasConceptScore W2912279568C102234262 @default.
- W2912279568 hasConceptScore W2912279568C111472728 @default.
- W2912279568 hasConceptScore W2912279568C119857082 @default.
- W2912279568 hasConceptScore W2912279568C121332964 @default.
- W2912279568 hasConceptScore W2912279568C138885662 @default.
- W2912279568 hasConceptScore W2912279568C144237770 @default.
- W2912279568 hasConceptScore W2912279568C153083717 @default.
- W2912279568 hasConceptScore W2912279568C154945302 @default.
- W2912279568 hasConceptScore W2912279568C177142836 @default.
- W2912279568 hasConceptScore W2912279568C177264268 @default.
- W2912279568 hasConceptScore W2912279568C190470478 @default.
- W2912279568 hasConceptScore W2912279568C199360897 @default.
- W2912279568 hasConceptScore W2912279568C2776760102 @default.
- W2912279568 hasConceptScore W2912279568C2780586882 @default.
- W2912279568 hasConceptScore W2912279568C2780791683 @default.
- W2912279568 hasConceptScore W2912279568C33923547 @default.
- W2912279568 hasConceptScore W2912279568C37736160 @default.
- W2912279568 hasConceptScore W2912279568C37914503 @default.
- W2912279568 hasConceptScore W2912279568C41008148 @default.
- W2912279568 hasConceptScore W2912279568C62520636 @default.
- W2912279568 hasConceptScore W2912279568C70437156 @default.
- W2912279568 hasConceptScore W2912279568C73795354 @default.
- W2912279568 hasConceptScore W2912279568C80444323 @default.
- W2912279568 hasConceptScore W2912279568C97541855 @default.
- W2912279568 hasLocation W29122795681 @default.
- W2912279568 hasOpenAccess W2912279568 @default.
- W2912279568 hasPrimaryLocation W29122795681 @default.
- W2912279568 hasRelatedWork W183995860 @default.
- W2912279568 hasRelatedWork W2342292876 @default.
- W2912279568 hasRelatedWork W2418933646 @default.
- W2912279568 hasRelatedWork W2522489477 @default.
- W2912279568 hasRelatedWork W2590714056 @default.
- W2912279568 hasRelatedWork W2621153329 @default.
- W2912279568 hasRelatedWork W2810865311 @default.
- W2912279568 hasRelatedWork W2920362155 @default.
- W2912279568 hasRelatedWork W2921578896 @default.