Matches in SemOpenAlex for { <https://semopenalex.org/work/W2896584636> ?p ?o ?g. }
Showing items 1 to 83 of 83 with 100 items per page.
- W2896584636 abstract "After the recent groundbreaking results of AlphaGo and AlphaZero, we have seen strong interest in deep reinforcement learning and artificial general intelligence (AGI) in game playing. However, deep learning is resource-intensive and the theory is not yet well developed. For small games, simple classical table-based Q-learning might still be the algorithm of choice. General Game Playing (GGP) provides a good testbed for reinforcement learning to research AGI. Q-learning is one of the canonical reinforcement learning methods, and has been used by (Banerjee & Stone, IJCAI 2007) in GGP. In this paper we implement Q-learning in GGP for three small-board games (Tic-Tac-Toe, Connect Four, Hex; source code: https://github.com/wh1992v/ggp-rl), to allow comparison to Banerjee et al. We find that Q-learning converges to a high win rate in GGP. For the $\epsilon$-greedy strategy, we propose a first enhancement, the dynamic $\epsilon$ algorithm. In addition, inspired by (Gelly & Silver, ICML 2007) we combine online search (Monte Carlo Search) to enhance offline learning, and propose QM-learning for GGP. Both enhancements improve the performance of classical Q-learning. In this work, GGP allows us to show, if augmented by appropriate enhancements, that classical table-based Q-learning can perform well in small games." @default.
- W2896584636 created "2018-10-26" @default.
- W2896584636 creator A5007378438 @default.
- W2896584636 creator A5085542421 @default.
- W2896584636 creator A5090366405 @default.
- W2896584636 date "2018-10-14" @default.
- W2896584636 modified "2023-10-03" @default.
- W2896584636 title "Assessing the Potential of Classical Q-learning in General Game Playing" @default.
- W2896584636 cites W2099587183 @default.
- W2896584636 cites W2120846115 @default.
- W2896584636 cites W2126316555 @default.
- W2896584636 cites W2145339207 @default.
- W2896584636 cites W2169819076 @default.
- W2896584636 cites W2248882425 @default.
- W2896584636 cites W2257979135 @default.
- W2896584636 cites W2766447205 @default.
- W2896584636 cites W2772709170 @default.
- W2896584636 cites W3011120880 @default.
- W2896584636 cites W95097286 @default.
- W2896584636 cites W3145167355 @default.
- W2896584636 doi "https://doi.org/10.48550/arxiv.1810.06078" @default.
- W2896584636 hasPublicationYear "2018" @default.
- W2896584636 type Work @default.
- W2896584636 sameAs 2896584636 @default.
- W2896584636 citedByCount "1" @default.
- W2896584636 countsByYear W28965846362021 @default.
- W2896584636 crossrefType "posted-content" @default.
- W2896584636 hasAuthorship W2896584636A5007378438 @default.
- W2896584636 hasAuthorship W2896584636A5085542421 @default.
- W2896584636 hasAuthorship W2896584636A5090366405 @default.
- W2896584636 hasBestOaLocation W28965846361 @default.
- W2896584636 hasConcept C102234262 @default.
- W2896584636 hasConcept C111919701 @default.
- W2896584636 hasConcept C119857082 @default.
- W2896584636 hasConcept C144237770 @default.
- W2896584636 hasConcept C154945302 @default.
- W2896584636 hasConcept C162027153 @default.
- W2896584636 hasConcept C177142836 @default.
- W2896584636 hasConcept C188116033 @default.
- W2896584636 hasConcept C31258907 @default.
- W2896584636 hasConcept C31395832 @default.
- W2896584636 hasConcept C33923547 @default.
- W2896584636 hasConcept C41008148 @default.
- W2896584636 hasConcept C43126263 @default.
- W2896584636 hasConcept C45235069 @default.
- W2896584636 hasConcept C73795354 @default.
- W2896584636 hasConcept C77088390 @default.
- W2896584636 hasConcept C97541855 @default.
- W2896584636 hasConceptScore W2896584636C102234262 @default.
- W2896584636 hasConceptScore W2896584636C111919701 @default.
- W2896584636 hasConceptScore W2896584636C119857082 @default.
- W2896584636 hasConceptScore W2896584636C144237770 @default.
- W2896584636 hasConceptScore W2896584636C154945302 @default.
- W2896584636 hasConceptScore W2896584636C162027153 @default.
- W2896584636 hasConceptScore W2896584636C177142836 @default.
- W2896584636 hasConceptScore W2896584636C188116033 @default.
- W2896584636 hasConceptScore W2896584636C31258907 @default.
- W2896584636 hasConceptScore W2896584636C31395832 @default.
- W2896584636 hasConceptScore W2896584636C33923547 @default.
- W2896584636 hasConceptScore W2896584636C41008148 @default.
- W2896584636 hasConceptScore W2896584636C43126263 @default.
- W2896584636 hasConceptScore W2896584636C45235069 @default.
- W2896584636 hasConceptScore W2896584636C73795354 @default.
- W2896584636 hasConceptScore W2896584636C77088390 @default.
- W2896584636 hasConceptScore W2896584636C97541855 @default.
- W2896584636 hasLocation W28965846361 @default.
- W2896584636 hasLocation W28965846362 @default.
- W2896584636 hasOpenAccess W2896584636 @default.
- W2896584636 hasPrimaryLocation W28965846361 @default.
- W2896584636 hasRelatedWork W2101018825 @default.
- W2896584636 hasRelatedWork W2103943675 @default.
- W2896584636 hasRelatedWork W2152593421 @default.
- W2896584636 hasRelatedWork W2273814841 @default.
- W2896584636 hasRelatedWork W2889109362 @default.
- W2896584636 hasRelatedWork W2953241056 @default.
- W2896584636 hasRelatedWork W3018270953 @default.
- W2896584636 hasRelatedWork W3074294383 @default.
- W2896584636 hasRelatedWork W4206669594 @default.
- W2896584636 hasRelatedWork W4319083788 @default.
- W2896584636 isParatext "false" @default.
- W2896584636 isRetracted "false" @default.
- W2896584636 magId "2896584636" @default.
- W2896584636 workType "article" @default.
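
The abstract above refers to table-based Q-learning with an $\epsilon$-greedy strategy and a "dynamic $\epsilon$" enhancement. As a rough illustration only, the following Python sketch shows a generic single-agent tabular Q-learning update and $\epsilon$-greedy action selection. It is not taken from the paper or its repository: the function names, the `alpha` and `gamma` defaults, and in particular the linear decay in `decayed_epsilon` are assumptions, since the abstract does not state how the dynamic $\epsilon$ is computed.

```python
import random
from collections import defaultdict


def epsilon_greedy(q, state, actions, epsilon, rng):
    """With probability epsilon explore at random, otherwise act greedily on Q."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """One tabular Q-learning backup: Q(s,a) += alpha * (target - Q(s,a))."""
    # A terminal next state has no actions, so its bootstrap value is 0.
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def decayed_epsilon(episode, total_episodes, start=1.0, end=0.05):
    """Hypothetical schedule (an assumption, not the paper's): linear decay."""
    frac = min(episode / max(total_episodes - 1, 1), 1.0)
    return start + frac * (end - start)


# Usage: an unseen (state, action) pair defaults to a Q-value of 0.0.
q = defaultdict(float)
rng = random.Random(0)
q_update(q, "s0", "a", reward=1.0, next_state="terminal", next_actions=[])
action = epsilon_greedy(q, "s0", ["a", "b"], decayed_epsilon(99, 100), rng)
```

Keying the table by `(state, action)` tuples in a `defaultdict` keeps the sketch short; a two-player game would additionally need a convention for whose turn it is when bootstrapping from the next state, which is left out here.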