Matches in SemOpenAlex for { <https://semopenalex.org/work/W4384930905> ?p ?o ?g. }
Showing items 1 to 61 of 61 with 100 items per page.
- W4384930905 abstract "Intuitively, experience playing against one mixture of opponents in a given domain should be relevant for a different mixture in the same domain. We propose a transfer learning method, Q-Mixing, that starts by learning Q-values against each pure-strategy opponent. Then a Q-value for any distribution of opponent strategies is approximated by appropriately averaging the separately learned Q-values. From these components, we construct policies against all opponent mixtures without any further training. We empirically validate Q-Mixing in two environments: a simple grid-world soccer environment, and a complicated cyber-security game. We find that Q-Mixing is able to successfully transfer knowledge across any mixture of opponents. We next consider the use of observations during play to update the believed distribution of opponents. We introduce an opponent classifier -- trained in parallel to Q-learning, using the same data -- and use the classifier results to refine the mixing of Q-values. We find that Q-Mixing augmented with the opponent classifier function performs comparably, and with lower variance, than training directly against a mixed-strategy opponent." @default.
- W4384930905 created "2023-07-22" @default.
- W4384930905 creator A5000081835 @default.
- W4384930905 creator A5002102744 @default.
- W4384930905 creator A5077158087 @default.
- W4384930905 creator A5082981003 @default.
- W4384930905 date "2020-09-29" @default.
- W4384930905 modified "2023-10-17" @default.
- W4384930905 title "Learning to Play against Any Mixture of Opponents" @default.
- W4384930905 doi "https://doi.org/10.48550/arxiv.2009.14180" @default.
- W4384930905 hasPublicationYear "2020" @default.
- W4384930905 type Work @default.
- W4384930905 citedByCount "0" @default.
- W4384930905 crossrefType "posted-content" @default.
- W4384930905 hasAuthorship W4384930905A5000081835 @default.
- W4384930905 hasAuthorship W4384930905A5002102744 @default.
- W4384930905 hasAuthorship W4384930905A5077158087 @default.
- W4384930905 hasAuthorship W4384930905A5082981003 @default.
- W4384930905 hasBestOaLocation W43849309051 @default.
- W4384930905 hasConcept C119857082 @default.
- W4384930905 hasConcept C121332964 @default.
- W4384930905 hasConcept C138777275 @default.
- W4384930905 hasConcept C144237770 @default.
- W4384930905 hasConcept C145071142 @default.
- W4384930905 hasConcept C154945302 @default.
- W4384930905 hasConcept C177142836 @default.
- W4384930905 hasConcept C33923547 @default.
- W4384930905 hasConcept C38652104 @default.
- W4384930905 hasConcept C41008148 @default.
- W4384930905 hasConcept C41065033 @default.
- W4384930905 hasConcept C62520636 @default.
- W4384930905 hasConcept C95623464 @default.
- W4384930905 hasConceptScore W4384930905C119857082 @default.
- W4384930905 hasConceptScore W4384930905C121332964 @default.
- W4384930905 hasConceptScore W4384930905C138777275 @default.
- W4384930905 hasConceptScore W4384930905C144237770 @default.
- W4384930905 hasConceptScore W4384930905C145071142 @default.
- W4384930905 hasConceptScore W4384930905C154945302 @default.
- W4384930905 hasConceptScore W4384930905C177142836 @default.
- W4384930905 hasConceptScore W4384930905C33923547 @default.
- W4384930905 hasConceptScore W4384930905C38652104 @default.
- W4384930905 hasConceptScore W4384930905C41008148 @default.
- W4384930905 hasConceptScore W4384930905C41065033 @default.
- W4384930905 hasConceptScore W4384930905C62520636 @default.
- W4384930905 hasConceptScore W4384930905C95623464 @default.
- W4384930905 hasLocation W43849309051 @default.
- W4384930905 hasOpenAccess W4384930905 @default.
- W4384930905 hasPrimaryLocation W43849309051 @default.
- W4384930905 hasRelatedWork W2460937040 @default.
- W4384930905 hasRelatedWork W2512018286 @default.
- W4384930905 hasRelatedWork W2556319748 @default.
- W4384930905 hasRelatedWork W2623427976 @default.
- W4384930905 hasRelatedWork W2953083558 @default.
- W4384930905 hasRelatedWork W2961085424 @default.
- W4384930905 hasRelatedWork W3158264953 @default.
- W4384930905 hasRelatedWork W3200179079 @default.
- W4384930905 hasRelatedWork W4249229055 @default.
- W4384930905 hasRelatedWork W4300511536 @default.
- W4384930905 isParatext "false" @default.
- W4384930905 isRetracted "false" @default.
- W4384930905 workType "article" @default.
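
The abstract above describes approximating a Q-value against an opponent mixture by averaging Q-values learned separately against each pure-strategy opponent, and then refining the mixture with an opponent classifier. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a plain probability-weighted average for a single state, and a simple multiply-and-renormalise belief update. The function names (`mix_q_values`, `greedy_action`, `refine_belief`) and the toy numbers are hypothetical and do not come from the paper or from SemOpenAlex.

```python
from typing import List, Sequence


def mix_q_values(q_per_opponent: Sequence[Sequence[float]],
                 opponent_probs: Sequence[float]) -> List[float]:
    """Average per-opponent Q-values for one state, weighted by opponent probability.

    q_per_opponent[i][a] is the Q-value of action a learned against pure opponent i.
    ASSUMPTION: a plain weighted average; the paper only says "appropriately averaging".
    """
    if len(q_per_opponent) != len(opponent_probs):
        raise ValueError("need one probability per opponent")
    if any(p < 0 for p in opponent_probs):
        raise ValueError("probabilities must be non-negative")
    total = sum(opponent_probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    weights = [p / total for p in opponent_probs]
    n_actions = len(q_per_opponent[0])
    return [
        sum(w * q[a] for w, q in zip(weights, q_per_opponent))
        for a in range(n_actions)
    ]


def greedy_action(q_per_opponent: Sequence[Sequence[float]],
                  opponent_probs: Sequence[float]) -> int:
    """Pick the action with the highest mixed Q-value; no further training involved."""
    mixed = mix_q_values(q_per_opponent, opponent_probs)
    return max(range(len(mixed)), key=mixed.__getitem__)


def refine_belief(prior: Sequence[float],
                  classifier_probs: Sequence[float]) -> List[float]:
    """Combine a prior over opponents with classifier scores and renormalise.

    ASSUMPTION: a Bayes-style product; the actual refinement rule may differ.
    """
    unnormalised = [p * c for p, c in zip(prior, classifier_probs)]
    total = sum(unnormalised)
    if total == 0:
        return list(prior)  # classifier ruled out everything; keep the prior
    return [u / total for u in unnormalised]


# Toy example: two pure-strategy opponents, two actions, one state.
q_tables = [[1.0, 0.0],   # Q-values learned against opponent 0
            [0.0, 2.0]]   # Q-values learned against opponent 1
belief = refine_belief([0.5, 0.5], [0.9, 0.1])
action = greedy_action(q_tables, belief)
```

In this toy setup the same learned tables serve any mixture: changing `opponent_probs` changes the selected action without retraining, which is the transfer property the abstract claims.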