Matches in SemOpenAlex for { <https://semopenalex.org/work/W4281668144> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4281668144 abstract "Finding a best response policy is a central objective in game theory and multi-agent learning, with modern population-based training approaches employing reinforcement learning algorithms as best-response oracles to improve play against candidate opponents (typically previously learnt policies). We propose Best Response Expert Iteration (BRExIt), which accelerates learning in games by incorporating opponent models into the state-of-the-art learning algorithm Expert Iteration (ExIt). BRExIt aims to (1) improve feature shaping in the apprentice, with a policy head predicting opponent policies as an auxiliary task, and (2) bias opponent moves in planning towards the given or learnt opponent model, to generate apprentice targets that better approximate a best response. In an empirical ablation on BRExIt's algorithmic variants against a set of fixed test agents, we provide statistical evidence that BRExIt learns better performing policies than ExIt." @default.
- W4281668144 created "2022-06-13" @default.
- W4281668144 creator A5015781911 @default.
- W4281668144 creator A5065056119 @default.
- W4281668144 creator A5069752991 @default.
- W4281668144 date "2022-05-31" @default.
- W4281668144 modified "2023-09-25" @default.
- W4281668144 title "BRExIt: On Opponent Modelling in Expert Iteration" @default.
- W4281668144 doi "https://doi.org/10.48550/arxiv.2206.00113" @default.
- W4281668144 hasPublicationYear "2022" @default.
- W4281668144 type Work @default.
- W4281668144 citedByCount "0" @default.
- W4281668144 crossrefType "posted-content" @default.
- W4281668144 hasAuthorship W4281668144A5015781911 @default.
- W4281668144 hasAuthorship W4281668144A5065056119 @default.
- W4281668144 hasAuthorship W4281668144A5069752991 @default.
- W4281668144 hasBestOaLocation W42816681441 @default.
- W4281668144 hasConcept C119857082 @default.
- W4281668144 hasConcept C138885662 @default.
- W4281668144 hasConcept C154945302 @default.
- W4281668144 hasConcept C155202549 @default.
- W4281668144 hasConcept C162324750 @default.
- W4281668144 hasConcept C177264268 @default.
- W4281668144 hasConcept C187736073 @default.
- W4281668144 hasConcept C199360897 @default.
- W4281668144 hasConcept C2776401178 @default.
- W4281668144 hasConcept C2776469822 @default.
- W4281668144 hasConcept C2780451532 @default.
- W4281668144 hasConcept C2910001868 @default.
- W4281668144 hasConcept C38652104 @default.
- W4281668144 hasConcept C41008148 @default.
- W4281668144 hasConcept C41065033 @default.
- W4281668144 hasConcept C41895202 @default.
- W4281668144 hasConcept C97541855 @default.
- W4281668144 hasConceptScore W4281668144C119857082 @default.
- W4281668144 hasConceptScore W4281668144C138885662 @default.
- W4281668144 hasConceptScore W4281668144C154945302 @default.
- W4281668144 hasConceptScore W4281668144C155202549 @default.
- W4281668144 hasConceptScore W4281668144C162324750 @default.
- W4281668144 hasConceptScore W4281668144C177264268 @default.
- W4281668144 hasConceptScore W4281668144C187736073 @default.
- W4281668144 hasConceptScore W4281668144C199360897 @default.
- W4281668144 hasConceptScore W4281668144C2776401178 @default.
- W4281668144 hasConceptScore W4281668144C2776469822 @default.
- W4281668144 hasConceptScore W4281668144C2780451532 @default.
- W4281668144 hasConceptScore W4281668144C2910001868 @default.
- W4281668144 hasConceptScore W4281668144C38652104 @default.
- W4281668144 hasConceptScore W4281668144C41008148 @default.
- W4281668144 hasConceptScore W4281668144C41065033 @default.
- W4281668144 hasConceptScore W4281668144C41895202 @default.
- W4281668144 hasConceptScore W4281668144C97541855 @default.
- W4281668144 hasLocation W42816681441 @default.
- W4281668144 hasOpenAccess W4281668144 @default.
- W4281668144 hasPrimaryLocation W42816681441 @default.
- W4281668144 hasRelatedWork W1509467138 @default.
- W4281668144 hasRelatedWork W2616430965 @default.
- W4281668144 hasRelatedWork W3022038857 @default.
- W4281668144 hasRelatedWork W3129981047 @default.
- W4281668144 hasRelatedWork W3183432322 @default.
- W4281668144 hasRelatedWork W4307204265 @default.
- W4281668144 hasRelatedWork W4319083788 @default.
- W4281668144 hasRelatedWork W4319453732 @default.
- W4281668144 hasRelatedWork W4323338448 @default.
- W4281668144 hasRelatedWork W1876931688 @default.
- W4281668144 isParatext "false" @default.
- W4281668144 isRetracted "false" @default.
- W4281668144 workType "article" @default.