Matches in SemOpenAlex for { <https://semopenalex.org/work/W3122372191> ?p ?o ?g. }
- W3122372191 abstract "We propose a simple, general and effective technique, Reward Randomization for discovering diverse strategic policies in complex multi-agent games. Combining reward randomization and policy gradient, we derive a new algorithm, Reward-Randomized Policy Gradient (RPG). RPG is able to discover a set of multiple distinctive human-interpretable strategies in challenging temporal trust dilemmas, including grid-world games and a real-world game Agar.io, where multiple equilibria exist but standard multi-agent policy gradient algorithms always converge to a fixed one with a sub-optimal payoff for every player even using state-of-the-art exploration techniques. Furthermore, with the set of diverse strategies from RPG, we can (1) achieve higher payoffs by fine-tuning the best policy from the set; and (2) obtain an adaptive agent by using this set of strategies as its training opponents." @default.
- W3122372191 created "2021-02-01" @default.
- W3122372191 creator A5004199796 @default.
- W3122372191 creator A5008951080 @default.
- W3122372191 creator A5033061754 @default.
- W3122372191 creator A5049093671 @default.
- W3122372191 creator A5054149338 @default.
- W3122372191 creator A5061127138 @default.
- W3122372191 creator A5061701229 @default.
- W3122372191 creator A5066028215 @default.
- W3122372191 creator A5079594267 @default.
- W3122372191 date "2021-05-03" @default.
- W3122372191 modified "2023-09-24" @default.
- W3122372191 title "Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization" @default.
- W3122372191 cites W119236796 @default.
- W3122372191 cites W138212348 @default.
- W3122372191 cites W1519783625 @default.
- W3122372191 cites W1575244154 @default.
- W3122372191 cites W1607392272 @default.
- W3122372191 cites W1738827650 @default.
- W3122372191 cites W1777239053 @default.
- W3122372191 cites W1960974220 @default.
- W3122372191 cites W1970647289 @default.
- W3122372191 cites W1998191601 @default.
- W3122372191 cites W1999762141 @default.
- W3122372191 cites W2061562262 @default.
- W3122372191 cites W2081570464 @default.
- W3122372191 cites W2098774185 @default.
- W3122372191 cites W2107815843 @default.
- W3122372191 cites W2113351146 @default.
- W3122372191 cites W2142925354 @default.
- W3122372191 cites W2146626080 @default.
- W3122372191 cites W2149254401 @default.
- W3122372191 cites W2158330236 @default.
- W3122372191 cites W2175915769 @default.
- W3122372191 cites W2264897026 @default.
- W3122372191 cites W2330024298 @default.
- W3122372191 cites W2489422010 @default.
- W3122372191 cites W2561776174 @default.
- W3122372191 cites W2567015638 @default.
- W3122372191 cites W2605102758 @default.
- W3122372191 cites W2736601468 @default.
- W3122372191 cites W2749807327 @default.
- W3122372191 cites W2766447205 @default.
- W3122372191 cites W2783375473 @default.
- W3122372191 cites W2787542351 @default.
- W3122372191 cites W2902907165 @default.
- W3122372191 cites W2904455790 @default.
- W3122372191 cites W2913409451 @default.
- W3122372191 cites W2945843259 @default.
- W3122372191 cites W2962722276 @default.
- W3122372191 cites W2963162637 @default.
- W3122372191 cites W2963317585 @default.
- W3122372191 cites W2963407617 @default.
- W3122372191 cites W2963438456 @default.
- W3122372191 cites W2963523627 @default.
- W3122372191 cites W2963627051 @default.
- W3122372191 cites W2963646405 @default.
- W3122372191 cites W2963937357 @default.
- W3122372191 cites W2963978142 @default.
- W3122372191 cites W2964067469 @default.
- W3122372191 cites W2964106499 @default.
- W3122372191 cites W2964161785 @default.
- W3122372191 cites W2964345382 @default.
- W3122372191 cites W2970514967 @default.
- W3122372191 cites W2973251960 @default.
- W3122372191 cites W2982316857 @default.
- W3122372191 cites W2995520132 @default.
- W3122372191 cites W2995556868 @default.
- W3122372191 cites W2996784529 @default.
- W3122372191 cites W3005199613 @default.
- W3122372191 cites W3009741087 @default.
- W3122372191 cites W3027456239 @default.
- W3122372191 cites W3032377877 @default.
- W3122372191 cites W3037497841 @default.
- W3122372191 cites W3089482831 @default.
- W3122372191 cites W3112730052 @default.
- W3122372191 cites W3170651526 @default.
- W3122372191 cites W2770298516 @default.
- W3122372191 hasPublicationYear "2021" @default.
- W3122372191 type Work @default.
- W3122372191 sameAs 3122372191 @default.
- W3122372191 citedByCount "1" @default.
- W3122372191 countsByYear W31223721912021 @default.
- W3122372191 crossrefType "proceedings-article" @default.
- W3122372191 hasAuthorship W3122372191A5004199796 @default.
- W3122372191 hasAuthorship W3122372191A5008951080 @default.
- W3122372191 hasAuthorship W3122372191A5033061754 @default.
- W3122372191 hasAuthorship W3122372191A5049093671 @default.
- W3122372191 hasAuthorship W3122372191A5054149338 @default.
- W3122372191 hasAuthorship W3122372191A5061127138 @default.
- W3122372191 hasAuthorship W3122372191A5061701229 @default.
- W3122372191 hasAuthorship W3122372191A5066028215 @default.
- W3122372191 hasAuthorship W3122372191A5079594267 @default.
- W3122372191 hasConcept C111472728 @default.
- W3122372191 hasConcept C119857082 @default.
- W3122372191 hasConcept C126255220 @default.
- W3122372191 hasConcept C138885662 @default.
- W3122372191 hasConcept C144237770 @default.
- W3122372191 hasConcept C154945302 @default.