Matches in SemOpenAlex for { <https://semopenalex.org/work/W2893188946> ?p ?o ?g. }
- W2893188946 abstract "Multi-agent reinforcement learning has received significant interest in recent years, notably due to advancements in deep reinforcement learning, which have allowed for the development of new architectures and learning algorithms. Using social dilemmas as the training ground, we present a novel learning architecture, Learning through Probing (LTP), where agents utilize a probing mechanism to incorporate how their opponent's behavior changes when an agent takes an action. We use distinct training phases and adjust rewards according to the overall outcome of the experiences, accounting for changes to the opponent's behavior. We introduce a parameter eta to determine the significance of these future changes to opponent behavior. When applied to the Iterated Prisoner's Dilemma (IPD), LTP agents demonstrate that they can learn to cooperate with each other, achieving higher average cumulative rewards than other reinforcement learning methods while also maintaining good performance in playing against static agents that are present in Axelrod tournaments. We compare this method with traditional reinforcement learning algorithms and agent-tracking techniques to highlight key differences and potential applications. We also draw attention to the differences between solving games and societal-like interactions and analyze the training of Q-learning agents in makeshift societies. This is to emphasize how cooperation may emerge in societies and demonstrate this using environments where interactions with opponents are determined through a random encounter format of the IPD." @default.
- W2893188946 created "2018-10-05" @default.
- W2893188946 creator A5025751811 @default.
- W2893188946 creator A5078886343 @default.
- W2893188946 date "2018-09-26" @default.
- W2893188946 modified "2023-09-27" @default.
- W2893188946 title "Learning through Probing: a decentralized reinforcement learning architecture for social dilemmas." @default.
- W2893188946 cites W1540725368 @default.
- W2893188946 cites W1564229172 @default.
- W2893188946 cites W1641379095 @default.
- W2893188946 cites W1972176362 @default.
- W2893188946 cites W1974491836 @default.
- W2893188946 cites W1982262386 @default.
- W2893188946 cites W2039338430 @default.
- W2893188946 cites W2053616263 @default.
- W2893188946 cites W2062663664 @default.
- W2893188946 cites W206679605 @default.
- W2893188946 cites W2085366587 @default.
- W2893188946 cites W2097498347 @default.
- W2893188946 cites W2103561211 @default.
- W2893188946 cites W2104602264 @default.
- W2893188946 cites W2107112577 @default.
- W2893188946 cites W2108892923 @default.
- W2893188946 cites W2115749907 @default.
- W2893188946 cites W2120846115 @default.
- W2893188946 cites W2121863487 @default.
- W2893188946 cites W2122253967 @default.
- W2893188946 cites W2132979098 @default.
- W2893188946 cites W2157592153 @default.
- W2893188946 cites W2160311507 @default.
- W2893188946 cites W2167062553 @default.
- W2893188946 cites W2312609093 @default.
- W2893188946 cites W2535652371 @default.
- W2893188946 cites W2594794854 @default.
- W2893188946 cites W2623431351 @default.
- W2893188946 cites W2730328371 @default.
- W2893188946 cites W2810102121 @default.
- W2893188946 cites W2949201811 @default.
- W2893188946 cites W2963477884 @default.
- W2893188946 cites W2963627051 @default.
- W2893188946 cites W2963689090 @default.
- W2893188946 hasPublicationYear "2018" @default.
- W2893188946 type Work @default.
- W2893188946 sameAs 2893188946 @default.
- W2893188946 citedByCount "2" @default.
- W2893188946 countsByYear W28931889462019 @default.
- W2893188946 countsByYear W28931889462020 @default.
- W2893188946 crossrefType "posted-content" @default.
- W2893188946 hasAuthorship W2893188946A5025751811 @default.
- W2893188946 hasAuthorship W2893188946A5078886343 @default.
- W2893188946 hasConcept C111472728 @default.
- W2893188946 hasConcept C113494165 @default.
- W2893188946 hasConcept C121332964 @default.
- W2893188946 hasConcept C138885662 @default.
- W2893188946 hasConcept C148220186 @default.
- W2893188946 hasConcept C154945302 @default.
- W2893188946 hasConcept C15744967 @default.
- W2893188946 hasConcept C162324750 @default.
- W2893188946 hasConcept C175444787 @default.
- W2893188946 hasConcept C187206662 @default.
- W2893188946 hasConcept C2778496695 @default.
- W2893188946 hasConcept C2780791683 @default.
- W2893188946 hasConcept C38652104 @default.
- W2893188946 hasConcept C41008148 @default.
- W2893188946 hasConcept C41065033 @default.
- W2893188946 hasConcept C56739046 @default.
- W2893188946 hasConcept C62520636 @default.
- W2893188946 hasConcept C67203356 @default.
- W2893188946 hasConcept C77805123 @default.
- W2893188946 hasConcept C79416737 @default.
- W2893188946 hasConcept C97541855 @default.
- W2893188946 hasConceptScore W2893188946C111472728 @default.
- W2893188946 hasConceptScore W2893188946C113494165 @default.
- W2893188946 hasConceptScore W2893188946C121332964 @default.
- W2893188946 hasConceptScore W2893188946C138885662 @default.
- W2893188946 hasConceptScore W2893188946C148220186 @default.
- W2893188946 hasConceptScore W2893188946C154945302 @default.
- W2893188946 hasConceptScore W2893188946C15744967 @default.
- W2893188946 hasConceptScore W2893188946C162324750 @default.
- W2893188946 hasConceptScore W2893188946C175444787 @default.
- W2893188946 hasConceptScore W2893188946C187206662 @default.
- W2893188946 hasConceptScore W2893188946C2778496695 @default.
- W2893188946 hasConceptScore W2893188946C2780791683 @default.
- W2893188946 hasConceptScore W2893188946C38652104 @default.
- W2893188946 hasConceptScore W2893188946C41008148 @default.
- W2893188946 hasConceptScore W2893188946C41065033 @default.
- W2893188946 hasConceptScore W2893188946C56739046 @default.
- W2893188946 hasConceptScore W2893188946C62520636 @default.
- W2893188946 hasConceptScore W2893188946C67203356 @default.
- W2893188946 hasConceptScore W2893188946C77805123 @default.
- W2893188946 hasConceptScore W2893188946C79416737 @default.
- W2893188946 hasConceptScore W2893188946C97541855 @default.
- W2893188946 hasLocation W28931889461 @default.
- W2893188946 hasOpenAccess W2893188946 @default.
- W2893188946 hasPrimaryLocation W28931889461 @default.
- W2893188946 hasRelatedWork W1027665362 @default.
- W2893188946 hasRelatedWork W1428984440 @default.
- W2893188946 hasRelatedWork W1584307643 @default.
- W2893188946 hasRelatedWork W2025516658 @default.
- W2893188946 hasRelatedWork W2186731336 @default.
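
Each match above appears to follow the shape `- subject predicate object @graph.`, mirroring the `?p ?o ?g` pattern in the query. As a minimal sketch, assuming that line shape holds for every entry, the listing could be parsed into tuples as follows. The names `Quad`, `parse_line`, and `cited_works` are hypothetical helpers for illustration and are not part of SemOpenAlex.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Quad(NamedTuple):
    """One match from the listing (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed line shape: "- <subject> <predicate> <object> @<graph>."
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\w+)\.\s*$', re.DOTALL)


def parse_line(line: str) -> Optional[Quad]:
    """Parse one listing line; return None if it does not fit the assumed shape."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Strip the surrounding double quotes from literal values such as dates.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return Quad(subject, predicate, obj, graph)


def cited_works(lines: Iterable[str]) -> List[str]:
    """Collect the objects of every 'cites' line, in listing order."""
    quads = (parse_line(line) for line in lines)
    return [q.obj for q in quads if q is not None and q.predicate == "cites"]
```

For example, `parse_line('- W2893188946 created "2018-10-05" @default.')` yields a `Quad` whose `obj` is `2018-10-05`, while the header line beginning with `Matches in SemOpenAlex` does not fit the pattern and is skipped.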