SemOpenAlex

Matches in SemOpenAlex for { <https://semopenalex.org/work/W4221144102> ?p ?o ?g. }

W4221144102 abstract "While advances in multi-agent learning have enabled the training of increasingly complex agents, most existing techniques produce a final policy that is not designed to adapt to a new partner's strategy. However, we would like our AI agents to adjust their strategy based on the strategies of those around them. In this work, we study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time, and we must interact with and adapt to new partners at test time. This setting is challenging because we must infer a new partner's strategy and adapt our policy to that strategy, all without knowledge of the environment reward or dynamics. We formalize this problem of conditional multi-agent imitation learning, and propose a novel approach to address the difficulties of scalability and data scarcity. Our key insight is that variations across partners in multi-agent games are often highly structured, and can be represented via a low-rank subspace. Leveraging tools from tensor decomposition, our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace. We experiment with a mix of collaborative tasks, including bandits, particle, and Hanabi environments. Additionally, we test our conditional policies against real human partners in a user study on the Overcooked game. Our model adapts better to new partners compared to baselines, and robustly handles diverse settings ranging from discrete/continuous actions and static/online evaluation with AI/human partners." @default.
W4221144102 created "2022-04-03" @default.
W4221144102 creator A5080725225 @default.
W4221144102 creator A5088648796 @default.
W4221144102 creator A5091179481 @default.
W4221144102 date "2022-01-04" @default.
W4221144102 modified "2023-10-09" @default.
W4221144102 title "Conditional Imitation Learning for Multi-Agent Games" @default.
W4221144102 doi "https://doi.org/10.48550/arxiv.2201.01448" @default.
W4221144102 hasPublicationYear "2022" @default.
W4221144102 type Work @default.
W4221144102 citedByCount "0" @default.
W4221144102 crossrefType "posted-content" @default.
W4221144102 hasAuthorship W4221144102A5080725225 @default.
W4221144102 hasAuthorship W4221144102A5088648796 @default.
W4221144102 hasAuthorship W4221144102A5091179481 @default.
W4221144102 hasBestOaLocation W42211441021 @default.
W4221144102 hasConcept C107457646 @default.
W4221144102 hasConcept C114614502 @default.
W4221144102 hasConcept C119857082 @default.
W4221144102 hasConcept C126388530 @default.
W4221144102 hasConcept C145071142 @default.
W4221144102 hasConcept C154945302 @default.
W4221144102 hasConcept C15744967 @default.
W4221144102 hasConcept C162324750 @default.
W4221144102 hasConcept C164226766 @default.
W4221144102 hasConcept C175444787 @default.
W4221144102 hasConcept C177142836 @default.
W4221144102 hasConcept C32834561 @default.
W4221144102 hasConcept C33923547 @default.
W4221144102 hasConcept C41008148 @default.
W4221144102 hasConcept C48044578 @default.
W4221144102 hasConcept C77088390 @default.
W4221144102 hasConcept C77805123 @default.
W4221144102 hasConceptScore W4221144102C107457646 @default.
W4221144102 hasConceptScore W4221144102C114614502 @default.
W4221144102 hasConceptScore W4221144102C119857082 @default.
W4221144102 hasConceptScore W4221144102C126388530 @default.
W4221144102 hasConceptScore W4221144102C145071142 @default.
W4221144102 hasConceptScore W4221144102C154945302 @default.
W4221144102 hasConceptScore W4221144102C15744967 @default.
W4221144102 hasConceptScore W4221144102C162324750 @default.
W4221144102 hasConceptScore W4221144102C164226766 @default.
W4221144102 hasConceptScore W4221144102C175444787 @default.
W4221144102 hasConceptScore W4221144102C177142836 @default.
W4221144102 hasConceptScore W4221144102C32834561 @default.
W4221144102 hasConceptScore W4221144102C33923547 @default.
W4221144102 hasConceptScore W4221144102C41008148 @default.
W4221144102 hasConceptScore W4221144102C48044578 @default.
W4221144102 hasConceptScore W4221144102C77088390 @default.
W4221144102 hasConceptScore W4221144102C77805123 @default.
W4221144102 hasLocation W42211441021 @default.
W4221144102 hasLocation W42211441022 @default.
W4221144102 hasOpenAccess W4221144102 @default.
W4221144102 hasPrimaryLocation W42211441021 @default.
W4221144102 hasRelatedWork W1531601525 @default.
W4221144102 hasRelatedWork W1980381208 @default.
W4221144102 hasRelatedWork W2665305151 @default.
W4221144102 hasRelatedWork W2758277628 @default.
W4221144102 hasRelatedWork W2935909890 @default.
W4221144102 hasRelatedWork W2948807893 @default.
W4221144102 hasRelatedWork W3173606202 @default.
W4221144102 hasRelatedWork W3183948672 @default.
W4221144102 hasRelatedWork W2778153218 @default.
W4221144102 hasRelatedWork W3110381201 @default.
W4221144102 isParatext "false" @default.
W4221144102 isRetracted "false" @default.
W4221144102 workType "article" @default.
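
The listing above is the result of matching the pattern { <https://semopenalex.org/work/W4221144102> ?p ?o ?g. } against the store. As a rough sketch only, the same lookup could be phrased as a SPARQL query and turned into a request URL as follows. The endpoint address and the helper names build_query and build_request_url are assumptions made for illustration, not something stated on this page. Because every row above sits in the default graph, the sketch drops the ?g column and uses a plain triple pattern.

```python
from urllib.parse import urlencode

# IRI of the work shown in the listing (taken from the page itself).
WORK_IRI = "https://semopenalex.org/work/W4221144102"

# ASSUMPTION: address of a public SPARQL endpoint; not confirmed by this page.
ENDPOINT = "https://semopenalex.org/sparql"


def build_query(work_iri: str) -> str:
    """Return a SPARQL SELECT query for all predicate/object pairs of one subject.

    Hypothetical helper: it matches the default graph only, which is where
    every row of the listing lives (the '@default' column).
    """
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{work_iri}> ?p ?o .\n"
        "}"
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request URL (the 'query' parameter is the
    one defined by the SPARQL 1.1 Protocol)."""
    return f"{endpoint}?{urlencode({'query': query})}"


query = build_query(WORK_IRI)
url = build_request_url(ENDPOINT, query)

if __name__ == "__main__":
    # Only prints the query and URL; no network request is made here.
    print(query)
    print(url)
```

The sketch stops at building the URL so that it runs without network access; sending the request and choosing a result format (for example JSON) is left to whichever HTTP client is at hand.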