Matches in SemOpenAlex for { <https://semopenalex.org/work/W2996549507> ?p ?o ?g. }
- W2996549507 abstract "Intrinsically motivated reinforcement learning aims to address the exploration challenge for sparse-reward tasks. However, the study of exploration methods in transition-dependent multi-agent settings is largely absent from the literature. We aim to take a step towards solving this problem. We present two exploration methods: exploration via information-theoretic influence (EITI) and exploration via decision-theoretic influence (EDTI), by exploiting the role of interaction in coordinated behaviors of agents. EITI uses mutual information to capture influence transition dynamics. EDTI uses a novel intrinsic reward, called Value of Interaction (VoI), to characterize and quantify the influence of one agent's behavior on expected returns of other agents. By optimizing EITI or EDTI objective as a regularizer, agents are encouraged to coordinate their exploration and learn policies to optimize team performance. We show how to optimize these regularizers so that they can be easily integrated with policy gradient reinforcement learning. The resulting update rule draws a connection between coordinated exploration and intrinsic reward distribution. Finally, we empirically demonstrate the significant strength of our method in a variety of multi-agent scenarios." @default.
- W2996549507 created "2019-12-26" @default.
- W2996549507 creator A5008951080 @default.
- W2996549507 creator A5010176958 @default.
- W2996549507 creator A5048974354 @default.
- W2996549507 creator A5071784141 @default.
- W2996549507 date "2019-10-12" @default.
- W2996549507 modified "2023-10-17" @default.
- W2996549507 title "Influence-Based Multi-Agent Exploration" @default.
- W2996549507 cites W103885025 @default.
- W2996549507 cites W1582436621 @default.
- W2996549507 cites W172298727 @default.
- W2996549507 cites W1850488217 @default.
- W2996549507 cites W2000514530 @default.
- W2996549507 cites W2012812921 @default.
- W2996549507 cites W2020920737 @default.
- W2996549507 cites W2101524054 @default.
- W2996549507 cites W2111764152 @default.
- W2996549507 cites W2128643385 @default.
- W2996549507 cites W2136634080 @default.
- W2996549507 cites W2139612737 @default.
- W2996549507 cites W2145339207 @default.
- W2996549507 cites W2155027007 @default.
- W2996549507 cites W2188721763 @default.
- W2996549507 cites W2218660691 @default.
- W2996549507 cites W2404646363 @default.
- W2996549507 cites W2518564545 @default.
- W2996549507 cites W2561776174 @default.
- W2996549507 cites W2567015638 @default.
- W2996549507 cites W2606433045 @default.
- W2996549507 cites W2736601468 @default.
- W2996549507 cites W2751516180 @default.
- W2996549507 cites W2751973545 @default.
- W2996549507 cites W2783375473 @default.
- W2996549507 cites W2788904251 @default.
- W2996549507 cites W2789824229 @default.
- W2996549507 cites W2795908317 @default.
- W2996549507 cites W2804327232 @default.
- W2996549507 cites W2807741983 @default.
- W2996549507 cites W2886000153 @default.
- W2996549507 cites W2891661335 @default.
- W2996549507 cites W2895921264 @default.
- W2996549507 cites W2899205164 @default.
- W2996549507 cites W2911743772 @default.
- W2996549507 cites W2913343212 @default.
- W2996549507 cites W2946606218 @default.
- W2996549507 cites W2947526499 @default.
- W2996549507 cites W2962730405 @default.
- W2996549507 cites W2962910611 @default.
- W2996549507 cites W2962938168 @default.
- W2996549507 cites W2962966033 @default.
- W2996549507 cites W2963000099 @default.
- W2996549507 cites W2963049774 @default.
- W2996549507 cites W2963160877 @default.
- W2996549507 cites W2963162637 @default.
- W2996549507 cites W2963276097 @default.
- W2996549507 cites W2963407617 @default.
- W2996549507 cites W2963421140 @default.
- W2996549507 cites W2963477884 @default.
- W2996549507 cites W2963639957 @default.
- W2996549507 cites W2964021598 @default.
- W2996549507 cites W2964083594 @default.
- W2996549507 cites W2964118020 @default.
- W2996549507 cites W2964121744 @default.
- W2996549507 cites W3093287223 @default.
- W2996549507 cites W3102768201 @default.
- W2996549507 doi "https://doi.org/10.48550/arxiv.1910.05512" @default.
- W2996549507 hasPublicationYear "2019" @default.
- W2996549507 type Work @default.
- W2996549507 sameAs 2996549507 @default.
- W2996549507 citedByCount "19" @default.
- W2996549507 countsByYear W29965495072020 @default.
- W2996549507 countsByYear W29965495072021 @default.
- W2996549507 crossrefType "posted-content" @default.
- W2996549507 hasAuthorship W2996549507A5008951080 @default.
- W2996549507 hasAuthorship W2996549507A5010176958 @default.
- W2996549507 hasAuthorship W2996549507A5048974354 @default.
- W2996549507 hasAuthorship W2996549507A5071784141 @default.
- W2996549507 hasBestOaLocation W29965495071 @default.
- W2996549507 hasConcept C119857082 @default.
- W2996549507 hasConcept C136197465 @default.
- W2996549507 hasConcept C154945302 @default.
- W2996549507 hasConcept C41008148 @default.
- W2996549507 hasConcept C97541855 @default.
- W2996549507 hasConceptScore W2996549507C119857082 @default.
- W2996549507 hasConceptScore W2996549507C136197465 @default.
- W2996549507 hasConceptScore W2996549507C154945302 @default.
- W2996549507 hasConceptScore W2996549507C41008148 @default.
- W2996549507 hasConceptScore W2996549507C97541855 @default.
- W2996549507 hasLocation W29965495071 @default.
- W2996549507 hasOpenAccess W2996549507 @default.
- W2996549507 hasPrimaryLocation W29965495071 @default.
- W2996549507 hasRelatedWork W2923653485 @default.
- W2996549507 hasRelatedWork W2957776456 @default.
- W2996549507 hasRelatedWork W3022038857 @default.
- W2996549507 hasRelatedWork W3088315509 @default.
- W2996549507 hasRelatedWork W3209094908 @default.
- W2996549507 hasRelatedWork W4210912933 @default.
- W2996549507 hasRelatedWork W4255994452 @default.
- W2996549507 hasRelatedWork W4319083788 @default.