Matches in SemOpenAlex for { <https://semopenalex.org/work/W2912563296> ?p ?o ?g. }
- W2912563296 abstract "When multiple agents learn in a decentralized manner, the environment appears non-stationary from the perspective of an individual agent due to the exploration and learning of the other agents. Recently proposed deep multi-agent reinforcement learning methods have tried to mitigate this non-stationarity by attempting to determine which samples are from other agent exploration or suboptimality and take them less into account during learning. Based on the same philosophy, this paper introduces a decentralized quantile estimator, which aims to improve performance by distinguishing non-stationary samples based on the likelihood of returns. In particular, each agent considers the likelihood that other agent exploration and policy changes are occurring, essentially utilizing the agent's own estimations to weigh the learning rate that should be applied towards the given samples. We introduce a formal method of calculating differences of our return distribution representations and methods for utilizing them to guide updates. We also explore the effect of risk-seeking strategies for adjusting learning over time and propose adaptive risk distortion functions that guide risk sensitivity. Our experiments, on traditional benchmarks and new domains, show our methods are more stable, more sample efficient, and more likely to converge to a joint optimal policy than previous methods." @default.
- W2912563296 created "2019-02-21" @default.
- W2912563296 creator A5033129735 @default.
- W2912563296 creator A5079710799 @default.
- W2912563296 date "2018-12-15" @default.
- W2912563296 modified "2023-09-27" @default.
- W2912563296 title "Likelihood Quantile Networks for Coordinating Multi-Agent Reinforcement Learning" @default.
- W2912563296 cites W137071854 @default.
- W2912563296 cites W1522301498 @default.
- W2912563296 cites W1547105496 @default.
- W2912563296 cites W1560074431 @default.
- W2912563296 cites W1575592356 @default.
- W2912563296 cites W1595483645 @default.
- W2912563296 cites W2053984877 @default.
- W2912563296 cites W2056819186 @default.
- W2912563296 cites W2064675550 @default.
- W2912563296 cites W2096145798 @default.
- W2912563296 cites W2102288976 @default.
- W2912563296 cites W2104602264 @default.
- W2912563296 cites W2108449787 @default.
- W2912563296 cites W2108892923 @default.
- W2912563296 cites W2120327309 @default.
- W2912563296 cites W2145339207 @default.
- W2912563296 cites W2159223630 @default.
- W2912563296 cites W2168359464 @default.
- W2912563296 cites W2255045308 @default.
- W2912563296 cites W2395575420 @default.
- W2912563296 cites W2466211196 @default.
- W2912563296 cites W2592798481 @default.
- W2912563296 cites W2604873668 @default.
- W2912563296 cites W2623431351 @default.
- W2912563296 cites W2747213132 @default.
- W2912563296 cites W2798705390 @default.
- W2912563296 cites W2803308811 @default.
- W2912563296 cites W2891925283 @default.
- W2912563296 cites W2962938168 @default.
- W2912563296 cites W2962938178 @default.
- W2912563296 cites W2963390684 @default.
- W2912563296 cites W2963485523 @default.
- W2912563296 cites W3093287223 @default.
- W2912563296 cites W85998123 @default.
- W2912563296 hasPublicationYear "2018" @default.
- W2912563296 type Work @default.
- W2912563296 sameAs 2912563296 @default.
- W2912563296 citedByCount "2" @default.
- W2912563296 countsByYear W29125632962018 @default.
- W2912563296 countsByYear W29125632962020 @default.
- W2912563296 crossrefType "posted-content" @default.
- W2912563296 hasAuthorship W2912563296A5033129735 @default.
- W2912563296 hasAuthorship W2912563296A5079710799 @default.
- W2912563296 hasConcept C105795698 @default.
- W2912563296 hasConcept C118671147 @default.
- W2912563296 hasConcept C119857082 @default.
- W2912563296 hasConcept C126780896 @default.
- W2912563296 hasConcept C12713177 @default.
- W2912563296 hasConcept C149782125 @default.
- W2912563296 hasConcept C154945302 @default.
- W2912563296 hasConcept C185429906 @default.
- W2912563296 hasConcept C185592680 @default.
- W2912563296 hasConcept C194257627 @default.
- W2912563296 hasConcept C198531522 @default.
- W2912563296 hasConcept C2776257435 @default.
- W2912563296 hasConcept C31258907 @default.
- W2912563296 hasConcept C33923547 @default.
- W2912563296 hasConcept C41008148 @default.
- W2912563296 hasConcept C43617362 @default.
- W2912563296 hasConcept C97541855 @default.
- W2912563296 hasConceptScore W2912563296C105795698 @default.
- W2912563296 hasConceptScore W2912563296C118671147 @default.
- W2912563296 hasConceptScore W2912563296C119857082 @default.
- W2912563296 hasConceptScore W2912563296C126780896 @default.
- W2912563296 hasConceptScore W2912563296C12713177 @default.
- W2912563296 hasConceptScore W2912563296C149782125 @default.
- W2912563296 hasConceptScore W2912563296C154945302 @default.
- W2912563296 hasConceptScore W2912563296C185429906 @default.
- W2912563296 hasConceptScore W2912563296C185592680 @default.
- W2912563296 hasConceptScore W2912563296C194257627 @default.
- W2912563296 hasConceptScore W2912563296C198531522 @default.
- W2912563296 hasConceptScore W2912563296C2776257435 @default.
- W2912563296 hasConceptScore W2912563296C31258907 @default.
- W2912563296 hasConceptScore W2912563296C33923547 @default.
- W2912563296 hasConceptScore W2912563296C41008148 @default.
- W2912563296 hasConceptScore W2912563296C43617362 @default.
- W2912563296 hasConceptScore W2912563296C97541855 @default.
- W2912563296 hasLocation W29125632961 @default.
- W2912563296 hasOpenAccess W2912563296 @default.
- W2912563296 hasPrimaryLocation W29125632961 @default.
- W2912563296 hasRelatedWork W2033976720 @default.
- W2912563296 hasRelatedWork W2099945315 @default.
- W2912563296 hasRelatedWork W2708325519 @default.
- W2912563296 hasRelatedWork W2888369163 @default.
- W2912563296 hasRelatedWork W2913326990 @default.
- W2912563296 hasRelatedWork W2949846483 @default.
- W2912563296 hasRelatedWork W2963452950 @default.
- W2912563296 hasRelatedWork W2990239702 @default.
- W2912563296 hasRelatedWork W3014593416 @default.
- W2912563296 hasRelatedWork W3015291918 @default.
- W2912563296 hasRelatedWork W3037940279 @default.
- W2912563296 hasRelatedWork W3039208705 @default.
- W2912563296 hasRelatedWork W3119430014 @default.