Matches in SemOpenAlex for { <https://semopenalex.org/work/W159122836> ?p ?o ?g. }
- W159122836 abstract "An agent that interacts with other agents in multi-agent systems can benefit significantly from adapting to the others. When performing active learning, every agent’s action affects the interaction process in two ways: The effect on the expected reward according to the current knowledge held by the agent, and the effect on the acquired knowledge, and hence, on future rewards expected to be received. The agent must therefore make a tradeoff between the wish to exploit its current knowledge, and the wish to explore other alternatives, to improve its knowledge for better decisions in the future. The goal of this work is to develop exploration strategies for a model-based learning agent to handle its encounters with other agents in a common environment. We first show how to incorporate exploration methods usually used in reinforcement learning into model-based learning. We then demonstrate the risk involved in exploration – an exploratory action taken by the agent can yield a better model of the other agent but also carries the risk of putting the agent into a much worse position. We present the lookahead-based exploration strategy that evaluates actions according to their expected utility, their expected contribution to the acquired knowledge, and the risk they carry. Instead of holding one model, the agent maintains a mixed opponent model, a belief distribution over a set of models that reflects its uncertainty about the opponent’s strategy. Every action is evaluated according to its long run contribution to the expected utility and to the knowledge regarding the opponent’s strategy. Risky actions are more likely to be detected by considering their expected outcome according to the alternative models of the opponent’s behavior. We present an efficient algorithm that returns an almost optimal exploration plan against the mixed model and provide a proof of its correctness and an analysis of its complexity. We report experimental results in the Iterated Prisoner’s Dilemma domain, comparing the capabilities of the different exploration strategies. The experiments demonstrate the superiority of lookahead-based exploration over other exploration methods." @default.
- W159122836 created "2016-06-24" @default.
- W159122836 creator A5038537105 @default.
- W159122836 creator A5088024319 @default.
- W159122836 date "2004-01-01" @default.
- W159122836 modified "2023-09-27" @default.
- W159122836 title "Exploration Strategies for Model-based Learning in Multi-agent Systems" @default.
- W159122836 cites W137281501 @default.
- W159122836 cites W1491843047 @default.
- W159122836 cites W1542941925 @default.
- W159122836 cites W1931792391 @default.
- W159122836 cites W1978545668 @default.
- W159122836 cites W1978865199 @default.
- W159122836 cites W1979592899 @default.
- W159122836 cites W1989445634 @default.
- W159122836 cites W1989538212 @default.
- W159122836 cites W2002089154 @default.
- W159122836 cites W2005034333 @default.
- W159122836 cites W2006069523 @default.
- W159122836 cites W2006198267 @default.
- W159122836 cites W2006650111 @default.
- W159122836 cites W2015667537 @default.
- W159122836 cites W2032100464 @default.
- W159122836 cites W2045291344 @default.
- W159122836 cites W2048226872 @default.
- W159122836 cites W2053616263 @default.
- W159122836 cites W2058879737 @default.
- W159122836 cites W2062663664 @default.
- W159122836 cites W2077400689 @default.
- W159122836 cites W2107726111 @default.
- W159122836 cites W2109223704 @default.
- W159122836 cites W2129760941 @default.
- W159122836 cites W2131724520 @default.
- W159122836 cites W2137041106 @default.
- W159122836 cites W2142839172 @default.
- W159122836 cites W2143100276 @default.
- W159122836 cites W2144578442 @default.
- W159122836 cites W2150339816 @default.
- W159122836 cites W2165619126 @default.
- W159122836 cites W2167448192 @default.
- W159122836 cites W2317700292 @default.
- W159122836 cites W3124042944 @default.
- W159122836 cites W3125686698 @default.
- W159122836 cites W421997344 @default.
- W159122836 cites W45170341 @default.
- W159122836 cites W65193931 @default.
- W159122836 cites W76312321 @default.
- W159122836 hasPublicationYear "2004" @default.
- W159122836 type Work @default.
- W159122836 sameAs 159122836 @default.
- W159122836 citedByCount "7" @default.
- W159122836 countsByYear W1591228362012 @default.
- W159122836 countsByYear W1591228362018 @default.
- W159122836 countsByYear W1591228362019 @default.
- W159122836 countsByYear W1591228362021 @default.
- W159122836 crossrefType "journal-article" @default.
- W159122836 hasAuthorship W159122836A5038537105 @default.
- W159122836 hasAuthorship W159122836A5088024319 @default.
- W159122836 hasConcept C111919701 @default.
- W159122836 hasConcept C112930515 @default.
- W159122836 hasConcept C121332964 @default.
- W159122836 hasConcept C144133560 @default.
- W159122836 hasConcept C154945302 @default.
- W159122836 hasConcept C165696696 @default.
- W159122836 hasConcept C177264268 @default.
- W159122836 hasConcept C199360897 @default.
- W159122836 hasConcept C2780791683 @default.
- W159122836 hasConcept C38652104 @default.
- W159122836 hasConcept C41008148 @default.
- W159122836 hasConcept C41065033 @default.
- W159122836 hasConcept C62520636 @default.
- W159122836 hasConcept C97541855 @default.
- W159122836 hasConcept C98045186 @default.
- W159122836 hasConceptScore W159122836C111919701 @default.
- W159122836 hasConceptScore W159122836C112930515 @default.
- W159122836 hasConceptScore W159122836C121332964 @default.
- W159122836 hasConceptScore W159122836C144133560 @default.
- W159122836 hasConceptScore W159122836C154945302 @default.
- W159122836 hasConceptScore W159122836C165696696 @default.
- W159122836 hasConceptScore W159122836C177264268 @default.
- W159122836 hasConceptScore W159122836C199360897 @default.
- W159122836 hasConceptScore W159122836C2780791683 @default.
- W159122836 hasConceptScore W159122836C38652104 @default.
- W159122836 hasConceptScore W159122836C41008148 @default.
- W159122836 hasConceptScore W159122836C41065033 @default.
- W159122836 hasConceptScore W159122836C62520636 @default.
- W159122836 hasConceptScore W159122836C97541855 @default.
- W159122836 hasConceptScore W159122836C98045186 @default.
- W159122836 hasLocation W1591228361 @default.
- W159122836 hasOpenAccess W159122836 @default.
- W159122836 hasPrimaryLocation W1591228361 @default.
- W159122836 hasRelatedWork W137281501 @default.
- W159122836 hasRelatedWork W1496262375 @default.
- W159122836 hasRelatedWork W1583408742 @default.
- W159122836 hasRelatedWork W1596348109 @default.
- W159122836 hasRelatedWork W1827311204 @default.
- W159122836 hasRelatedWork W1969264857 @default.
- W159122836 hasRelatedWork W2065973244 @default.
- W159122836 hasRelatedWork W2077339704 @default.
- W159122836 hasRelatedWork W2169792277 @default.
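
The abstract above describes a "mixed opponent model": a belief distribution over a set of candidate opponent models. The following Python sketch illustrates that idea in the Iterated Prisoner's Dilemma. It is not the authors' lookahead algorithm. The three candidate models, the payoff values (the conventional 3/0/5/1 matrix), the `noise` parameter, and all function names are assumptions introduced here for illustration. It shows only a Bayesian belief update after each observed opponent move and a one-step expected payoff under the current belief.

```python
from typing import Callable, Dict, List, Tuple

# Row-player payoffs for (my action, opponent action).
# Assumed values: the conventional Prisoner's Dilemma matrix, not taken from the paper.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

History = List[Tuple[str, str]]      # (my action, opponent action) per round
Model = Callable[[History], float]   # probability that the opponent cooperates next


def always_cooperate(history: History) -> float:
    return 1.0


def always_defect(history: History) -> float:
    return 0.0


def tit_for_tat(history: History) -> float:
    # Cooperates first, then copies my previous action.
    return 1.0 if not history or history[-1][0] == "C" else 0.0


# Hypothetical candidate set; the paper's model class may differ.
MODELS: Dict[str, Model] = {
    "always_cooperate": always_cooperate,
    "always_defect": always_defect,
    "tit_for_tat": tit_for_tat,
}


def update_belief(belief: Dict[str, float], history: History,
                  observed: str, noise: float = 0.05) -> Dict[str, float]:
    """Bayes update of the belief after observing the opponent's action.

    `noise` (an assumption) keeps every model's likelihood above zero so a
    single surprising move does not eliminate a model outright.
    """
    posterior = {}
    for name, prior in belief.items():
        p_coop = MODELS[name](history)
        p_coop = (1 - noise) * p_coop + noise * (1 - p_coop)
        likelihood = p_coop if observed == "C" else 1 - p_coop
        posterior[name] = prior * likelihood
    total = sum(posterior.values())
    return {name: p / total for name, p in posterior.items()}


def expected_payoff(action: str, belief: Dict[str, float],
                    history: History) -> float:
    """One-step expected payoff of `action` against the mixed model."""
    p_coop = sum(belief[name] * MODELS[name](history) for name in belief)
    return p_coop * PAYOFF[(action, "C")] + (1 - p_coop) * PAYOFF[(action, "D")]


# Usage: start from a uniform belief, then observe two rounds.
uniform = {name: 1 / len(MODELS) for name in MODELS}
belief = update_belief(uniform, [], "C")              # opponent cooperates first
belief = update_belief(belief, [("D", "C")], "D")     # I defected; opponent retaliates
```

After these two observations the belief concentrates on `tit_for_tat`, the only candidate consistent with both moves. A lookahead strategy of the kind the abstract describes would go further and score each action by its long-run utility and by how much it is expected to sharpen this belief; that part is not sketched here.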