Matches in SemOpenAlex for { <https://semopenalex.org/work/W3162183426> ?p ?o ?g. }
Showing items 1 to 65 of 65 with 100 items per page.
- W3162183426 abstract "A ubiquitous challenge for humans and learning agents is the ability to measure and forecast how competent they can become in a specific domain. The problems created by the inability to forecast and measure achievable competence are manifold. We don't know who will be good at which jobs - we don't know which problems a machine learning architecture can solve, we don't know whether other agents might be better, etc. Performance measures exist in all sorts of domains: e.g. video games, athletics, academics, etc. that traditionally capture task-specific performance heuristics such as points accrued, time remaining, accuracy, etc. Assessment is problematic because we don't have the ideal battery of tests, and the time-cost of extensive testing on all plausible tests is prohibitive. In reinforcement learning we desire that agents are able to learn reasonable behavior on novel tasks in new environments, but it is unclear on how to best design tasks and scheduling to provide the agent with a general understanding of its capabilities. These quantities based on task heuristics may not even be appropriate for measuring an agent’s general ability if they do not accurately reflect the agent’s true goals, especially in environments with multiple available tasks. What does it really mean to be competent in an environment? Rather than domain-specific heuristics, a more fundamental notion of skill is the agent’s ability to understand, predict and control their environment. While we normally impute a player’s capabilities indirectly from their score or rank, in this dissertation we show that it is possible to create a direct measure of a player’s capabilities via the empowerment measure. We then use this measure to show the value of using a better universal objective in the context of reinforcement learning. Navigation is a task where it is advantageous to understand, predict, and control the environment. 
Methods for localization using information-theoretic quantities focus on where to sample signals to reduce uncertainty, rather than use the agent’s understanding of its own capabilities to understand where it might fail. We demonstrate a proof of concept using empowerment to predict navigation failure, and also how it can be used to produce a safer route. This measure is poised to have broad impact across multiple domains, including RL, education, and entertainment." @default.
- W3162183426 created "2021-05-24" @default.
- W3162183426 creator A5054705809 @default.
- W3162183426 date "2020-12-01" @default.
- W3162183426 modified "2023-09-27" @default.
- W3162183426 title "Empowerment as a Task-Agnostic Measure of Domain Competence" @default.
- W3162183426 hasPublicationYear "2020" @default.
- W3162183426 type Work @default.
- W3162183426 sameAs 3162183426 @default.
- W3162183426 citedByCount "0" @default.
- W3162183426 crossrefType "dissertation" @default.
- W3162183426 hasAuthorship W3162183426A5054705809 @default.
- W3162183426 hasConcept C100521375 @default.
- W3162183426 hasConcept C107457646 @default.
- W3162183426 hasConcept C111919701 @default.
- W3162183426 hasConcept C119857082 @default.
- W3162183426 hasConcept C127413603 @default.
- W3162183426 hasConcept C127705205 @default.
- W3162183426 hasConcept C154945302 @default.
- W3162183426 hasConcept C15744967 @default.
- W3162183426 hasConcept C201995342 @default.
- W3162183426 hasConcept C2780451532 @default.
- W3162183426 hasConcept C41008148 @default.
- W3162183426 hasConcept C77805123 @default.
- W3162183426 hasConcept C97541855 @default.
- W3162183426 hasConceptScore W3162183426C100521375 @default.
- W3162183426 hasConceptScore W3162183426C107457646 @default.
- W3162183426 hasConceptScore W3162183426C111919701 @default.
- W3162183426 hasConceptScore W3162183426C119857082 @default.
- W3162183426 hasConceptScore W3162183426C127413603 @default.
- W3162183426 hasConceptScore W3162183426C127705205 @default.
- W3162183426 hasConceptScore W3162183426C154945302 @default.
- W3162183426 hasConceptScore W3162183426C15744967 @default.
- W3162183426 hasConceptScore W3162183426C201995342 @default.
- W3162183426 hasConceptScore W3162183426C2780451532 @default.
- W3162183426 hasConceptScore W3162183426C41008148 @default.
- W3162183426 hasConceptScore W3162183426C77805123 @default.
- W3162183426 hasConceptScore W3162183426C97541855 @default.
- W3162183426 hasLocation W31621834261 @default.
- W3162183426 hasOpenAccess W3162183426 @default.
- W3162183426 hasPrimaryLocation W31621834261 @default.
- W3162183426 hasRelatedWork W158504997 @default.
- W3162183426 hasRelatedWork W1618761262 @default.
- W3162183426 hasRelatedWork W1762402281 @default.
- W3162183426 hasRelatedWork W1971690354 @default.
- W3162183426 hasRelatedWork W2166485688 @default.
- W3162183426 hasRelatedWork W2296212956 @default.
- W3162183426 hasRelatedWork W265012337 @default.
- W3162183426 hasRelatedWork W2749802781 @default.
- W3162183426 hasRelatedWork W2800631177 @default.
- W3162183426 hasRelatedWork W2888831604 @default.
- W3162183426 hasRelatedWork W2922221969 @default.
- W3162183426 hasRelatedWork W2929352350 @default.
- W3162183426 hasRelatedWork W2956596508 @default.
- W3162183426 hasRelatedWork W2958095304 @default.
- W3162183426 hasRelatedWork W3104102736 @default.
- W3162183426 hasRelatedWork W3154061550 @default.
- W3162183426 hasRelatedWork W3202686804 @default.
- W3162183426 hasRelatedWork W3210289157 @default.
- W3162183426 hasRelatedWork W2188173635 @default.
- W3162183426 hasRelatedWork W2915940255 @default.
- W3162183426 isParatext "false" @default.
- W3162183426 isRetracted "false" @default.
- W3162183426 magId "3162183426" @default.
- W3162183426 workType "dissertation" @default.