Matches in SemOpenAlex for { <https://semopenalex.org/work/W115961410> ?p ?o ?g. }
Showing items 1 to 100 of 100 with 100 items per page.
- W115961410 endingPage "365" @default.
- W115961410 startingPage "361" @default.
- W115961410 abstract "Reinforcement learning has become a widely used methodology for creating intelligent agents in a wide range of applications. However, its performance deteriorates in tasks with sparse feedback or lengthy inter-reinforcement times. This paper presents an extension that makes use of an advisory entity to provide additional feedback to the agent. The agent incorporates both the rewards provided by the environment and the advice to attain faster learning speed, and policies that are tuned towards the preferences of the advisor while still achieving the underlying task objective. The advice is converted to “tuning” or user rewards that, together with the task rewards, define a composite reward function that more accurately defines the advisor’s perception of the task. At the same time, the formation of erroneous loops due to incorrect user rewards is avoided using formal bounds on the user reward component. This approach is illustrated using a robot navigation task." @default.
- W115961410 created "2016-06-24" @default.
- W115961410 creator A5038301640 @default.
- W115961410 creator A5047174917 @default.
- W115961410 date "2003-01-01" @default.
- W115961410 modified "2023-09-27" @default.
- W115961410 title "Learning from Reinforcement and Advice Using Composite Reward Functions" @default.
- W115961410 cites W1584607643 @default.
- W115961410 cites W1967608200 @default.
- W115961410 cites W2107726111 @default.
- W115961410 cites W2119409989 @default.
- W115961410 cites W2141559645 @default.
- W115961410 cites W2160032073 @default.
- W115961410 cites W2169562625 @default.
- W115961410 cites W2622020362 @default.
- W115961410 cites W285078413 @default.
- W115961410 cites W3011120880 @default.
- W115961410 hasPublicationYear "2003" @default.
- W115961410 type Work @default.
- W115961410 sameAs 115961410 @default.
- W115961410 citedByCount "5" @default.
- W115961410 countsByYear W1159614102015 @default.
- W115961410 crossrefType "proceedings-article" @default.
- W115961410 hasAuthorship W115961410A5038301640 @default.
- W115961410 hasAuthorship W115961410A5047174917 @default.
- W115961410 hasConcept C107457646 @default.
- W115961410 hasConcept C121332964 @default.
- W115961410 hasConcept C127413603 @default.
- W115961410 hasConcept C14036430 @default.
- W115961410 hasConcept C146978453 @default.
- W115961410 hasConcept C154945302 @default.
- W115961410 hasConcept C15744967 @default.
- W115961410 hasConcept C168167062 @default.
- W115961410 hasConcept C169760540 @default.
- W115961410 hasConcept C199360897 @default.
- W115961410 hasConcept C201995342 @default.
- W115961410 hasConcept C204323151 @default.
- W115961410 hasConcept C26760741 @default.
- W115961410 hasConcept C2779955035 @default.
- W115961410 hasConcept C2780451532 @default.
- W115961410 hasConcept C41008148 @default.
- W115961410 hasConcept C66938386 @default.
- W115961410 hasConcept C67203356 @default.
- W115961410 hasConcept C78458016 @default.
- W115961410 hasConcept C86803240 @default.
- W115961410 hasConcept C90509273 @default.
- W115961410 hasConcept C97355855 @default.
- W115961410 hasConcept C97541855 @default.
- W115961410 hasConceptScore W115961410C107457646 @default.
- W115961410 hasConceptScore W115961410C121332964 @default.
- W115961410 hasConceptScore W115961410C127413603 @default.
- W115961410 hasConceptScore W115961410C14036430 @default.
- W115961410 hasConceptScore W115961410C146978453 @default.
- W115961410 hasConceptScore W115961410C154945302 @default.
- W115961410 hasConceptScore W115961410C15744967 @default.
- W115961410 hasConceptScore W115961410C168167062 @default.
- W115961410 hasConceptScore W115961410C169760540 @default.
- W115961410 hasConceptScore W115961410C199360897 @default.
- W115961410 hasConceptScore W115961410C201995342 @default.
- W115961410 hasConceptScore W115961410C204323151 @default.
- W115961410 hasConceptScore W115961410C26760741 @default.
- W115961410 hasConceptScore W115961410C2779955035 @default.
- W115961410 hasConceptScore W115961410C2780451532 @default.
- W115961410 hasConceptScore W115961410C41008148 @default.
- W115961410 hasConceptScore W115961410C66938386 @default.
- W115961410 hasConceptScore W115961410C67203356 @default.
- W115961410 hasConceptScore W115961410C78458016 @default.
- W115961410 hasConceptScore W115961410C86803240 @default.
- W115961410 hasConceptScore W115961410C90509273 @default.
- W115961410 hasConceptScore W115961410C97355855 @default.
- W115961410 hasConceptScore W115961410C97541855 @default.
- W115961410 hasLocation W1159614101 @default.
- W115961410 hasOpenAccess W115961410 @default.
- W115961410 hasPrimaryLocation W1159614101 @default.
- W115961410 hasRelatedWork W1536323281 @default.
- W115961410 hasRelatedWork W194754089 @default.
- W115961410 hasRelatedWork W2080379318 @default.
- W115961410 hasRelatedWork W2104727569 @default.
- W115961410 hasRelatedWork W2121863487 @default.
- W115961410 hasRelatedWork W2230201233 @default.
- W115961410 hasRelatedWork W2345574366 @default.
- W115961410 hasRelatedWork W2621111958 @default.
- W115961410 hasRelatedWork W2897200624 @default.
- W115961410 hasRelatedWork W2963009616 @default.
- W115961410 hasRelatedWork W2967009433 @default.
- W115961410 hasRelatedWork W3029492306 @default.
- W115961410 hasRelatedWork W3089491416 @default.
- W115961410 hasRelatedWork W3092203123 @default.
- W115961410 hasRelatedWork W3100081294 @default.
- W115961410 hasRelatedWork W3156698756 @default.
- W115961410 hasRelatedWork W3173218700 @default.
- W115961410 hasRelatedWork W3196776285 @default.
- W115961410 hasRelatedWork W3205407166 @default.
- W115961410 hasRelatedWork W3153806512 @default.
- W115961410 isParatext "false" @default.
- W115961410 isRetracted "false" @default.
- W115961410 magId "115961410" @default.
- W115961410 workType "article" @default.