Matches in SemOpenAlex for { <https://semopenalex.org/work/W2890882194> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W2890882194 abstract "Many reinforcement-learning researchers treat the reward function as a part of the environment, meaning that the agent can only know the reward of a state if it encounters that state in a trial run. However, we argue that this is an unnecessary limitation and instead, the reward function should be provided to the learning algorithm. The advantage is that the algorithm can then use the reward function to check the reward for states that the agent hasn't even encountered yet. In addition, the algorithm can simultaneously learn policies for multiple reward functions. For each state, the algorithm would calculate the reward using each of the reward functions and add the rewards to its experience replay dataset. The Hindsight Experience Replay algorithm developed by Andrychowicz et al. (2017) does just this, and learns to generalize across a distribution of sparse, goal-based rewards. We extend this algorithm to linearly-weighted, multi-objective rewards and learn a single policy that can generalize across all linear combinations of the multi-objective reward. Whereas other multi-objective algorithms teach the Q-function to generalize across the reward weights, our algorithm enables the policy to generalize, and can thus be used with continuous actions." @default.
- W2890882194 created "2018-09-27" @default.
- W2890882194 creator A5058065285 @default.
- W2890882194 creator A5077747744 @default.
- W2890882194 date "2018-09-17" @default.
- W2890882194 modified "2023-09-27" @default.
- W2890882194 title "Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning" @default.
- W2890882194 cites W1998649829 @default.
- W2890882194 cites W2120346334 @default.
- W2890882194 cites W2141481921 @default.
- W2890882194 cites W2155007355 @default.
- W2890882194 cites W2342840547 @default.
- W2890882194 cites W2733961795 @default.
- W2890882194 cites W3103262232 @default.
- W2890882194 hasPublicationYear "2018" @default.
- W2890882194 type Work @default.
- W2890882194 sameAs 2890882194 @default.
- W2890882194 citedByCount "4" @default.
- W2890882194 countsByYear W28908821942019 @default.
- W2890882194 countsByYear W28908821942020 @default.
- W2890882194 crossrefType "posted-content" @default.
- W2890882194 hasAuthorship W2890882194A5058065285 @default.
- W2890882194 hasAuthorship W2890882194A5077747744 @default.
- W2890882194 hasConcept C10347200 @default.
- W2890882194 hasConcept C11413529 @default.
- W2890882194 hasConcept C119857082 @default.
- W2890882194 hasConcept C14036430 @default.
- W2890882194 hasConcept C154945302 @default.
- W2890882194 hasConcept C15744967 @default.
- W2890882194 hasConcept C180747234 @default.
- W2890882194 hasConcept C2780876879 @default.
- W2890882194 hasConcept C41008148 @default.
- W2890882194 hasConcept C48103436 @default.
- W2890882194 hasConcept C542102704 @default.
- W2890882194 hasConcept C78458016 @default.
- W2890882194 hasConcept C86803240 @default.
- W2890882194 hasConcept C97541855 @default.
- W2890882194 hasConceptScore W2890882194C10347200 @default.
- W2890882194 hasConceptScore W2890882194C11413529 @default.
- W2890882194 hasConceptScore W2890882194C119857082 @default.
- W2890882194 hasConceptScore W2890882194C14036430 @default.
- W2890882194 hasConceptScore W2890882194C154945302 @default.
- W2890882194 hasConceptScore W2890882194C15744967 @default.
- W2890882194 hasConceptScore W2890882194C180747234 @default.
- W2890882194 hasConceptScore W2890882194C2780876879 @default.
- W2890882194 hasConceptScore W2890882194C41008148 @default.
- W2890882194 hasConceptScore W2890882194C48103436 @default.
- W2890882194 hasConceptScore W2890882194C542102704 @default.
- W2890882194 hasConceptScore W2890882194C78458016 @default.
- W2890882194 hasConceptScore W2890882194C86803240 @default.
- W2890882194 hasConceptScore W2890882194C97541855 @default.
- W2890882194 hasLocation W28908821941 @default.
- W2890882194 hasOpenAccess W2890882194 @default.
- W2890882194 hasPrimaryLocation W28908821941 @default.
- W2890882194 hasRelatedWork W1999874108 @default.
- W2890882194 hasRelatedWork W2364302853 @default.
- W2890882194 hasRelatedWork W2417936402 @default.
- W2890882194 hasRelatedWork W2605369401 @default.
- W2890882194 hasRelatedWork W2897200624 @default.
- W2890882194 hasRelatedWork W2912432356 @default.
- W2890882194 hasRelatedWork W2915060045 @default.
- W2890882194 hasRelatedWork W2986020463 @default.
- W2890882194 hasRelatedWork W3035599863 @default.
- W2890882194 hasRelatedWork W3037755290 @default.
- W2890882194 hasRelatedWork W3092156990 @default.
- W2890882194 hasRelatedWork W3098951764 @default.
- W2890882194 hasRelatedWork W3121174195 @default.
- W2890882194 hasRelatedWork W3138341590 @default.
- W2890882194 hasRelatedWork W3163049241 @default.
- W2890882194 hasRelatedWork W3163453895 @default.
- W2890882194 hasRelatedWork W3173218700 @default.
- W2890882194 hasRelatedWork W3198608914 @default.
- W2890882194 hasRelatedWork W3212897811 @default.
- W2890882194 hasRelatedWork W88199814 @default.
- W2890882194 isParatext "false" @default.
- W2890882194 isRetracted "false" @default.
- W2890882194 magId "2890882194" @default.
- W2890882194 workType "article" @default.
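
The abstract above describes computing the reward for each stored state under several reward functions and adding the results to the experience replay dataset, extended to linearly weighted multi-objective rewards. A minimal sketch of that relabeling step follows. It is not the authors' implementation: the names `scalarize`, `sample_weights`, `relabel`, and `toy_reward_fn`, the two-objective toy reward, and the convention that weights are non-negative and sum to 1 are all assumptions made for illustration.

```python
import random


def scalarize(reward_vector, weights):
    """Linearly weighted scalar reward: sum_i w_i * r_i."""
    return sum(w * r for w, r in zip(weights, reward_vector))


def sample_weights(n_objectives, rng):
    """Draw a weight vector that is non-negative and sums to 1.

    The normalization is an assumed convention, not taken from the abstract.
    """
    raw = [rng.random() + 1e-9 for _ in range(n_objectives)]
    total = sum(raw)
    return [x / total for x in raw]


def relabel(transition, reward_fn, n_objectives, n_relabels, rng):
    """Return copies of one transition, each scored under a different weight vector.

    Because the reward function is available to the learner, the vector
    reward is computed once and re-scalarized for every sampled weight.
    """
    state, action, next_state = transition
    reward_vector = reward_fn(state, action, next_state)
    relabeled = []
    for _ in range(n_relabels):
        weights = sample_weights(n_objectives, rng)
        reward = scalarize(reward_vector, weights)
        # The weights are stored with the transition so that a
        # weight-conditioned policy could be trained on it (assumption).
        relabeled.append((state, action, weights, reward, next_state))
    return relabeled


def toy_reward_fn(state, action, next_state):
    """Hypothetical two-objective reward: progress made and action cost."""
    return [next_state - state, -abs(action)]


rng = random.Random(0)
replay_buffer = relabel((0.0, 0.5, 1.0), toy_reward_fn,
                        n_objectives=2, n_relabels=4, rng=rng)
```

Each entry in `replay_buffer` pairs the same observed transition with a different weight vector and the matching scalar reward, which is how one trial run can supply training data for many linear combinations of the objectives.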