Matches in SemOpenAlex for { <https://semopenalex.org/work/W4285603402> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W4285603402 abstract "In this work we introduce reinforcement learning techniques for solving lexicographic multi-objective problems. These are problems that involve multiple reward signals, and where the goal is to learn a policy that maximises the first reward signal, and subject to this constraint also maximises the second reward signal, and so on. We present a family of both action-value and policy gradient algorithms that can be used to solve such problems, and prove that they converge to policies that are lexicographically optimal. We evaluate the scalability and performance of these algorithms empirically, and demonstrate their applicability in practical settings. As a more specific application, we show how our algorithms can be used to impose safety constraints on the behaviour of an agent, and compare their performance in this context with that of other constrained reinforcement learning algorithms." @default.
- W4285603402 created "2022-07-16" @default.
- W4285603402 creator A5000252148 @default.
- W4285603402 creator A5004337233 @default.
- W4285603402 creator A5013946815 @default.
- W4285603402 creator A5018073672 @default.
- W4285603402 creator A5021671895 @default.
- W4285603402 creator A5026486057 @default.
- W4285603402 creator A5027828708 @default.
- W4285603402 creator A5036920814 @default.
- W4285603402 creator A5038859429 @default.
- W4285603402 creator A5039360390 @default.
- W4285603402 date "2022-07-01" @default.
- W4285603402 modified "2023-10-11" @default.
- W4285603402 title "Lexicographic Multi-Objective Reinforcement Learning" @default.
- W4285603402 doi "https://doi.org/10.24963/ijcai.2022/476" @default.
- W4285603402 hasPublicationYear "2022" @default.
- W4285603402 type Work @default.
- W4285603402 citedByCount "1" @default.
- W4285603402 countsByYear W42856034022022 @default.
- W4285603402 crossrefType "proceedings-article" @default.
- W4285603402 hasAuthorship W4285603402A5000252148 @default.
- W4285603402 hasAuthorship W4285603402A5004337233 @default.
- W4285603402 hasAuthorship W4285603402A5013946815 @default.
- W4285603402 hasAuthorship W4285603402A5018073672 @default.
- W4285603402 hasAuthorship W4285603402A5021671895 @default.
- W4285603402 hasAuthorship W4285603402A5026486057 @default.
- W4285603402 hasAuthorship W4285603402A5027828708 @default.
- W4285603402 hasAuthorship W4285603402A5036920814 @default.
- W4285603402 hasAuthorship W4285603402A5038859429 @default.
- W4285603402 hasAuthorship W4285603402A5039360390 @default.
- W4285603402 hasBestOaLocation W42856034021 @default.
- W4285603402 hasConcept C114614502 @default.
- W4285603402 hasConcept C119857082 @default.
- W4285603402 hasConcept C126255220 @default.
- W4285603402 hasConcept C151730666 @default.
- W4285603402 hasConcept C154945302 @default.
- W4285603402 hasConcept C159254197 @default.
- W4285603402 hasConcept C2524010 @default.
- W4285603402 hasConcept C2776036281 @default.
- W4285603402 hasConcept C2779343474 @default.
- W4285603402 hasConcept C33923547 @default.
- W4285603402 hasConcept C41008148 @default.
- W4285603402 hasConcept C48044578 @default.
- W4285603402 hasConcept C77088390 @default.
- W4285603402 hasConcept C86803240 @default.
- W4285603402 hasConcept C97541855 @default.
- W4285603402 hasConceptScore W4285603402C114614502 @default.
- W4285603402 hasConceptScore W4285603402C119857082 @default.
- W4285603402 hasConceptScore W4285603402C126255220 @default.
- W4285603402 hasConceptScore W4285603402C151730666 @default.
- W4285603402 hasConceptScore W4285603402C154945302 @default.
- W4285603402 hasConceptScore W4285603402C159254197 @default.
- W4285603402 hasConceptScore W4285603402C2524010 @default.
- W4285603402 hasConceptScore W4285603402C2776036281 @default.
- W4285603402 hasConceptScore W4285603402C2779343474 @default.
- W4285603402 hasConceptScore W4285603402C33923547 @default.
- W4285603402 hasConceptScore W4285603402C41008148 @default.
- W4285603402 hasConceptScore W4285603402C48044578 @default.
- W4285603402 hasConceptScore W4285603402C77088390 @default.
- W4285603402 hasConceptScore W4285603402C86803240 @default.
- W4285603402 hasConceptScore W4285603402C97541855 @default.
- W4285603402 hasLocation W42856034021 @default.
- W4285603402 hasLocation W42856034022 @default.
- W4285603402 hasOpenAccess W4285603402 @default.
- W4285603402 hasPrimaryLocation W42856034021 @default.
- W4285603402 hasRelatedWork W1562959674 @default.
- W4285603402 hasRelatedWork W2302028273 @default.
- W4285603402 hasRelatedWork W2364921833 @default.
- W4285603402 hasRelatedWork W2388030554 @default.
- W4285603402 hasRelatedWork W2461970972 @default.
- W4285603402 hasRelatedWork W2923653485 @default.
- W4285603402 hasRelatedWork W2957776456 @default.
- W4285603402 hasRelatedWork W3022038857 @default.
- W4285603402 hasRelatedWork W4294811414 @default.
- W4285603402 hasRelatedWork W4319083788 @default.
- W4285603402 isParatext "false" @default.
- W4285603402 isRetracted "false" @default.
- W4285603402 workType "article" @default.