Matches in SemOpenAlex for { <https://semopenalex.org/work/W3099437791> ?p ?o ?g. }
Showing items 1 to 73 of 73 with 100 items per page.
- W3099437791 endingPage "18062" @default.
- W3099437791 startingPage "18050" @default.
- W3099437791 abstract "Exploration is a key problem in reinforcement learning, since agents can only learn from data they acquire in the environment. With that in mind, maintaining a population of agents is an attractive method, as it allows data to be collected with a diverse set of behaviors. This behavioral diversity is often boosted via multi-objective loss functions. However, those approaches typically leverage mean field updates based on pairwise distances, which makes them susceptible to cycling behaviors and increased redundancy. In addition, explicitly boosting diversity often has a detrimental impact on optimizing already fruitful behaviors for rewards. As such, the reward-diversity trade-off typically relies on heuristics. Finally, such methods require behavioral representations, often handcrafted and domain-specific. In this paper, we introduce an approach to optimize all members of a population simultaneously. Rather than using pairwise distance, we measure the volume of the entire population in a behavioral manifold, defined by task-agnostic behavioral embeddings. In addition, our algorithm, Diversity via Determinants (DvD), adapts the degree of diversity during training using online learning techniques. We introduce both evolutionary and gradient-based instantiations of DvD and show they effectively improve exploration without reducing performance when better exploration is not required." @default.
- W3099437791 created "2020-11-23" @default.
- W3099437791 creator A5006761546 @default.
- W3099437791 creator A5031842812 @default.
- W3099437791 creator A5058617210 @default.
- W3099437791 creator A5083828420 @default.
- W3099437791 date "2020-06-12" @default.
- W3099437791 modified "2023-09-24" @default.
- W3099437791 title "Effective Diversity in Population Based Reinforcement Learning" @default.
- W3099437791 hasPublicationYear "2020" @default.
- W3099437791 type Work @default.
- W3099437791 sameAs 3099437791 @default.
- W3099437791 citedByCount "7" @default.
- W3099437791 countsByYear W30994377912021 @default.
- W3099437791 crossrefType "proceedings-article" @default.
- W3099437791 hasAuthorship W3099437791A5006761546 @default.
- W3099437791 hasAuthorship W3099437791A5031842812 @default.
- W3099437791 hasAuthorship W3099437791A5058617210 @default.
- W3099437791 hasAuthorship W3099437791A5083828420 @default.
- W3099437791 hasConcept C111919701 @default.
- W3099437791 hasConcept C119857082 @default.
- W3099437791 hasConcept C127705205 @default.
- W3099437791 hasConcept C144024400 @default.
- W3099437791 hasConcept C149923435 @default.
- W3099437791 hasConcept C153083717 @default.
- W3099437791 hasConcept C154945302 @default.
- W3099437791 hasConcept C184898388 @default.
- W3099437791 hasConcept C2908647359 @default.
- W3099437791 hasConcept C41008148 @default.
- W3099437791 hasConcept C46686674 @default.
- W3099437791 hasConcept C97541855 @default.
- W3099437791 hasConceptScore W3099437791C111919701 @default.
- W3099437791 hasConceptScore W3099437791C119857082 @default.
- W3099437791 hasConceptScore W3099437791C127705205 @default.
- W3099437791 hasConceptScore W3099437791C144024400 @default.
- W3099437791 hasConceptScore W3099437791C149923435 @default.
- W3099437791 hasConceptScore W3099437791C153083717 @default.
- W3099437791 hasConceptScore W3099437791C154945302 @default.
- W3099437791 hasConceptScore W3099437791C184898388 @default.
- W3099437791 hasConceptScore W3099437791C2908647359 @default.
- W3099437791 hasConceptScore W3099437791C41008148 @default.
- W3099437791 hasConceptScore W3099437791C46686674 @default.
- W3099437791 hasConceptScore W3099437791C97541855 @default.
- W3099437791 hasLocation W30994377911 @default.
- W3099437791 hasOpenAccess W3099437791 @default.
- W3099437791 hasPrimaryLocation W30994377911 @default.
- W3099437791 hasRelatedWork W2294805292 @default.
- W3099437791 hasRelatedWork W2899041500 @default.
- W3099437791 hasRelatedWork W2924740141 @default.
- W3099437791 hasRelatedWork W2962719460 @default.
- W3099437791 hasRelatedWork W2963199420 @default.
- W3099437791 hasRelatedWork W2978242174 @default.
- W3099437791 hasRelatedWork W2995102855 @default.
- W3099437791 hasRelatedWork W2998241503 @default.
- W3099437791 hasRelatedWork W3004082694 @default.
- W3099437791 hasRelatedWork W3022124161 @default.
- W3099437791 hasRelatedWork W3035216917 @default.
- W3099437791 hasRelatedWork W3040907965 @default.
- W3099437791 hasRelatedWork W3091395917 @default.
- W3099437791 hasRelatedWork W3092185126 @default.
- W3099437791 hasRelatedWork W3098960920 @default.
- W3099437791 hasRelatedWork W3111053239 @default.
- W3099437791 hasRelatedWork W3130654908 @default.
- W3099437791 hasRelatedWork W3168815054 @default.
- W3099437791 hasRelatedWork W3183316534 @default.
- W3099437791 hasRelatedWork W3210095716 @default.
- W3099437791 hasVolume "33" @default.
- W3099437791 isParatext "false" @default.
- W3099437791 isRetracted "false" @default.
- W3099437791 magId "3099437791" @default.
- W3099437791 workType "article" @default.
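
The abstract above describes measuring the volume of a whole population in a behavioral manifold rather than summing pairwise distances. The following is a minimal sketch of that idea, not the authors' implementation: it assumes each agent is summarized by a fixed-length embedding vector, and it assumes a squared-exponential kernel to compare embeddings (the record does not name a kernel). The function names `rbf_kernel_matrix`, `determinant`, and `population_diversity`, and the `length_scale` parameter, are hypothetical.

```python
import math


def rbf_kernel_matrix(embeddings, length_scale=1.0):
    """Pairwise similarity matrix under an assumed squared-exponential kernel."""
    n = len(embeddings)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j]))
            kernel[i][j] = math.exp(-sq_dist / (2.0 * length_scale ** 2))
    return kernel


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # treated as singular: some behaviors are redundant
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def population_diversity(embeddings, length_scale=1.0):
    """Volume-style diversity score: determinant of the kernel matrix."""
    return determinant(rbf_kernel_matrix(embeddings, length_scale))


# Hypothetical two-dimensional behavioral embeddings for three agents.
redundant = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
clustered = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]]
spread = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]

print(population_diversity(redundant))
print(population_diversity(clustered))
print(population_diversity(spread))
```

Because the determinant depends on all embeddings jointly, a single duplicated behavior collapses the score toward zero, which a sum of pairwise distances would not detect as sharply.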