Matches in SemOpenAlex for { <https://semopenalex.org/work/W2995911207> ?p ?o ?g. }
- W2995911207 abstract "A promising approach for teaching artificial agents to use natural language involves using human-in-the-loop training. However, recent work suggests that current machine learning methods are too data-inefficient to be trained in this way from scratch. In this paper, we investigate the relationship between two categories of learning signals with the ultimate goal of improving sample efficiency: imitating human language data via supervised learning, and maximizing reward in a simulated multi-agent environment via self-play (as done in emergent communication), and introduce the term supervised self-play (S2P) for algorithms using both of these signals. We find that first training agents via supervised learning on human data followed by self-play outperforms the converse, suggesting that it is not beneficial to emerge languages from scratch. We then empirically investigate various S2P schedules that begin with supervised learning in two environments: a Lewis signaling game with symbolic inputs, and an image-based referential game with natural language descriptions. Lastly, we introduce population-based approaches to S2P, which further improve the performance over single-agent methods." @default.
- W2995911207 created "2019-12-26" @default.
- W2995911207 creator A5004295653 @default.
- W2995911207 creator A5016956470 @default.
- W2995911207 creator A5053034244 @default.
- W2995911207 creator A5059094093 @default.
- W2995911207 creator A5080591144 @default.
- W2995911207 date "2020-04-30" @default.
- W2995911207 modified "2023-10-04" @default.
- W2995911207 title "On the interaction between supervision and self-play in emergent communication" @default.
- W2995911207 cites W1542941925 @default.
- W2995911207 cites W1962416794 @default.
- W2995911207 cites W2007752263 @default.
- W2995911207 cites W2046104360 @default.
- W2995911207 cites W2142947219 @default.
- W2995911207 cites W2157331557 @default.
- W2995911207 cites W2547875792 @default.
- W2995911207 cites W2548228487 @default.
- W2995911207 cites W2564324149 @default.
- W2995911207 cites W2835434549 @default.
- W2995911207 cites W2890476408 @default.
- W2995911207 cites W2897513296 @default.
- W2995911207 cites W2914351253 @default.
- W2995911207 cites W2936752925 @default.
- W2995911207 cites W2959402823 @default.
- W2995911207 cites W2962766710 @default.
- W2995911207 cites W2962852262 @default.
- W2995911207 cites W2963000099 @default.
- W2995911207 cites W2963147362 @default.
- W2995911207 cites W2963155490 @default.
- W2995911207 cites W2963341956 @default.
- W2995911207 cites W2963407617 @default.
- W2995911207 cites W2963455109 @default.
- W2995911207 cites W2963681240 @default.
- W2995911207 cites W2963881016 @default.
- W2995911207 cites W2964289358 @default.
- W2995911207 cites W2964338167 @default.
- W2995911207 cites W2967852106 @default.
- W2995911207 cites W2970251580 @default.
- W2995911207 cites W2970603867 @default.
- W2995911207 cites W2989265781 @default.
- W2995911207 hasPublicationYear "2020" @default.
- W2995911207 type Work @default.
- W2995911207 sameAs 2995911207 @default.
- W2995911207 citedByCount "26" @default.
- W2995911207 countsByYear W29959112072019 @default.
- W2995911207 countsByYear W29959112072020 @default.
- W2995911207 countsByYear W29959112072021 @default.
- W2995911207 crossrefType "proceedings-article" @default.
- W2995911207 hasAuthorship W2995911207A5004295653 @default.
- W2995911207 hasAuthorship W2995911207A5016956470 @default.
- W2995911207 hasAuthorship W2995911207A5053034244 @default.
- W2995911207 hasAuthorship W2995911207A5059094093 @default.
- W2995911207 hasAuthorship W2995911207A5080591144 @default.
- W2995911207 hasConcept C111919701 @default.
- W2995911207 hasConcept C119857082 @default.
- W2995911207 hasConcept C136389625 @default.
- W2995911207 hasConcept C144024400 @default.
- W2995911207 hasConcept C149923435 @default.
- W2995911207 hasConcept C154945302 @default.
- W2995911207 hasConcept C195324797 @default.
- W2995911207 hasConcept C2524010 @default.
- W2995911207 hasConcept C2776809875 @default.
- W2995911207 hasConcept C2781235140 @default.
- W2995911207 hasConcept C2908647359 @default.
- W2995911207 hasConcept C33923547 @default.
- W2995911207 hasConcept C41008148 @default.
- W2995911207 hasConcept C50644808 @default.
- W2995911207 hasConceptScore W2995911207C111919701 @default.
- W2995911207 hasConceptScore W2995911207C119857082 @default.
- W2995911207 hasConceptScore W2995911207C136389625 @default.
- W2995911207 hasConceptScore W2995911207C144024400 @default.
- W2995911207 hasConceptScore W2995911207C149923435 @default.
- W2995911207 hasConceptScore W2995911207C154945302 @default.
- W2995911207 hasConceptScore W2995911207C195324797 @default.
- W2995911207 hasConceptScore W2995911207C2524010 @default.
- W2995911207 hasConceptScore W2995911207C2776809875 @default.
- W2995911207 hasConceptScore W2995911207C2781235140 @default.
- W2995911207 hasConceptScore W2995911207C2908647359 @default.
- W2995911207 hasConceptScore W2995911207C33923547 @default.
- W2995911207 hasConceptScore W2995911207C41008148 @default.
- W2995911207 hasConceptScore W2995911207C50644808 @default.
- W2995911207 hasLocation W29959112071 @default.
- W2995911207 hasOpenAccess W2995911207 @default.
- W2995911207 hasPrimaryLocation W29959112071 @default.
- W2995911207 hasRelatedWork W2064675550 @default.
- W2995911207 hasRelatedWork W2119717200 @default.
- W2995911207 hasRelatedWork W2142947219 @default.
- W2995911207 hasRelatedWork W2547875792 @default.
- W2995911207 hasRelatedWork W2564324149 @default.
- W2995911207 hasRelatedWork W2888912391 @default.
- W2995911207 hasRelatedWork W2914351253 @default.
- W2995911207 hasRelatedWork W2962766710 @default.
- W2995911207 hasRelatedWork W2963000099 @default.
- W2995911207 hasRelatedWork W2963147362 @default.
- W2995911207 hasRelatedWork W2963155490 @default.
- W2995911207 hasRelatedWork W2963455109 @default.
- W2995911207 hasRelatedWork W2963681240 @default.
- W2995911207 hasRelatedWork W2963881016 @default.
- W2995911207 hasRelatedWork W2964289358 @default.