Matches in SemOpenAlex for { <https://semopenalex.org/work/W4378474105> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4378474105 abstract "In this work we examine the ability of language models to generate explicit world models of scientific and common-sense reasoning tasks by framing this as a problem of generating text-based games. To support this, we introduce ByteSized32, a corpus of 32 highly-templated text games written in Python totaling 24k lines of code, each centered around a particular task, and paired with a set of 16 unseen text game specifications for evaluation. We propose a suite of automatic and manual metrics for assessing simulation validity, compliance with task specifications, playability, winnability, and alignment with the physical world. In a single-shot evaluation of GPT-4 on this simulation-as-code-generation task, we find it capable of producing runnable games in 27% of cases, highlighting the difficulty of this challenge task. We discuss areas of future improvement, including GPT-4's apparent capacity to perform well at simulating near canonical task solutions, with performance dropping off as simulations include distractors or deviate from canonical solutions in the action space." @default.
- W4378474105 created "2023-05-27" @default.
- W4378474105 creator A5032540574 @default.
- W4378474105 creator A5044769257 @default.
- W4378474105 creator A5058160578 @default.
- W4378474105 creator A5072779594 @default.
- W4378474105 creator A5083645580 @default.
- W4378474105 creator A5091495774 @default.
- W4378474105 date "2023-05-24" @default.
- W4378474105 modified "2023-10-16" @default.
- W4378474105 title "ByteSized32: A Corpus and Challenge Task for Generating Task-Specific World Models Expressed as Text Games" @default.
- W4378474105 doi "https://doi.org/10.48550/arxiv.2305.14879" @default.
- W4378474105 hasPublicationYear "2023" @default.
- W4378474105 type Work @default.
- W4378474105 citedByCount "0" @default.
- W4378474105 crossrefType "posted-content" @default.
- W4378474105 hasAuthorship W4378474105A5032540574 @default.
- W4378474105 hasAuthorship W4378474105A5044769257 @default.
- W4378474105 hasAuthorship W4378474105A5058160578 @default.
- W4378474105 hasAuthorship W4378474105A5072779594 @default.
- W4378474105 hasAuthorship W4378474105A5083645580 @default.
- W4378474105 hasAuthorship W4378474105A5091495774 @default.
- W4378474105 hasBestOaLocation W43784741051 @default.
- W4378474105 hasConcept C107457646 @default.
- W4378474105 hasConcept C154945302 @default.
- W4378474105 hasConcept C162324750 @default.
- W4378474105 hasConcept C166957645 @default.
- W4378474105 hasConcept C187736073 @default.
- W4378474105 hasConcept C199360897 @default.
- W4378474105 hasConcept C199519371 @default.
- W4378474105 hasConcept C204321447 @default.
- W4378474105 hasConcept C2777904410 @default.
- W4378474105 hasConcept C2780451532 @default.
- W4378474105 hasConcept C41008148 @default.
- W4378474105 hasConcept C519991488 @default.
- W4378474105 hasConcept C79581498 @default.
- W4378474105 hasConcept C95457728 @default.
- W4378474105 hasConceptScore W4378474105C107457646 @default.
- W4378474105 hasConceptScore W4378474105C154945302 @default.
- W4378474105 hasConceptScore W4378474105C162324750 @default.
- W4378474105 hasConceptScore W4378474105C166957645 @default.
- W4378474105 hasConceptScore W4378474105C187736073 @default.
- W4378474105 hasConceptScore W4378474105C199360897 @default.
- W4378474105 hasConceptScore W4378474105C199519371 @default.
- W4378474105 hasConceptScore W4378474105C204321447 @default.
- W4378474105 hasConceptScore W4378474105C2777904410 @default.
- W4378474105 hasConceptScore W4378474105C2780451532 @default.
- W4378474105 hasConceptScore W4378474105C41008148 @default.
- W4378474105 hasConceptScore W4378474105C519991488 @default.
- W4378474105 hasConceptScore W4378474105C79581498 @default.
- W4378474105 hasConceptScore W4378474105C95457728 @default.
- W4378474105 hasLocation W43784741051 @default.
- W4378474105 hasOpenAccess W4378474105 @default.
- W4378474105 hasPrimaryLocation W43784741051 @default.
- W4378474105 hasRelatedWork W2081647779 @default.
- W4378474105 hasRelatedWork W2143017621 @default.
- W4378474105 hasRelatedWork W2327204559 @default.
- W4378474105 hasRelatedWork W2529681551 @default.
- W4378474105 hasRelatedWork W3017187763 @default.
- W4378474105 hasRelatedWork W3185852197 @default.
- W4378474105 hasRelatedWork W3198474835 @default.
- W4378474105 hasRelatedWork W4232504361 @default.
- W4378474105 hasRelatedWork W4245752324 @default.
- W4378474105 hasRelatedWork W643517603 @default.
- W4378474105 isParatext "false" @default.
- W4378474105 isRetracted "false" @default.
- W4378474105 workType "article" @default.
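
Each entry in the listing above has the same shape: a leading dash, a subject, a predicate, an object (either a quoted literal or a bare identifier), and a graph label after `@`. A minimal Python sketch for reading such lines into structured records might look like the following. The names `Quad`, `LINE_RE`, and `parse_line` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library, and the pattern assumes the exact plain-text layout shown here rather than a standard RDF serialization.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One row of the listing (hypothetical container, not a SemOpenAlex type)."""
    subject: str
    predicate: str
    obj: str
    is_literal: bool  # True when the object was wrapped in double quotes
    graph: str


# Assumed layout: "- <subject> <predicate> <object> @<graph>."
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_line(line: str) -> Optional[Quad]:
    """Parse one listing line; return None for lines that do not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, raw, graph = match.groups()
    # Quoted objects are treated as literals; everything else as an identifier.
    is_literal = len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')
    obj = raw[1:-1] if is_literal else raw
    return Quad(subject, predicate, obj, is_literal, graph)


if __name__ == "__main__":
    sample = [
        '- W4378474105 created "2023-05-27" @default.',
        '- W4378474105 creator A5032540574 @default.',
        'Showing items 1 to 67 of 67 with 100 items per page.',
    ]
    for row in sample:
        print(parse_line(row))
```

Lines that do not follow the assumed layout, such as the pagination header, yield `None`, so they can be filtered out before grouping the records by predicate.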