Matches in SemOpenAlex for { <https://semopenalex.org/work/W4319453286> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W4319453286 abstract "Recent works successfully leveraged Large Language Models' (LLM) abilities to capture abstract knowledge about world's physics to solve decision-making problems. Yet, the alignment between LLMs' knowledge and the environment can be wrong and limit functional competence due to lack of grounding. In this paper, we study an approach (named GLAM) to achieve this alignment through functional grounding: we consider an agent using an LLM as a policy that is progressively updated as the agent interacts with the environment, leveraging online Reinforcement Learning to improve its performance to solve goals. Using an interactive textual environment designed to study higher-level forms of functional grounding, and a set of spatial and navigation tasks, we study several scientific questions: 1) Can LLMs boost sample efficiency for online learning of various RL tasks? 2) How can it boost different forms of generalization? 3) What is the impact of online learning? We study these questions by functionally grounding several variants (size, architecture) of FLAN-T5." @default.
- W4319453286 created "2023-02-09" @default.
- W4319453286 creator A5031635996 @default.
- W4319453286 creator A5039026472 @default.
- W4319453286 creator A5042850624 @default.
- W4319453286 creator A5072955860 @default.
- W4319453286 creator A5078865608 @default.
- W4319453286 creator A5085903884 @default.
- W4319453286 date "2023-02-06" @default.
- W4319453286 modified "2023-09-27" @default.
- W4319453286 title "Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning" @default.
- W4319453286 doi "https://doi.org/10.48550/arxiv.2302.02662" @default.
- W4319453286 hasPublicationYear "2023" @default.
- W4319453286 type Work @default.
- W4319453286 citedByCount "0" @default.
- W4319453286 crossrefType "posted-content" @default.
- W4319453286 hasAuthorship W4319453286A5031635996 @default.
- W4319453286 hasAuthorship W4319453286A5039026472 @default.
- W4319453286 hasAuthorship W4319453286A5042850624 @default.
- W4319453286 hasAuthorship W4319453286A5072955860 @default.
- W4319453286 hasAuthorship W4319453286A5078865608 @default.
- W4319453286 hasAuthorship W4319453286A5085903884 @default.
- W4319453286 hasBestOaLocation W43194532861 @default.
- W4319453286 hasConcept C100521375 @default.
- W4319453286 hasConcept C107457646 @default.
- W4319453286 hasConcept C119599485 @default.
- W4319453286 hasConcept C127413603 @default.
- W4319453286 hasConcept C134306372 @default.
- W4319453286 hasConcept C154945302 @default.
- W4319453286 hasConcept C15744967 @default.
- W4319453286 hasConcept C168993435 @default.
- W4319453286 hasConcept C177148314 @default.
- W4319453286 hasConcept C177264268 @default.
- W4319453286 hasConcept C199360897 @default.
- W4319453286 hasConcept C2986087404 @default.
- W4319453286 hasConcept C33923547 @default.
- W4319453286 hasConcept C41008148 @default.
- W4319453286 hasConcept C49774154 @default.
- W4319453286 hasConcept C77805123 @default.
- W4319453286 hasConcept C97541855 @default.
- W4319453286 hasConceptScore W4319453286C100521375 @default.
- W4319453286 hasConceptScore W4319453286C107457646 @default.
- W4319453286 hasConceptScore W4319453286C119599485 @default.
- W4319453286 hasConceptScore W4319453286C127413603 @default.
- W4319453286 hasConceptScore W4319453286C134306372 @default.
- W4319453286 hasConceptScore W4319453286C154945302 @default.
- W4319453286 hasConceptScore W4319453286C15744967 @default.
- W4319453286 hasConceptScore W4319453286C168993435 @default.
- W4319453286 hasConceptScore W4319453286C177148314 @default.
- W4319453286 hasConceptScore W4319453286C177264268 @default.
- W4319453286 hasConceptScore W4319453286C199360897 @default.
- W4319453286 hasConceptScore W4319453286C2986087404 @default.
- W4319453286 hasConceptScore W4319453286C33923547 @default.
- W4319453286 hasConceptScore W4319453286C41008148 @default.
- W4319453286 hasConceptScore W4319453286C49774154 @default.
- W4319453286 hasConceptScore W4319453286C77805123 @default.
- W4319453286 hasConceptScore W4319453286C97541855 @default.
- W4319453286 hasLocation W43194532861 @default.
- W4319453286 hasLocation W43194532862 @default.
- W4319453286 hasOpenAccess W4319453286 @default.
- W4319453286 hasPrimaryLocation W43194532861 @default.
- W4319453286 hasRelatedWork W183234821 @default.
- W4319453286 hasRelatedWork W260766989 @default.
- W4319453286 hasRelatedWork W2959276766 @default.
- W4319453286 hasRelatedWork W3074294383 @default.
- W4319453286 hasRelatedWork W3111983280 @default.
- W4319453286 hasRelatedWork W3139193008 @default.
- W4319453286 hasRelatedWork W3164468573 @default.
- W4319453286 hasRelatedWork W4206669594 @default.
- W4319453286 hasRelatedWork W4295941380 @default.
- W4319453286 hasRelatedWork W4377293004 @default.
- W4319453286 isParatext "false" @default.
- W4319453286 isRetracted "false" @default.
- W4319453286 workType "article" @default.