Matches in SemOpenAlex for { <https://semopenalex.org/work/W4384268367> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4384268367 abstract "Incremental decision making in real-world environments is one of the most challenging tasks in embodied artificial intelligence. One particularly demanding scenario is Vision and Language Navigation (VLN) which requires visual and natural language understanding as well as spatial and temporal reasoning capabilities. The embodied agent needs to ground its understanding of navigation instructions in observations of a real-world environment like Street View. Despite the impressive results of LLMs in other research areas, it is an ongoing problem of how to best connect them with an interactive visual environment. In this work, we propose VELMA, an embodied LLM agent that uses a verbalization of the trajectory and of visual environment observations as contextual prompt for the next action. Visual information is verbalized by a pipeline that extracts landmarks from the human written navigation instructions and uses CLIP to determine their visibility in the current panorama view. We show that VELMA is able to successfully follow navigation instructions in Street View with only two in-context examples. We further finetune the LLM agent on a few thousand examples and achieve 25%-30% relative improvement in task completion over the previous state-of-the-art for two datasets." @default.
- W4384268367 created "2023-07-14" @default.
- W4384268367 creator A5011749347 @default.
- W4384268367 creator A5023400154 @default.
- W4384268367 creator A5025148955 @default.
- W4384268367 creator A5036747250 @default.
- W4384268367 creator A5050195037 @default.
- W4384268367 creator A5066593161 @default.
- W4384268367 date "2023-07-12" @default.
- W4384268367 modified "2023-10-17" @default.
- W4384268367 title "VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View" @default.
- W4384268367 doi "https://doi.org/10.48550/arxiv.2307.06082" @default.
- W4384268367 hasPublicationYear "2023" @default.
- W4384268367 type Work @default.
- W4384268367 citedByCount "0" @default.
- W4384268367 crossrefType "posted-content" @default.
- W4384268367 hasAuthorship W4384268367A5011749347 @default.
- W4384268367 hasAuthorship W4384268367A5023400154 @default.
- W4384268367 hasAuthorship W4384268367A5025148955 @default.
- W4384268367 hasAuthorship W4384268367A5036747250 @default.
- W4384268367 hasAuthorship W4384268367A5050195037 @default.
- W4384268367 hasAuthorship W4384268367A5066593161 @default.
- W4384268367 hasBestOaLocation W43842683671 @default.
- W4384268367 hasConcept C100609095 @default.
- W4384268367 hasConcept C103683099 @default.
- W4384268367 hasConcept C107457646 @default.
- W4384268367 hasConcept C120665830 @default.
- W4384268367 hasConcept C121332964 @default.
- W4384268367 hasConcept C123403432 @default.
- W4384268367 hasConcept C127413603 @default.
- W4384268367 hasConcept C151730666 @default.
- W4384268367 hasConcept C154945302 @default.
- W4384268367 hasConcept C201995342 @default.
- W4384268367 hasConcept C2779343474 @default.
- W4384268367 hasConcept C2780451532 @default.
- W4384268367 hasConcept C2780580889 @default.
- W4384268367 hasConcept C2780791683 @default.
- W4384268367 hasConcept C41008148 @default.
- W4384268367 hasConcept C62520636 @default.
- W4384268367 hasConcept C64754055 @default.
- W4384268367 hasConcept C86803240 @default.
- W4384268367 hasConceptScore W4384268367C100609095 @default.
- W4384268367 hasConceptScore W4384268367C103683099 @default.
- W4384268367 hasConceptScore W4384268367C107457646 @default.
- W4384268367 hasConceptScore W4384268367C120665830 @default.
- W4384268367 hasConceptScore W4384268367C121332964 @default.
- W4384268367 hasConceptScore W4384268367C123403432 @default.
- W4384268367 hasConceptScore W4384268367C127413603 @default.
- W4384268367 hasConceptScore W4384268367C151730666 @default.
- W4384268367 hasConceptScore W4384268367C154945302 @default.
- W4384268367 hasConceptScore W4384268367C201995342 @default.
- W4384268367 hasConceptScore W4384268367C2779343474 @default.
- W4384268367 hasConceptScore W4384268367C2780451532 @default.
- W4384268367 hasConceptScore W4384268367C2780580889 @default.
- W4384268367 hasConceptScore W4384268367C2780791683 @default.
- W4384268367 hasConceptScore W4384268367C41008148 @default.
- W4384268367 hasConceptScore W4384268367C62520636 @default.
- W4384268367 hasConceptScore W4384268367C64754055 @default.
- W4384268367 hasConceptScore W4384268367C86803240 @default.
- W4384268367 hasLocation W43842683671 @default.
- W4384268367 hasOpenAccess W4384268367 @default.
- W4384268367 hasPrimaryLocation W43842683671 @default.
- W4384268367 hasRelatedWork W1488803062 @default.
- W4384268367 hasRelatedWork W1522117956 @default.
- W4384268367 hasRelatedWork W1595897272 @default.
- W4384268367 hasRelatedWork W1824318071 @default.
- W4384268367 hasRelatedWork W2031296774 @default.
- W4384268367 hasRelatedWork W2068486122 @default.
- W4384268367 hasRelatedWork W2942109448 @default.
- W4384268367 hasRelatedWork W4303647895 @default.
- W4384268367 hasRelatedWork W4327926809 @default.
- W4384268367 hasRelatedWork W53200246 @default.
- W4384268367 isParatext "false" @default.
- W4384268367 isRetracted "false" @default.
- W4384268367 workType "article" @default.