Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387635823> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4387635823 abstract "The reasoning capabilities of LLM (Large Language Model) are widely acknowledged in recent research, inspiring studies on tool learning and autonomous agents. LLM serves as the brain of agent, orchestrating multiple tools for collaborative multi-step task solving. Unlike methods invoking tools like calculators or weather APIs for straightforward tasks, multi-modal agents excel by integrating diverse AI models for complex challenges. However, current multi-modal agents neglect the significance of model selection: they primarily focus on the planning and execution phases, and will only invoke predefined task-specific models for each subtask, making the execution fragile. Meanwhile, other traditional model selection methods are either incompatible with or suboptimal for the multi-modal agent scenarios, due to ignorance of dependencies among subtasks arising by multi-step reasoning. To this end, we identify the key challenges therein and propose the $\textit{M}^3$ framework as a plug-in with negligible runtime overhead at test-time. This framework improves model selection and bolsters the robustness of multi-modal agents in multi-step reasoning. In the absence of suitable benchmarks, we create MS-GQA, a new dataset specifically designed to investigate the model selection challenge in multi-modal agents. Our experiments reveal that our framework enables dynamic model selection, considering both user inputs and subtask dependencies, thereby robustifying the overall reasoning process. Our code and benchmark: https://github.com/LINs-lab/M3." @default.
- W4387635823 created "2023-10-14" @default.
- W4387635823 creator A5029290306 @default.
- W4387635823 creator A5050008056 @default.
- W4387635823 creator A5077696518 @default.
- W4387635823 creator A5087545659 @default.
- W4387635823 date "2023-10-12" @default.
- W4387635823 modified "2023-10-15" @default.
- W4387635823 title "Towards Robust Multi-Modal Reasoning via Model Selection" @default.
- W4387635823 doi "https://doi.org/10.48550/arxiv.2310.08446" @default.
- W4387635823 hasPublicationYear "2023" @default.
- W4387635823 type Work @default.
- W4387635823 citedByCount "0" @default.
- W4387635823 crossrefType "posted-content" @default.
- W4387635823 hasAuthorship W4387635823A5029290306 @default.
- W4387635823 hasAuthorship W4387635823A5050008056 @default.
- W4387635823 hasAuthorship W4387635823A5077696518 @default.
- W4387635823 hasAuthorship W4387635823A5087545659 @default.
- W4387635823 hasBestOaLocation W43876358231 @default.
- W4387635823 hasConcept C104317684 @default.
- W4387635823 hasConcept C119857082 @default.
- W4387635823 hasConcept C127413603 @default.
- W4387635823 hasConcept C13280743 @default.
- W4387635823 hasConcept C154945302 @default.
- W4387635823 hasConcept C185592680 @default.
- W4387635823 hasConcept C185798385 @default.
- W4387635823 hasConcept C188027245 @default.
- W4387635823 hasConcept C199360897 @default.
- W4387635823 hasConcept C201995342 @default.
- W4387635823 hasConcept C205649164 @default.
- W4387635823 hasConcept C2780451532 @default.
- W4387635823 hasConcept C41008148 @default.
- W4387635823 hasConcept C55493867 @default.
- W4387635823 hasConcept C63479239 @default.
- W4387635823 hasConcept C71139939 @default.
- W4387635823 hasConcept C81917197 @default.
- W4387635823 hasConcept C98045186 @default.
- W4387635823 hasConceptScore W4387635823C104317684 @default.
- W4387635823 hasConceptScore W4387635823C119857082 @default.
- W4387635823 hasConceptScore W4387635823C127413603 @default.
- W4387635823 hasConceptScore W4387635823C13280743 @default.
- W4387635823 hasConceptScore W4387635823C154945302 @default.
- W4387635823 hasConceptScore W4387635823C185592680 @default.
- W4387635823 hasConceptScore W4387635823C185798385 @default.
- W4387635823 hasConceptScore W4387635823C188027245 @default.
- W4387635823 hasConceptScore W4387635823C199360897 @default.
- W4387635823 hasConceptScore W4387635823C201995342 @default.
- W4387635823 hasConceptScore W4387635823C205649164 @default.
- W4387635823 hasConceptScore W4387635823C2780451532 @default.
- W4387635823 hasConceptScore W4387635823C41008148 @default.
- W4387635823 hasConceptScore W4387635823C55493867 @default.
- W4387635823 hasConceptScore W4387635823C63479239 @default.
- W4387635823 hasConceptScore W4387635823C71139939 @default.
- W4387635823 hasConceptScore W4387635823C81917197 @default.
- W4387635823 hasConceptScore W4387635823C98045186 @default.
- W4387635823 hasLocation W43876358231 @default.
- W4387635823 hasOpenAccess W4387635823 @default.
- W4387635823 hasPrimaryLocation W43876358231 @default.
- W4387635823 hasRelatedWork W2028665553 @default.
- W4387635823 hasRelatedWork W2086519370 @default.
- W4387635823 hasRelatedWork W2087343574 @default.
- W4387635823 hasRelatedWork W2105860728 @default.
- W4387635823 hasRelatedWork W2130974462 @default.
- W4387635823 hasRelatedWork W2378211422 @default.
- W4387635823 hasRelatedWork W2389015757 @default.
- W4387635823 hasRelatedWork W2535915176 @default.
- W4387635823 hasRelatedWork W4321353415 @default.
- W4387635823 hasRelatedWork W972276598 @default.
- W4387635823 isParatext "false" @default.
- W4387635823 isRetracted "false" @default.
- W4387635823 workType "article" @default.