Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387294588> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W4387294588 abstract "Large multimodal models (LMMs) extend large language models (LLMs) with multi-sensory skills, such as visual understanding, to achieve stronger generic intelligence. In this paper, we analyze the latest model, GPT-4V(ision), to deepen the understanding of LMMs. The analysis focuses on the intriguing tasks that GPT-4V can perform, containing test samples to probe the quality and genericity of GPT-4V's capabilities, its supported inputs and working modes, and the effective ways to prompt the model. In our approach to exploring GPT-4V, we curate and organize a collection of carefully designed qualitative samples spanning a variety of domains and tasks. Observations from these samples demonstrate that GPT-4V's unprecedented ability in processing arbitrarily interleaved multimodal inputs and the genericity of its capabilities together make GPT-4V a powerful multimodal generalist system. Furthermore, GPT-4V's unique capability of understanding visual markers drawn on input images can give rise to new human-computer interaction methods such as visual referring prompting. We conclude the report with in-depth discussions on the emerging application scenarios and the future research directions for GPT-4V-based systems. We hope that this preliminary exploration will inspire future research on the next-generation multimodal task formulation, new ways to exploit and enhance LMMs to solve real-world problems, and gaining better understanding of multimodal foundation models. Finally, we acknowledge that the model under our study is solely the product of OpenAI's innovative work, and they should be fully credited for its development. Please see the GPT-4V contributions paper for the authorship and credit attribution: https://cdn.openai.com/contributions/gpt-4v.pdf" @default.
- W4387294588 created "2023-10-03" @default.
- W4387294588 creator A5025592561 @default.
- W4387294588 creator A5028783832 @default.
- W4387294588 creator A5048295582 @default.
- W4387294588 creator A5050209478 @default.
- W4387294588 creator A5057643713 @default.
- W4387294588 creator A5065235694 @default.
- W4387294588 creator A5074764224 @default.
- W4387294588 date "2023-09-29" @default.
- W4387294588 modified "2023-10-14" @default.
- W4387294588 title "The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision)" @default.
- W4387294588 doi "https://doi.org/10.48550/arxiv.2309.17421" @default.
- W4387294588 hasPublicationYear "2023" @default.
- W4387294588 type Work @default.
- W4387294588 citedByCount "0" @default.
- W4387294588 crossrefType "posted-content" @default.
- W4387294588 hasAuthorship W4387294588A5025592561 @default.
- W4387294588 hasAuthorship W4387294588A5028783832 @default.
- W4387294588 hasAuthorship W4387294588A5048295582 @default.
- W4387294588 hasAuthorship W4387294588A5050209478 @default.
- W4387294588 hasAuthorship W4387294588A5057643713 @default.
- W4387294588 hasAuthorship W4387294588A5065235694 @default.
- W4387294588 hasAuthorship W4387294588A5074764224 @default.
- W4387294588 hasBestOaLocation W43872945881 @default.
- W4387294588 hasConcept C111472728 @default.
- W4387294588 hasConcept C119857082 @default.
- W4387294588 hasConcept C127413603 @default.
- W4387294588 hasConcept C136197465 @default.
- W4387294588 hasConcept C138885662 @default.
- W4387294588 hasConcept C154945302 @default.
- W4387294588 hasConcept C165696696 @default.
- W4387294588 hasConcept C201995342 @default.
- W4387294588 hasConcept C2522767166 @default.
- W4387294588 hasConcept C2779530757 @default.
- W4387294588 hasConcept C2780451532 @default.
- W4387294588 hasConcept C38652104 @default.
- W4387294588 hasConcept C41008148 @default.
- W4387294588 hasConceptScore W4387294588C111472728 @default.
- W4387294588 hasConceptScore W4387294588C119857082 @default.
- W4387294588 hasConceptScore W4387294588C127413603 @default.
- W4387294588 hasConceptScore W4387294588C136197465 @default.
- W4387294588 hasConceptScore W4387294588C138885662 @default.
- W4387294588 hasConceptScore W4387294588C154945302 @default.
- W4387294588 hasConceptScore W4387294588C165696696 @default.
- W4387294588 hasConceptScore W4387294588C201995342 @default.
- W4387294588 hasConceptScore W4387294588C2522767166 @default.
- W4387294588 hasConceptScore W4387294588C2779530757 @default.
- W4387294588 hasConceptScore W4387294588C2780451532 @default.
- W4387294588 hasConceptScore W4387294588C38652104 @default.
- W4387294588 hasConceptScore W4387294588C41008148 @default.
- W4387294588 hasLocation W43872945881 @default.
- W4387294588 hasOpenAccess W4387294588 @default.
- W4387294588 hasPrimaryLocation W43872945881 @default.
- W4387294588 hasRelatedWork W1496222301 @default.
- W4387294588 hasRelatedWork W1590307681 @default.
- W4387294588 hasRelatedWork W1671936420 @default.
- W4387294588 hasRelatedWork W2353836703 @default.
- W4387294588 hasRelatedWork W2358353312 @default.
- W4387294588 hasRelatedWork W3207760230 @default.
- W4387294588 hasRelatedWork W41015297 @default.
- W4387294588 hasRelatedWork W4280645561 @default.
- W4387294588 hasRelatedWork W4285370786 @default.
- W4387294588 hasRelatedWork W4312814274 @default.
- W4387294588 isParatext "false" @default.
- W4387294588 isRetracted "false" @default.
- W4387294588 workType "article" @default.