Matches in SemOpenAlex for { <https://semopenalex.org/work/W4367367040> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W4367367040 abstract "Large language models (LLMs) have demonstrated impressive zero-shot abilities on a variety of open-ended tasks, while recent research has also explored the use of LLMs for multi-modal generation. In this study, we introduce mPLUG-Owl, a novel training paradigm that equips LLMs with multi-modal abilities through modularized learning of foundation LLM, a visual knowledge module, and a visual abstractor module. This approach can support multiple modalities and facilitate diverse unimodal and multimodal abilities through modality collaboration. The training paradigm of mPLUG-Owl involves a two-stage method for aligning image and text, which learns visual knowledge with the assistance of LLM while maintaining and even improving the generation abilities of LLM. In the first stage, the visual knowledge module and abstractor module are trained with a frozen LLM module to align the image and text. In the second stage, language-only and multi-modal supervised datasets are used to jointly fine-tune a low-rank adaption (LoRA) module on LLM and the abstractor module by freezing the visual knowledge module. We carefully build a visually-related instruction evaluation set OwlEval. Experimental results show that our model outperforms existing multi-modal models, demonstrating mPLUG-Owl's impressive instruction and visual understanding ability, multi-turn conversation ability, and knowledge reasoning ability. Besides, we observe some unexpected and exciting abilities such as multi-image correlation and scene text understanding, which makes it possible to leverage it for harder real scenarios, such as vision-only document comprehension. Our code, pre-trained model, instruction-tuned models, and evaluation set are available at https://github.com/X-PLUG/mPLUG-Owl. The online demo is available at https://www.modelscope.cn/studios/damo/mPLUG-Owl." @default.
- W4367367040 created "2023-04-30" @default.
- W4367367040 creator A5005965903 @default.
- W4367367040 creator A5007613197 @default.
- W4367367040 creator A5010446607 @default.
- W4367367040 creator A5013145898 @default.
- W4367367040 creator A5019498452 @default.
- W4367367040 creator A5019915855 @default.
- W4367367040 creator A5028401090 @default.
- W4367367040 creator A5033665900 @default.
- W4367367040 creator A5041344732 @default.
- W4367367040 creator A5047337082 @default.
- W4367367040 creator A5055809682 @default.
- W4367367040 creator A5060065671 @default.
- W4367367040 creator A5061127174 @default.
- W4367367040 creator A5065789784 @default.
- W4367367040 creator A5084741576 @default.
- W4367367040 creator A5091465907 @default.
- W4367367040 creator A5091560903 @default.
- W4367367040 date "2023-04-27" @default.
- W4367367040 modified "2023-09-27" @default.
- W4367367040 title "mPLUG-Owl: Modularization Empowers Large Language Models with Multimodality" @default.
- W4367367040 doi "https://doi.org/10.48550/arxiv.2304.14178" @default.
- W4367367040 hasPublicationYear "2023" @default.
- W4367367040 type Work @default.
- W4367367040 citedByCount "0" @default.
- W4367367040 crossrefType "posted-content" @default.
- W4367367040 hasAuthorship W4367367040A5005965903 @default.
- W4367367040 hasAuthorship W4367367040A5007613197 @default.
- W4367367040 hasAuthorship W4367367040A5010446607 @default.
- W4367367040 hasAuthorship W4367367040A5013145898 @default.
- W4367367040 hasAuthorship W4367367040A5019498452 @default.
- W4367367040 hasAuthorship W4367367040A5019915855 @default.
- W4367367040 hasAuthorship W4367367040A5028401090 @default.
- W4367367040 hasAuthorship W4367367040A5033665900 @default.
- W4367367040 hasAuthorship W4367367040A5041344732 @default.
- W4367367040 hasAuthorship W4367367040A5047337082 @default.
- W4367367040 hasAuthorship W4367367040A5055809682 @default.
- W4367367040 hasAuthorship W4367367040A5060065671 @default.
- W4367367040 hasAuthorship W4367367040A5061127174 @default.
- W4367367040 hasAuthorship W4367367040A5065789784 @default.
- W4367367040 hasAuthorship W4367367040A5084741576 @default.
- W4367367040 hasAuthorship W4367367040A5091465907 @default.
- W4367367040 hasAuthorship W4367367040A5091560903 @default.
- W4367367040 hasBestOaLocation W43673670401 @default.
- W4367367040 hasConcept C138885662 @default.
- W4367367040 hasConcept C144024400 @default.
- W4367367040 hasConcept C153083717 @default.
- W4367367040 hasConcept C154945302 @default.
- W4367367040 hasConcept C185592680 @default.
- W4367367040 hasConcept C188027245 @default.
- W4367367040 hasConcept C204321447 @default.
- W4367367040 hasConcept C2777200299 @default.
- W4367367040 hasConcept C2779903281 @default.
- W4367367040 hasConcept C36289849 @default.
- W4367367040 hasConcept C41008148 @default.
- W4367367040 hasConcept C41895202 @default.
- W4367367040 hasConcept C71139939 @default.
- W4367367040 hasConceptScore W4367367040C138885662 @default.
- W4367367040 hasConceptScore W4367367040C144024400 @default.
- W4367367040 hasConceptScore W4367367040C153083717 @default.
- W4367367040 hasConceptScore W4367367040C154945302 @default.
- W4367367040 hasConceptScore W4367367040C185592680 @default.
- W4367367040 hasConceptScore W4367367040C188027245 @default.
- W4367367040 hasConceptScore W4367367040C204321447 @default.
- W4367367040 hasConceptScore W4367367040C2777200299 @default.
- W4367367040 hasConceptScore W4367367040C2779903281 @default.
- W4367367040 hasConceptScore W4367367040C36289849 @default.
- W4367367040 hasConceptScore W4367367040C41008148 @default.
- W4367367040 hasConceptScore W4367367040C41895202 @default.
- W4367367040 hasConceptScore W4367367040C71139939 @default.
- W4367367040 hasLocation W43673670401 @default.
- W4367367040 hasOpenAccess W4367367040 @default.
- W4367367040 hasPrimaryLocation W43673670401 @default.
- W4367367040 hasRelatedWork W2073045008 @default.
- W4367367040 hasRelatedWork W2274593003 @default.
- W4367367040 hasRelatedWork W2368651715 @default.
- W4367367040 hasRelatedWork W2496949096 @default.
- W4367367040 hasRelatedWork W2611614995 @default.
- W4367367040 hasRelatedWork W3007282427 @default.
- W4367367040 hasRelatedWork W3034860057 @default.
- W4367367040 hasRelatedWork W3107474891 @default.
- W4367367040 hasRelatedWork W4287722271 @default.
- W4367367040 hasRelatedWork W1853743098 @default.
- W4367367040 isParatext "false" @default.
- W4367367040 isRetracted "false" @default.
- W4367367040 workType "article" @default.
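
The two-stage training schedule described in the abstract above can be sketched as a table of which components receive updates in each stage. This is a minimal illustration, not the authors' implementation: the module names (`visual_knowledge`, `visual_abstractor`, `llm_base`, `llm_lora`) are hypothetical labels chosen here, and keeping the base LLM weights frozen in stage 2 is an assumption drawn from how LoRA is normally applied rather than something the abstract states outright.

```python
# Sketch of the two-stage freeze/train schedule from the abstract.
# All module names are hypothetical labels, not identifiers from the mPLUG-Owl code.
# True means the component is updated in that stage; False means it is frozen or unused.
STAGES = {
    # Stage 1: visual knowledge module and abstractor are trained; the LLM is frozen.
    1: {
        "visual_knowledge": True,
        "visual_abstractor": True,
        "llm_base": False,
        "llm_lora": False,  # no LoRA module is trained in this stage
    },
    # Stage 2: LoRA on the LLM and the abstractor are fine-tuned jointly;
    # the visual knowledge module is frozen.
    2: {
        "visual_knowledge": False,
        "visual_abstractor": True,
        "llm_base": False,  # assumption: base LLM weights stay frozen under LoRA
        "llm_lora": True,
    },
}


def trainable_modules(stage):
    """Return the sorted names of the components updated in the given stage."""
    return sorted(name for name, trains in STAGES[stage].items() if trains)


if __name__ == "__main__":
    for stage in (1, 2):
        print(f"stage {stage}: {trainable_modules(stage)}")
```

In a real training loop, a table like this would typically drive which parameters have gradients enabled before each stage begins.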