Matches in SemOpenAlex for { <https://semopenalex.org/work/W4378086438> ?p ?o ?g. }
- W4378086438 endingPage "382" @default.
- W4378086438 startingPage "313" @default.
- W4378086438 abstract "Foundation Models are able to model not only tokens of natural language but also token elements of arbitrary sequences. For images, square image patches can be represented as tokens; for videos, we can define tubelets that span an image patch across multiple frames. Subsequently, the proven self-attention algorithms can be applied to these tokens. Most importantly, several modalities like text and images can be processed in the same sequence allowing, for instance, the generation of images from text and text descriptions from video. In addition, the models are scalable to very large networks and huge datasets. The following multimedia types are covered in the subsequent sections. Speech recognition and text-to-speech models describe the translation of spoken language into text and vice versa. Image processing has the task to interpret images, describe them by captions, and generate new images according to textual descriptions. Video interpretation aims at recognizing action in videos and describing them through text. Furthermore, new videos can be created according to a textual description. Dynamical system trajectories characterize sequential decision problems, which can be simulated and controlled. DNA and protein sequences can be analyzed with Foundation Models to predict the structure and properties of the corresponding molecules." @default.
- W4378086438 created "2023-05-25" @default.
- W4378086438 creator A5025445408 @default.
- W4378086438 creator A5080184077 @default.
- W4378086438 date "2023-01-01" @default.
- W4378086438 modified "2023-09-29" @default.
- W4378086438 title "Foundation Models for Speech, Images, Videos, and Control" @default.
- W4378086438 cites W1494198834 @default.
- W4378086438 cites W1901129140 @default.
- W4378086438 cites W1956340063 @default.
- W4378086438 cites W2127141656 @default.
- W4378086438 cites W2183341477 @default.
- W4378086438 cites W2194775991 @default.
- W4378086438 cites W2425121537 @default.
- W4378086438 cites W2489434015 @default.
- W4378086438 cites W2549139847 @default.
- W4378086438 cites W2560730294 @default.
- W4378086438 cites W2625366777 @default.
- W4378086438 cites W2886641317 @default.
- W4378086438 cites W2896348597 @default.
- W4378086438 cites W2903739847 @default.
- W4378086438 cites W2916979304 @default.
- W4378086438 cites W2936774411 @default.
- W4378086438 cites W2962711930 @default.
- W4378086438 cites W2962785568 @default.
- W4378086438 cites W2962843773 @default.
- W4378086438 cites W2962892438 @default.
- W4378086438 cites W2962974533 @default.
- W4378086438 cites W2963300588 @default.
- W4378086438 cites W2964243274 @default.
- W4378086438 cites W2981851019 @default.
- W4378086438 cites W2984008963 @default.
- W4378086438 cites W3026041220 @default.
- W4378086438 cites W3035029089 @default.
- W4378086438 cites W3035574324 @default.
- W4378086438 cites W3041561163 @default.
- W4378086438 cites W3091588028 @default.
- W4378086438 cites W3096109555 @default.
- W4378086438 cites W3097777922 @default.
- W4378086438 cites W3100732527 @default.
- W4378086438 cites W3103780890 @default.
- W4378086438 cites W3104152799 @default.
- W4378086438 cites W3106784008 @default.
- W4378086438 cites W3126942607 @default.
- W4378086438 cites W3127238141 @default.
- W4378086438 cites W3133511908 @default.
- W4378086438 cites W3138516171 @default.
- W4378086438 cites W3139408490 @default.
- W4378086438 cites W3142316150 @default.
- W4378086438 cites W3144701084 @default.
- W4378086438 cites W3148101939 @default.
- W4378086438 cites W3160525311 @default.
- W4378086438 cites W3161109662 @default.
- W4378086438 cites W3167366812 @default.
- W4378086438 cites W3170405112 @default.
- W4378086438 cites W3172942063 @default.
- W4378086438 cites W3173220247 @default.
- W4378086438 cites W3173241699 @default.
- W4378086438 cites W3173290664 @default.
- W4378086438 cites W3174476431 @default.
- W4378086438 cites W3174525637 @default.
- W4378086438 cites W3176196997 @default.
- W4378086438 cites W3180059462 @default.
- W4378086438 cites W3180355996 @default.
- W4378086438 cites W3186179742 @default.
- W4378086438 cites W3195809285 @default.
- W4378086438 cites W3196936439 @default.
- W4378086438 cites W3202536355 @default.
- W4378086438 cites W3203949114 @default.
- W4378086438 cites W3206384369 @default.
- W4378086438 cites W3207290297 @default.
- W4378086438 cites W3207758636 @default.
- W4378086438 cites W3207918547 @default.
- W4378086438 cites W3209059054 @default.
- W4378086438 cites W3212516020 @default.
- W4378086438 cites W3213191779 @default.
- W4378086438 cites W3213549365 @default.
- W4378086438 cites W3215495615 @default.
- W4378086438 cites W4200500394 @default.
- W4378086438 cites W4206706211 @default.
- W4378086438 cites W4210352519 @default.
- W4378086438 cites W4225013417 @default.
- W4378086438 cites W4225323055 @default.
- W4378086438 cites W4225683910 @default.
- W4378086438 cites W4226033575 @default.
- W4378086438 cites W4281485151 @default.
- W4378086438 cites W4282963182 @default.
- W4378086438 cites W4288421316 @default.
- W4378086438 cites W4297841719 @default.
- W4378086438 cites W4312044727 @default.
- W4378086438 cites W4312358791 @default.
- W4378086438 cites W4312372834 @default.
- W4378086438 cites W4312407537 @default.
- W4378086438 cites W4312658081 @default.
- W4378086438 cites W4312933868 @default.
- W4378086438 cites W4313069509 @default.
- W4378086438 cites W639708223 @default.
- W4378086438 doi "https://doi.org/10.1007/978-3-031-23190-2_7" @default.