Matches in SemOpenAlex for { <https://semopenalex.org/work/W4312457846> ?p ?o ?g. }
- W4312457846 abstract "Despite the potential of multi-modal pre-training to learn highly discriminative feature representations from complementary data modalities, current progress is being slowed by the lack of large-scale modality-diverse datasets. By leveraging the natural suitability of E-commerce, where different modalities capture complementary semantic information, we contribute a large-scale multi-modal pretraining dataset M5Product. The dataset comprises 5 modalities (image, text, table, video, and audio), covers over 6,000 categories and 5,000 attributes, and is 500× larger than the largest publicly available dataset with a similar number of modalities. Furthermore, M5Product contains incomplete modality pairs and noise while also having a long-tailed distribution, resembling most real-world problems. We further propose Self-harmonized ContrAstive LEarning (SCALE), a novel pretraining framework that integrates the different modalities into a unified model through an adaptive feature fusion mechanism, where the importance of each modality is learned directly from the modality embeddings and impacts the inter-modality contrastive learning and masked tasks within a multi-modal transformer model. We evaluate the current multi-modal pre-training state-of-the-art approaches and benchmark their ability to learn from unlabeled data when faced with the large number of modalities in the M5Product dataset. We conduct extensive experiments on four downstream tasks and demonstrate the superiority of our SCALE model, providing insights into the importance of dataset scale and diversity. Dataset and codes are available at https://xiaodongsuper.github.io/M5Product_dataset/." @default.
- W4312457846 created "2023-01-04" @default.
- W4312457846 creator A5015744133 @default.
- W4312457846 creator A5023225712 @default.
- W4312457846 creator A5037966954 @default.
- W4312457846 creator A5047878798 @default.
- W4312457846 creator A5049388503 @default.
- W4312457846 creator A5064374603 @default.
- W4312457846 creator A5069125276 @default.
- W4312457846 creator A5082181196 @default.
- W4312457846 creator A5087043856 @default.
- W4312457846 date "2022-06-01" @default.
- W4312457846 modified "2023-10-05" @default.
- W4312457846 title "M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining" @default.
- W4312457846 cites W1933349210 @default.
- W4312457846 cites W1974647172 @default.
- W4312457846 cites W2007972815 @default.
- W4312457846 cites W2018567642 @default.
- W4312457846 cites W2035821475 @default.
- W4312457846 cites W2103075368 @default.
- W4312457846 cites W2185175083 @default.
- W4312457846 cites W2194775991 @default.
- W4312457846 cites W2277195237 @default.
- W4312457846 cites W2425121537 @default.
- W4312457846 cites W2554234430 @default.
- W4312457846 cites W2600067905 @default.
- W4312457846 cites W2606965845 @default.
- W4312457846 cites W2606982687 @default.
- W4312457846 cites W2745461083 @default.
- W4312457846 cites W2760516584 @default.
- W4312457846 cites W2886641317 @default.
- W4312457846 cites W2952132648 @default.
- W4312457846 cites W2963530300 @default.
- W4312457846 cites W2963541336 @default.
- W4312457846 cites W2963890755 @default.
- W4312457846 cites W2970231061 @default.
- W4312457846 cites W2980316355 @default.
- W4312457846 cites W2981851019 @default.
- W4312457846 cites W2989322838 @default.
- W4312457846 cites W3035485997 @default.
- W4312457846 cites W3090449556 @default.
- W4312457846 cites W3100801259 @default.
- W4312457846 cites W3105232955 @default.
- W4312457846 cites W3176641147 @default.
- W4312457846 cites W3202384916 @default.
- W4312457846 cites W639708223 @default.
- W4312457846 doi "https://doi.org/10.1109/cvpr52688.2022.02057" @default.
- W4312457846 hasPublicationYear "2022" @default.
- W4312457846 type Work @default.
- W4312457846 citedByCount "3" @default.
- W4312457846 countsByYear W43124578462022 @default.
- W4312457846 countsByYear W43124578462023 @default.
- W4312457846 crossrefType "proceedings-article" @default.
- W4312457846 hasAuthorship W4312457846A5015744133 @default.
- W4312457846 hasAuthorship W4312457846A5023225712 @default.
- W4312457846 hasAuthorship W4312457846A5037966954 @default.
- W4312457846 hasAuthorship W4312457846A5047878798 @default.
- W4312457846 hasAuthorship W4312457846A5049388503 @default.
- W4312457846 hasAuthorship W4312457846A5064374603 @default.
- W4312457846 hasAuthorship W4312457846A5069125276 @default.
- W4312457846 hasAuthorship W4312457846A5082181196 @default.
- W4312457846 hasAuthorship W4312457846A5087043856 @default.
- W4312457846 hasBestOaLocation W43124578462 @default.
- W4312457846 hasConcept C119857082 @default.
- W4312457846 hasConcept C121332964 @default.
- W4312457846 hasConcept C13280743 @default.
- W4312457846 hasConcept C138885662 @default.
- W4312457846 hasConcept C144024400 @default.
- W4312457846 hasConcept C154945302 @default.
- W4312457846 hasConcept C185592680 @default.
- W4312457846 hasConcept C185798385 @default.
- W4312457846 hasConcept C188027245 @default.
- W4312457846 hasConcept C204321447 @default.
- W4312457846 hasConcept C205649164 @default.
- W4312457846 hasConcept C2776401178 @default.
- W4312457846 hasConcept C2778755073 @default.
- W4312457846 hasConcept C2779903281 @default.
- W4312457846 hasConcept C2780226545 @default.
- W4312457846 hasConcept C36289849 @default.
- W4312457846 hasConcept C41008148 @default.
- W4312457846 hasConcept C41895202 @default.
- W4312457846 hasConcept C59404180 @default.
- W4312457846 hasConcept C62520636 @default.
- W4312457846 hasConcept C71139939 @default.
- W4312457846 hasConcept C97931131 @default.
- W4312457846 hasConceptScore W4312457846C119857082 @default.
- W4312457846 hasConceptScore W4312457846C121332964 @default.
- W4312457846 hasConceptScore W4312457846C13280743 @default.
- W4312457846 hasConceptScore W4312457846C138885662 @default.
- W4312457846 hasConceptScore W4312457846C144024400 @default.
- W4312457846 hasConceptScore W4312457846C154945302 @default.
- W4312457846 hasConceptScore W4312457846C185592680 @default.
- W4312457846 hasConceptScore W4312457846C185798385 @default.
- W4312457846 hasConceptScore W4312457846C188027245 @default.
- W4312457846 hasConceptScore W4312457846C204321447 @default.
- W4312457846 hasConceptScore W4312457846C205649164 @default.
- W4312457846 hasConceptScore W4312457846C2776401178 @default.
- W4312457846 hasConceptScore W4312457846C2778755073 @default.
- W4312457846 hasConceptScore W4312457846C2779903281 @default.
- W4312457846 hasConceptScore W4312457846C2780226545 @default.