Matches in SemOpenAlex for { <https://semopenalex.org/work/W3136816596> ?p ?o ?g. }
- W3136816596 abstract "Multimodal pre-training has propelled great advancement in vision-and-language research. These large-scale pre-trained models, although successful, fatefully suffer from slow inference speed due to enormous computation cost mainly from cross-modal attention in Transformer architecture. When applied to real-life applications, such latency and computation demand severely deter the practical use of pre-trained models. In this paper, we study Image-text retrieval (ITR), the most mature scenario of V+L application, which has been widely studied even prior to the emergence of recent pre-trained models. We propose a simple yet highly effective approach, LightningDOT that accelerates the inference time of ITR by thousands of times, without sacrificing accuracy. LightningDOT removes the time-consuming cross-modal attention by pre-training on three novel learning objectives, extracting feature indexes offline, and employing instant dot-product matching with further re-ranking, which significantly speeds up retrieval process. In fact, LightningDOT achieves new state of the art across multiple ITR benchmarks such as Flickr30k, COCO and Multi30K, outperforming existing pre-trained models that consume 1000x magnitude of computational hours. Code and pre-training checkpoints are available at https://github.com/intersun/LightningDOT." @default.
- W3136816596 created "2021-03-29" @default.
- W3136816596 creator A5023777406 @default.
- W3136816596 creator A5028783832 @default.
- W3136816596 creator A5034826937 @default.
- W3136816596 creator A5037467245 @default.
- W3136816596 creator A5039183544 @default.
- W3136816596 creator A5077322975 @default.
- W3136816596 date "2021-03-15" @default.
- W3136816596 modified "2023-10-17" @default.
- W3136816596 title "LightningDOT: Pre-training Visual-Semantic Embeddings for Real-Time Image-Text Retrieval" @default.
- W3136816596 cites W1527575280 @default.
- W3136816596 cites W1861492603 @default.
- W3136816596 cites W1889081078 @default.
- W3136816596 cites W1924770834 @default.
- W3136816596 cites W1933349210 @default.
- W3136816596 cites W2109586012 @default.
- W3136816596 cites W2277195237 @default.
- W3136816596 cites W2489434015 @default.
- W3136816596 cites W2606473278 @default.
- W3136816596 cites W2611029872 @default.
- W3136816596 cites W2613718673 @default.
- W3136816596 cites W2745461083 @default.
- W3136816596 cites W2886641317 @default.
- W3136816596 cites W2908510526 @default.
- W3136816596 cites W2962784628 @default.
- W3136816596 cites W2962964995 @default.
- W3136816596 cites W2963331233 @default.
- W3136816596 cites W2963341956 @default.
- W3136816596 cites W2963403868 @default.
- W3136816596 cites W2963467339 @default.
- W3136816596 cites W2963527096 @default.
- W3136816596 cites W2963778889 @default.
- W3136816596 cites W2964120214 @default.
- W3136816596 cites W2965373594 @default.
- W3136816596 cites W2968124245 @default.
- W3136816596 cites W2970231061 @default.
- W3136816596 cites W2970454332 @default.
- W3136816596 cites W2970597249 @default.
- W3136816596 cites W2970608575 @default.
- W3136816596 cites W2973978812 @default.
- W3136816596 cites W2987249037 @default.
- W3136816596 cites W2994818707 @default.
- W3136816596 cites W2996035354 @default.
- W3136816596 cites W2996428491 @default.
- W3136816596 cites W2997591391 @default.
- W3136816596 cites W2997786945 @default.
- W3136816596 cites W2998702515 @default.
- W3136816596 cites W3014611590 @default.
- W3136816596 cites W3015354748 @default.
- W3136816596 cites W3027879771 @default.
- W3136816596 cites W3034727271 @default.
- W3136816596 cites W3035552787 @default.
- W3136816596 cites W3038572442 @default.
- W3136816596 cites W3082274269 @default.
- W3136816596 cites W3087273879 @default.
- W3136816596 cites W3090449556 @default.
- W3136816596 cites W3099700870 @default.
- W3136816596 cites W3102995547 @default.
- W3136816596 cites W3105966348 @default.
- W3136816596 cites W3135367836 @default.
- W3136816596 doi "https://doi.org/10.48550/arxiv.2103.08784" @default.
- W3136816596 hasPublicationYear "2021" @default.
- W3136816596 type Work @default.
- W3136816596 sameAs 3136816596 @default.
- W3136816596 citedByCount "0" @default.
- W3136816596 crossrefType "posted-content" @default.
- W3136816596 hasAuthorship W3136816596A5023777406 @default.
- W3136816596 hasAuthorship W3136816596A5028783832 @default.
- W3136816596 hasAuthorship W3136816596A5034826937 @default.
- W3136816596 hasAuthorship W3136816596A5037467245 @default.
- W3136816596 hasAuthorship W3136816596A5039183544 @default.
- W3136816596 hasAuthorship W3136816596A5077322975 @default.
- W3136816596 hasBestOaLocation W31368165961 @default.
- W3136816596 hasConcept C111919701 @default.
- W3136816596 hasConcept C11413529 @default.
- W3136816596 hasConcept C119857082 @default.
- W3136816596 hasConcept C121332964 @default.
- W3136816596 hasConcept C138885662 @default.
- W3136816596 hasConcept C154945302 @default.
- W3136816596 hasConcept C165696696 @default.
- W3136816596 hasConcept C165801399 @default.
- W3136816596 hasConcept C177264268 @default.
- W3136816596 hasConcept C185592680 @default.
- W3136816596 hasConcept C188027245 @default.
- W3136816596 hasConcept C199360897 @default.
- W3136816596 hasConcept C2776214188 @default.
- W3136816596 hasConcept C2776401178 @default.
- W3136816596 hasConcept C2776760102 @default.
- W3136816596 hasConcept C38652104 @default.
- W3136816596 hasConcept C41008148 @default.
- W3136816596 hasConcept C41895202 @default.
- W3136816596 hasConcept C45374587 @default.
- W3136816596 hasConcept C62520636 @default.
- W3136816596 hasConcept C66322947 @default.
- W3136816596 hasConcept C68339613 @default.
- W3136816596 hasConcept C71139939 @default.
- W3136816596 hasConceptScore W3136816596C111919701 @default.
- W3136816596 hasConceptScore W3136816596C11413529 @default.
- W3136816596 hasConceptScore W3136816596C119857082 @default.
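
The abstract recorded for W3136816596 describes retrieval in three steps: extracting feature indexes offline, dot-product matching at query time, and re-ranking the shortlisted candidates. The following is a minimal Python sketch of that general pattern, not the authors' implementation. The toy two-dimensional embeddings, the function names (`build_index`, `retrieve`, `rerank`), and the `toy_rerank_score` function are all hypothetical stand-ins introduced for illustration; a real system would use learned image and text encoders and a stronger scoring model for the re-ranking step.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def build_index(image_vectors):
    """Offline step: store precomputed image embeddings keyed by image id."""
    return dict(image_vectors)

def retrieve(query_vec, index, k):
    """Online step: score every indexed image by dot product, keep the top k ids."""
    ranked = sorted(index.items(),
                    key=lambda item: dot(query_vec, item[1]),
                    reverse=True)
    return [image_id for image_id, _ in ranked[:k]]

def rerank(query_vec, candidates, index, score_fn):
    """Re-order only the shortlisted candidates with a (slower) scoring function."""
    return sorted(candidates,
                  key=lambda image_id: score_fn(query_vec, index[image_id]),
                  reverse=True)

def toy_rerank_score(query_vec, image_vec):
    """Hypothetical re-ranker: negative squared distance (assumption, for demo only)."""
    return -sum((a - b) ** 2 for a, b in zip(query_vec, image_vec))

# Toy data: three "images" with hand-written embeddings (assumed values).
index = build_index([
    ("img_a", [1.0, 0.0]),
    ("img_b", [0.6, 0.8]),
    ("img_c", [0.0, 1.0]),
])
query = [0.9, 0.1]  # stand-in for a text query embedding

shortlist = retrieve(query, index, k=2)
final = rerank(query, shortlist, index, toy_rerank_score)
print(shortlist, final)
```

Because the image embeddings are computed once and stored, the per-query cost of the first stage is one dot product per indexed image; the more expensive scorer only sees the `k` shortlisted candidates.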