Matches in SemOpenAlex for { <https://semopenalex.org/work/W3177777394> ?p ?o ?g. }
- W3177777394 abstract "Since visual perception can give rich information beyond text descriptions for world understanding, there has been increasing interest in leveraging visual grounding for language learning. Recently, vokenization (Tan and Bansal, 2020) has attracted attention by using the predictions of a text-to-image retrieval model as labels for language model supervision. Despite its success, the method suffers from approximation error of using finite image labels and the lack of vocabulary diversity of a small image-text dataset. To overcome these limitations, we present VidLanKD, a video-language knowledge distillation method for improving language understanding. We train a multi-modal teacher model on a video-text dataset, and then transfer its knowledge to a student language model with a text dataset. To avoid approximation error, we propose to use different knowledge distillation objectives. In addition, the use of a large-scale video-text dataset helps learn diverse and richer vocabularies. In our experiments, VidLanKD achieves consistent improvements over text-only language models and vokenization models, on several downstream language understanding tasks including GLUE, SQuAD, and SWAG. We also demonstrate the improved world knowledge, physical reasoning, and temporal reasoning capabilities of our model by evaluating on the GLUE-diagnostics, PIQA, and TRACIE datasets. Lastly, we present comprehensive ablation studies as well as visualizations of the learned text-to-video grounding results of our teacher and student language models. Our code and models are available at: https://github.com/zinengtang/VidLanKD" @default.
- W3177777394 created "2021-07-19" @default.
- W3177777394 creator A5001987532 @default.
- W3177777394 creator A5022885986 @default.
- W3177777394 creator A5052864910 @default.
- W3177777394 creator A5074267258 @default.
- W3177777394 date "2021-07-06" @default.
- W3177777394 modified "2023-09-27" @default.
- W3177777394 title "VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer" @default.
- W3177777394 cites W1821462560 @default.
- W3177777394 cites W1843891098 @default.
- W3177777394 cites W1861492603 @default.
- W3177777394 cites W1889081078 @default.
- W3177777394 cites W1927052826 @default.
- W3177777394 cites W2048343491 @default.
- W3177777394 cites W2108598243 @default.
- W3177777394 cites W2126579184 @default.
- W3177777394 cites W2194775991 @default.
- W3177777394 cites W2212660284 @default.
- W3177777394 cites W2250976127 @default.
- W3177777394 cites W2342225866 @default.
- W3177777394 cites W24089286 @default.
- W3177777394 cites W2425121537 @default.
- W3177777394 cites W2427527485 @default.
- W3177777394 cites W2463507112 @default.
- W3177777394 cites W2508418541 @default.
- W3177777394 cites W2606974598 @default.
- W3177777394 cites W2620998106 @default.
- W3177777394 cites W2731516819 @default.
- W3177777394 cites W2787017828 @default.
- W3177777394 cites W2787560479 @default.
- W3177777394 cites W2808847742 @default.
- W3177777394 cites W2883429621 @default.
- W3177777394 cites W2886424491 @default.
- W3177777394 cites W2896457183 @default.
- W3177777394 cites W2899771611 @default.
- W3177777394 cites W2908510526 @default.
- W3177777394 cites W2931316642 @default.
- W3177777394 cites W2944815030 @default.
- W3177777394 cites W2948859046 @default.
- W3177777394 cites W2950761309 @default.
- W3177777394 cites W2952132648 @default.
- W3177777394 cites W2962934715 @default.
- W3177777394 cites W2962982906 @default.
- W3177777394 cites W2963264012 @default.
- W3177777394 cites W2963310665 @default.
- W3177777394 cites W2963323070 @default.
- W3177777394 cites W2963403868 @default.
- W3177777394 cites W2963468606 @default.
- W3177777394 cites W2963494889 @default.
- W3177777394 cites W2963846996 @default.
- W3177777394 cites W2964121744 @default.
- W3177777394 cites W2965373594 @default.
- W3177777394 cites W2970231061 @default.
- W3177777394 cites W2970364616 @default.
- W3177777394 cites W2970597249 @default.
- W3177777394 cites W2970608575 @default.
- W3177777394 cites W2971274815 @default.
- W3177777394 cites W2981694290 @default.
- W3177777394 cites W2995558462 @default.
- W3177777394 cites W2995607862 @default.
- W3177777394 cites W2996035354 @default.
- W3177777394 cites W2996428491 @default.
- W3177777394 cites W2997006708 @default.
- W3177777394 cites W2997591391 @default.
- W3177777394 cites W2998356391 @default.
- W3177777394 cites W2998617917 @default.
- W3177777394 cites W3005700362 @default.
- W3177777394 cites W3020712669 @default.
- W3177777394 cites W3023633125 @default.
- W3177777394 cites W3034723486 @default.
- W3177777394 cites W3034978746 @default.
- W3177777394 cites W3035390927 @default.
- W3177777394 cites W3035635319 @default.
- W3177777394 cites W3082274269 @default.
- W3177777394 cites W3092787421 @default.
- W3177777394 cites W3094502228 @default.
- W3177777394 cites W3101821705 @default.
- W3177777394 cites W3104033643 @default.
- W3177777394 cites W3110909889 @default.
- W3177777394 cites W3135367836 @default.
- W3177777394 cites W3167366812 @default.
- W3177777394 cites W3169993339 @default.
- W3177777394 cites W3177224328 @default.
- W3177777394 cites W3188447078 @default.
- W3177777394 cites W753847829 @default.
- W3177777394 hasPublicationYear "2021" @default.
- W3177777394 type Work @default.
- W3177777394 sameAs 3177777394 @default.
- W3177777394 citedByCount "0" @default.
- W3177777394 crossrefType "posted-content" @default.
- W3177777394 hasAuthorship W3177777394A5001987532 @default.
- W3177777394 hasAuthorship W3177777394A5022885986 @default.
- W3177777394 hasAuthorship W3177777394A5052864910 @default.
- W3177777394 hasAuthorship W3177777394A5074267258 @default.
- W3177777394 hasBestOaLocation W31777773941 @default.
- W3177777394 hasConcept C119857082 @default.
- W3177777394 hasConcept C137293760 @default.
- W3177777394 hasConcept C138885662 @default.
- W3177777394 hasConcept C154945302 @default.