Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287240280> ?p ?o ?g. }
Showing items 1 to 78 of 78, with 100 items per page.
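The listing below can be reproduced against the public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql. A minimal sketch, assuming the quad pattern `?p ?o ?g` in the header maps onto a SPARQL `GRAPH` clause (if the store serves these triples only from the default graph, drop the `GRAPH` wrapper); the `LIMIT` mirrors the page size of 100:

```sparql
PREFIX sow: <https://semopenalex.org/work/>

# Fetch every predicate/object pair for work W4287240280, together
# with the named graph (?g) each statement is stored in.
SELECT ?p ?o ?g
WHERE {
  GRAPH ?g {
    sow:W4287240280 ?p ?o .
  }
}
LIMIT 100
```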
- W4287240280 abstract "Despite exciting progress in pre-training for visual-linguistic (VL) representations, very few aspire to a small VL model. In this paper, we study knowledge distillation (KD) to effectively compress a transformer-based large VL model into a small VL model. The major challenge arises from the inconsistent regional visual tokens extracted from different detectors of Teacher and Student, resulting in the misalignment of hidden representations and attention distributions. To address the problem, we retrain and adapt the Teacher by using the same region proposals from Student's detector while the features are from Teacher's own object detector. With aligned network inputs, the adapted Teacher is capable of transferring the knowledge through the intermediate representations. Specifically, we use the mean square error loss to mimic the attention distribution inside the transformer block and present a token-wise noise contrastive loss to align the hidden state by contrasting with negative representations stored in a sample queue. To this end, we show that our proposed distillation significantly improves the performance of small VL models on image captioning and visual question answering tasks. It reaches 120.8 in CIDEr score on COCO captioning, an improvement of 5.1 over its non-distilled counterpart; and an accuracy of 69.8 on VQA 2.0, a 0.8 gain from the baseline. Our extensive experiments and ablations confirm the effectiveness of VL distillation in both pre-training and fine-tuning stages." @default.
- W4287240280 created "2022-07-25" @default.
- W4287240280 creator A5006711184 @default.
- W4287240280 creator A5016326637 @default.
- W4287240280 creator A5025592561 @default.
- W4287240280 creator A5027851405 @default.
- W4287240280 creator A5048295582 @default.
- W4287240280 creator A5073435344 @default.
- W4287240280 date "2021-04-05" @default.
- W4287240280 modified "2023-10-16" @default.
- W4287240280 title "Compressing Visual-linguistic Model via Knowledge Distillation" @default.
- W4287240280 hasPublicationYear "2021" @default.
- W4287240280 type Work @default.
- W4287240280 citedByCount "0" @default.
- W4287240280 crossrefType "posted-content" @default.
- W4287240280 hasAuthorship W4287240280A5006711184 @default.
- W4287240280 hasAuthorship W4287240280A5016326637 @default.
- W4287240280 hasAuthorship W4287240280A5025592561 @default.
- W4287240280 hasAuthorship W4287240280A5027851405 @default.
- W4287240280 hasAuthorship W4287240280A5048295582 @default.
- W4287240280 hasAuthorship W4287240280A5073435344 @default.
- W4287240280 hasBestOaLocation W42872402801 @default.
- W4287240280 hasConcept C115961682 @default.
- W4287240280 hasConcept C119599485 @default.
- W4287240280 hasConcept C119857082 @default.
- W4287240280 hasConcept C127413603 @default.
- W4287240280 hasConcept C137293760 @default.
- W4287240280 hasConcept C154945302 @default.
- W4287240280 hasConcept C157657479 @default.
- W4287240280 hasConcept C165801399 @default.
- W4287240280 hasConcept C178790620 @default.
- W4287240280 hasConcept C185592680 @default.
- W4287240280 hasConcept C204030448 @default.
- W4287240280 hasConcept C204321447 @default.
- W4287240280 hasConcept C28490314 @default.
- W4287240280 hasConcept C38652104 @default.
- W4287240280 hasConcept C41008148 @default.
- W4287240280 hasConcept C44291984 @default.
- W4287240280 hasConcept C48145219 @default.
- W4287240280 hasConcept C66322947 @default.
- W4287240280 hasConcept C76155785 @default.
- W4287240280 hasConcept C94915269 @default.
- W4287240280 hasConceptScore W4287240280C115961682 @default.
- W4287240280 hasConceptScore W4287240280C119599485 @default.
- W4287240280 hasConceptScore W4287240280C119857082 @default.
- W4287240280 hasConceptScore W4287240280C127413603 @default.
- W4287240280 hasConceptScore W4287240280C137293760 @default.
- W4287240280 hasConceptScore W4287240280C154945302 @default.
- W4287240280 hasConceptScore W4287240280C157657479 @default.
- W4287240280 hasConceptScore W4287240280C165801399 @default.
- W4287240280 hasConceptScore W4287240280C178790620 @default.
- W4287240280 hasConceptScore W4287240280C185592680 @default.
- W4287240280 hasConceptScore W4287240280C204030448 @default.
- W4287240280 hasConceptScore W4287240280C204321447 @default.
- W4287240280 hasConceptScore W4287240280C28490314 @default.
- W4287240280 hasConceptScore W4287240280C38652104 @default.
- W4287240280 hasConceptScore W4287240280C41008148 @default.
- W4287240280 hasConceptScore W4287240280C44291984 @default.
- W4287240280 hasConceptScore W4287240280C48145219 @default.
- W4287240280 hasConceptScore W4287240280C66322947 @default.
- W4287240280 hasConceptScore W4287240280C76155785 @default.
- W4287240280 hasConceptScore W4287240280C94915269 @default.
- W4287240280 hasLocation W42872402801 @default.
- W4287240280 hasOpenAccess W4287240280 @default.
- W4287240280 hasPrimaryLocation W42872402801 @default.
- W4287240280 hasRelatedWork W12168553 @default.
- W4287240280 hasRelatedWork W14230040 @default.
- W4287240280 hasRelatedWork W1745277 @default.
- W4287240280 hasRelatedWork W2308727 @default.
- W4287240280 hasRelatedWork W4629839 @default.
- W4287240280 hasRelatedWork W6143937 @default.
- W4287240280 hasRelatedWork W6657867 @default.
- W4287240280 hasRelatedWork W7401400 @default.
- W4287240280 hasRelatedWork W7946549 @default.
- W4287240280 hasRelatedWork W9280962 @default.
- W4287240280 isParatext "false" @default.
- W4287240280 isRetracted "false" @default.
- W4287240280 workType "article" @default.
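Most objects in this listing are opaque identifiers (creators, concepts, related works). As a follow-up sketch, the query below resolves the six creator IDs (A5006711184 and so on) into author names. It assumes, per the SemOpenAlex ontology, that the `creator` links shown above are `dcterms:creator` and that authors carry `foaf:name` labels; verify both against the ontology documentation before relying on them.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
PREFIX sow:     <https://semopenalex.org/work/>

# Resolve the creator IDs listed above into human-readable names.
SELECT ?author ?name
WHERE {
  sow:W4287240280 dcterms:creator ?author .
  ?author foaf:name ?name .
}
```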