Matches in SemOpenAlex for { <https://semopenalex.org/work/W3183277358> ?p ?o ?g. }
- W3183277358 abstract "Self-supervised pre-training of large-scale transformer models on text corpora followed by finetuning has achieved state-of-the-art on a number of natural language processing tasks. Recently, Lu et al. (2021, arXiv:2103.05247) claimed that frozen pretrained transformers (FPTs) match or outperform training from scratch as well as unfrozen (fine-tuned) pretrained transformers in a set of transfer tasks to other modalities. In our work, we find that this result is, in fact, an artifact of not tuning the learning rates. After carefully redesigning the empirical setup, we find that when tuning learning rates properly, pretrained transformers do outperform or match training from scratch in all of our tasks, but only as long as the entire model is finetuned. Thus, while transfer from pretrained language models to other modalities does indeed provide gains and hints at exciting possibilities for future work, properly tuning hyperparameters is important for arriving at robust findings." @default.
- W3183277358 created "2021-08-02" @default.
- W3183277358 creator A5001224830 @default.
- W3183277358 creator A5059094093 @default.
- W3183277358 creator A5074182631 @default.
- W3183277358 creator A5079315903 @default.
- W3183277358 date "2021-07-26" @default.
- W3183277358 modified "2023-10-01" @default.
- W3183277358 title "Don't Sweep your Learning Rate under the Rug: A Closer Look at Cross-modal Transfer of Pretrained Transformers." @default.
- W3183277358 cites W1566289585 @default.
- W3183277358 cites W2113459411 @default.
- W3183277358 cites W2250539671 @default.
- W3183277358 cites W2611669587 @default.
- W3183277358 cites W2626778328 @default.
- W3183277358 cites W2889498145 @default.
- W3183277358 cites W2951599627 @default.
- W3183277358 cites W2963012544 @default.
- W3183277358 cites W2963168530 @default.
- W3183277358 cites W2963310665 @default.
- W3183277358 cites W2963341956 @default.
- W3183277358 cites W2963393721 @default.
- W3183277358 cites W2963748792 @default.
- W3183277358 cites W2964121744 @default.
- W3183277358 cites W2970476646 @default.
- W3183277358 cites W2980149079 @default.
- W3183277358 cites W2990704537 @default.
- W3183277358 cites W3030163527 @default.
- W3183277358 cites W3098824823 @default.
- W3183277358 cites W3118608800 @default.
- W3183277358 cites W3121592593 @default.
- W3183277358 cites W3134307371 @default.
- W3183277358 cites W3169283738 @default.
- W3183277358 hasPublicationYear "2021" @default.
- W3183277358 type Work @default.
- W3183277358 sameAs 3183277358 @default.
- W3183277358 citedByCount "0" @default.
- W3183277358 crossrefType "posted-content" @default.
- W3183277358 hasAuthorship W3183277358A5001224830 @default.
- W3183277358 hasAuthorship W3183277358A5059094093 @default.
- W3183277358 hasAuthorship W3183277358A5074182631 @default.
- W3183277358 hasAuthorship W3183277358A5079315903 @default.
- W3183277358 hasConcept C119599485 @default.
- W3183277358 hasConcept C119857082 @default.
- W3183277358 hasConcept C127413603 @default.
- W3183277358 hasConcept C144024400 @default.
- W3183277358 hasConcept C150899416 @default.
- W3183277358 hasConcept C154945302 @default.
- W3183277358 hasConcept C165801399 @default.
- W3183277358 hasConcept C185592680 @default.
- W3183277358 hasConcept C188027245 @default.
- W3183277358 hasConcept C199360897 @default.
- W3183277358 hasConcept C2779903281 @default.
- W3183277358 hasConcept C2781235140 @default.
- W3183277358 hasConcept C36289849 @default.
- W3183277358 hasConcept C41008148 @default.
- W3183277358 hasConcept C66322947 @default.
- W3183277358 hasConcept C71139939 @default.
- W3183277358 hasConcept C8642999 @default.
- W3183277358 hasConceptScore W3183277358C119599485 @default.
- W3183277358 hasConceptScore W3183277358C119857082 @default.
- W3183277358 hasConceptScore W3183277358C127413603 @default.
- W3183277358 hasConceptScore W3183277358C144024400 @default.
- W3183277358 hasConceptScore W3183277358C150899416 @default.
- W3183277358 hasConceptScore W3183277358C154945302 @default.
- W3183277358 hasConceptScore W3183277358C165801399 @default.
- W3183277358 hasConceptScore W3183277358C185592680 @default.
- W3183277358 hasConceptScore W3183277358C188027245 @default.
- W3183277358 hasConceptScore W3183277358C199360897 @default.
- W3183277358 hasConceptScore W3183277358C2779903281 @default.
- W3183277358 hasConceptScore W3183277358C2781235140 @default.
- W3183277358 hasConceptScore W3183277358C36289849 @default.
- W3183277358 hasConceptScore W3183277358C41008148 @default.
- W3183277358 hasConceptScore W3183277358C66322947 @default.
- W3183277358 hasConceptScore W3183277358C71139939 @default.
- W3183277358 hasConceptScore W3183277358C8642999 @default.
- W3183277358 hasLocation W31832773581 @default.
- W3183277358 hasOpenAccess W3183277358 @default.
- W3183277358 hasPrimaryLocation W31832773581 @default.
- W3183277358 hasRelatedWork W2911224109 @default.
- W3183277358 hasRelatedWork W2970352191 @default.
- W3183277358 hasRelatedWork W2975429091 @default.
- W3183277358 hasRelatedWork W2990812820 @default.
- W3183277358 hasRelatedWork W3023285645 @default.
- W3183277358 hasRelatedWork W3042266831 @default.
- W3183277358 hasRelatedWork W3083109176 @default.
- W3183277358 hasRelatedWork W3087148478 @default.
- W3183277358 hasRelatedWork W3092266824 @default.
- W3183277358 hasRelatedWork W3095308078 @default.
- W3183277358 hasRelatedWork W3097013020 @default.
- W3183277358 hasRelatedWork W3104215796 @default.
- W3183277358 hasRelatedWork W3126822054 @default.
- W3183277358 hasRelatedWork W3128632573 @default.
- W3183277358 hasRelatedWork W3154923735 @default.
- W3183277358 hasRelatedWork W3173617765 @default.
- W3183277358 hasRelatedWork W3194542637 @default.
- W3183277358 hasRelatedWork W3199258042 @default.
- W3183277358 hasRelatedWork W3199761064 @default.
- W3183277358 hasRelatedWork W3205616434 @default.
- W3183277358 isParatext "false" @default.
- W3183277358 isRetracted "false" @default.