Matches in SemOpenAlex for { <https://semopenalex.org/work/W3162276117> ?p ?o ?g. }
- W3162276117 abstract "In the era of pre-trained language models, Transformers are the de facto choice of model architectures. While recent research has shown promise in entirely convolutional, or CNN, architectures, they have not been explored using the pre-train-fine-tune paradigm. In the context of language models, are convolutional models competitive to Transformers when pre-trained? This paper investigates this research question and presents several interesting findings. Across an extensive set of experiments on 8 datasets/tasks, we find that CNN-based pre-trained models are competitive and outperform their Transformer counterpart in certain scenarios, albeit with caveats. Overall, the findings outlined in this paper suggest that conflating pre-training and architectural advances is misguided and that both advances should be considered independently. We believe our research paves the way for a healthy amount of optimism in alternative architectures." @default.
- W3162276117 created "2021-05-24" @default.
- W3162276117 creator A5000115067 @default.
- W3162276117 creator A5008444234 @default.
- W3162276117 creator A5031888985 @default.
- W3162276117 creator A5036477705 @default.
- W3162276117 creator A5057786383 @default.
- W3162276117 creator A5058345225 @default.
- W3162276117 creator A5080286262 @default.
- W3162276117 date "2021-05-07" @default.
- W3162276117 modified "2023-10-18" @default.
- W3162276117 title "Are Pre-trained Convolutions Better than Pre-trained Transformers?" @default.
- W3162276117 cites W1832693441 @default.
- W3162276117 cites W2064675550 @default.
- W3162276117 cites W2070246124 @default.
- W3162276117 cites W2113459411 @default.
- W3162276117 cites W2170240176 @default.
- W3162276117 cites W2250539671 @default.
- W3162276117 cites W2251939518 @default.
- W3162276117 cites W2427527485 @default.
- W3162276117 cites W2531409750 @default.
- W3162276117 cites W2540646130 @default.
- W3162276117 cites W2607892599 @default.
- W3162276117 cites W2613904329 @default.
- W3162276117 cites W2787560479 @default.
- W3162276117 cites W2792764867 @default.
- W3162276117 cites W2797328513 @default.
- W3162276117 cites W2899099596 @default.
- W3162276117 cites W2900096133 @default.
- W3162276117 cites W2908336025 @default.
- W3162276117 cites W2920807444 @default.
- W3162276117 cites W2944815030 @default.
- W3162276117 cites W2949888546 @default.
- W3162276117 cites W2950133940 @default.
- W3162276117 cites W2952729433 @default.
- W3162276117 cites W2953333557 @default.
- W3162276117 cites W2963341956 @default.
- W3162276117 cites W2963403868 @default.
- W3162276117 cites W2963756346 @default.
- W3162276117 cites W2965373594 @default.
- W3162276117 cites W2966892770 @default.
- W3162276117 cites W2970597249 @default.
- W3162276117 cites W2981852735 @default.
- W3162276117 cites W2982399380 @default.
- W3162276117 cites W2996428491 @default.
- W3162276117 cites W3011574394 @default.
- W3162276117 cites W3011718307 @default.
- W3162276117 cites W3013571468 @default.
- W3162276117 cites W3021293129 @default.
- W3162276117 cites W3030163527 @default.
- W3162276117 cites W3034696692 @default.
- W3162276117 cites W3085139254 @default.
- W3162276117 cites W3092866762 @default.
- W3162276117 cites W3121592593 @default.
- W3162276117 doi "https://doi.org/10.48550/arxiv.2105.03322" @default.
- W3162276117 hasPublicationYear "2021" @default.
- W3162276117 type Work @default.
- W3162276117 sameAs 3162276117 @default.
- W3162276117 citedByCount "13" @default.
- W3162276117 countsByYear W31622761172021 @default.
- W3162276117 countsByYear W31622761172022 @default.
- W3162276117 countsByYear W31622761172023 @default.
- W3162276117 crossrefType "posted-content" @default.
- W3162276117 hasAuthorship W3162276117A5000115067 @default.
- W3162276117 hasAuthorship W3162276117A5008444234 @default.
- W3162276117 hasAuthorship W3162276117A5031888985 @default.
- W3162276117 hasAuthorship W3162276117A5036477705 @default.
- W3162276117 hasAuthorship W3162276117A5057786383 @default.
- W3162276117 hasAuthorship W3162276117A5058345225 @default.
- W3162276117 hasAuthorship W3162276117A5080286262 @default.
- W3162276117 hasBestOaLocation W31622761171 @default.
- W3162276117 hasConcept C119599485 @default.
- W3162276117 hasConcept C119857082 @default.
- W3162276117 hasConcept C127413603 @default.
- W3162276117 hasConcept C137293760 @default.
- W3162276117 hasConcept C154945302 @default.
- W3162276117 hasConcept C165801399 @default.
- W3162276117 hasConcept C17744445 @default.
- W3162276117 hasConcept C199539241 @default.
- W3162276117 hasConcept C2992317946 @default.
- W3162276117 hasConcept C41008148 @default.
- W3162276117 hasConcept C66322947 @default.
- W3162276117 hasConceptScore W3162276117C119599485 @default.
- W3162276117 hasConceptScore W3162276117C119857082 @default.
- W3162276117 hasConceptScore W3162276117C127413603 @default.
- W3162276117 hasConceptScore W3162276117C137293760 @default.
- W3162276117 hasConceptScore W3162276117C154945302 @default.
- W3162276117 hasConceptScore W3162276117C165801399 @default.
- W3162276117 hasConceptScore W3162276117C17744445 @default.
- W3162276117 hasConceptScore W3162276117C199539241 @default.
- W3162276117 hasConceptScore W3162276117C2992317946 @default.
- W3162276117 hasConceptScore W3162276117C41008148 @default.
- W3162276117 hasConceptScore W3162276117C66322947 @default.
- W3162276117 hasLocation W31622761171 @default.
- W3162276117 hasOpenAccess W3162276117 @default.
- W3162276117 hasPrimaryLocation W31622761171 @default.
- W3162276117 hasRelatedWork W2961085424 @default.
- W3162276117 hasRelatedWork W2996568036 @default.
- W3162276117 hasRelatedWork W3084737437 @default.
- W3162276117 hasRelatedWork W3107474891 @default.
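
The listing above enumerates the predicate (?p), object (?o), and graph (?g) matches for this work. As a rough sketch only, the query below shows how such a listing might be reproduced against the public SemOpenAlex SPARQL endpoint (assumed here to be https://semopenalex.org/sparql); the GRAPH clause and the ORDER BY are assumptions for readability, not part of the listing itself.

```sparql
# Hedged sketch: list every predicate, object, and named graph recorded
# for the work <https://semopenalex.org/work/W3162276117> in SemOpenAlex.
# Endpoint assumed: https://semopenalex.org/sparql
SELECT ?p ?o ?g
WHERE {
  GRAPH ?g {
    <https://semopenalex.org/work/W3162276117> ?p ?o .
  }
}
ORDER BY ?p ?o
```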