Matches in SemOpenAlex for { <https://semopenalex.org/work/W3033188311> ?p ?o ?g. }
- W3033188311 abstract "With the success of language pretraining, it is highly desirable to develop more efficient architectures of good scalability that can exploit the abundant unlabeled data at a lower cost. To improve the efficiency, we examine the much-overlooked redundancy in maintaining a full-length token-level presentation, especially for tasks that only require a single-vector presentation of the sequence. With this intuition, we propose Funnel-Transformer which gradually compresses the sequence of hidden states to a shorter one and hence reduces the computation cost. More importantly, by re-investing the saved FLOPs from length reduction in constructing a deeper or wider model, we further improve the model capacity. In addition, to perform token-level predictions as required by common pretraining objectives, Funnel-Transformer is able to recover a deep representation for each token from the reduced hidden sequence via a decoder. Empirically, with comparable or fewer FLOPs, Funnel-Transformer outperforms the standard Transformer on a wide variety of sequence-level prediction tasks, including text classification, language understanding, and reading comprehension. The code and pretrained checkpoints are available at this https URL." @default.
- W3033188311 created "2020-06-12" @default.
- W3033188311 creator A5015579277 @default.
- W3033188311 creator A5041434096 @default.
- W3033188311 creator A5088551093 @default.
- W3033188311 creator A5091869105 @default.
- W3033188311 date "2020-06-05" @default.
- W3033188311 modified "2023-09-27" @default.
- W3033188311 title "Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing" @default.
- W3033188311 cites W1523493493 @default.
- W3033188311 cites W1901129140 @default.
- W3033188311 cites W2251849926 @default.
- W3033188311 cites W2606964149 @default.
- W3033188311 cites W2811124557 @default.
- W3033188311 cites W2908336025 @default.
- W3033188311 cites W2911109671 @default.
- W3033188311 cites W2912521296 @default.
- W3033188311 cites W2945918281 @default.
- W3033188311 cites W2952468927 @default.
- W3033188311 cites W2962739339 @default.
- W3033188311 cites W2963012544 @default.
- W3033188311 cites W2963175980 @default.
- W3033188311 cites W2963310665 @default.
- W3033188311 cites W2963341956 @default.
- W3033188311 cites W2963403868 @default.
- W3033188311 cites W2963854351 @default.
- W3033188311 cites W2965373594 @default.
- W3033188311 cites W2970528773 @default.
- W3033188311 cites W2970597249 @default.
- W3033188311 cites W2975059944 @default.
- W3033188311 cites W2980360762 @default.
- W3033188311 cites W3000103182 @default.
- W3033188311 cites W3013571468 @default.
- W3033188311 cites W3019527251 @default.
- W3033188311 cites W3030520226 @default.
- W3033188311 cites W3034457371 @default.
- W3033188311 cites W3034999214 @default.
- W3033188311 cites W3082274269 @default.
- W3033188311 cites W3105163367 @default.
- W3033188311 hasPublicationYear "2020" @default.
- W3033188311 type Work @default.
- W3033188311 sameAs 3033188311 @default.
- W3033188311 citedByCount "0" @default.
- W3033188311 crossrefType "posted-content" @default.
- W3033188311 hasAuthorship W3033188311A5015579277 @default.
- W3033188311 hasAuthorship W3033188311A5041434096 @default.
- W3033188311 hasAuthorship W3033188311A5088551093 @default.
- W3033188311 hasAuthorship W3033188311A5091869105 @default.
- W3033188311 hasConcept C111919701 @default.
- W3033188311 hasConcept C119599485 @default.
- W3033188311 hasConcept C127413603 @default.
- W3033188311 hasConcept C137293760 @default.
- W3033188311 hasConcept C152124472 @default.
- W3033188311 hasConcept C154945302 @default.
- W3033188311 hasConcept C165696696 @default.
- W3033188311 hasConcept C165801399 @default.
- W3033188311 hasConcept C31258907 @default.
- W3033188311 hasConcept C38652104 @default.
- W3033188311 hasConcept C41008148 @default.
- W3033188311 hasConcept C48044578 @default.
- W3033188311 hasConcept C48145219 @default.
- W3033188311 hasConcept C66322947 @default.
- W3033188311 hasConcept C77088390 @default.
- W3033188311 hasConceptScore W3033188311C111919701 @default.
- W3033188311 hasConceptScore W3033188311C119599485 @default.
- W3033188311 hasConceptScore W3033188311C127413603 @default.
- W3033188311 hasConceptScore W3033188311C137293760 @default.
- W3033188311 hasConceptScore W3033188311C152124472 @default.
- W3033188311 hasConceptScore W3033188311C154945302 @default.
- W3033188311 hasConceptScore W3033188311C165696696 @default.
- W3033188311 hasConceptScore W3033188311C165801399 @default.
- W3033188311 hasConceptScore W3033188311C31258907 @default.
- W3033188311 hasConceptScore W3033188311C38652104 @default.
- W3033188311 hasConceptScore W3033188311C41008148 @default.
- W3033188311 hasConceptScore W3033188311C48044578 @default.
- W3033188311 hasConceptScore W3033188311C48145219 @default.
- W3033188311 hasConceptScore W3033188311C66322947 @default.
- W3033188311 hasConceptScore W3033188311C77088390 @default.
- W3033188311 hasLocation W30331883111 @default.
- W3033188311 hasOpenAccess W3033188311 @default.
- W3033188311 hasPrimaryLocation W30331883111 @default.
- W3033188311 hasRelatedWork W2172165802 @default.
- W3033188311 hasRelatedWork W2338908902 @default.
- W3033188311 hasRelatedWork W2784755822 @default.
- W3033188311 hasRelatedWork W2804989836 @default.
- W3033188311 hasRelatedWork W2885892937 @default.
- W3033188311 hasRelatedWork W2898044751 @default.
- W3033188311 hasRelatedWork W2970401203 @default.
- W3033188311 hasRelatedWork W2991151250 @default.
- W3033188311 hasRelatedWork W3016112239 @default.
- W3033188311 hasRelatedWork W3033182847 @default.
- W3033188311 hasRelatedWork W3102892879 @default.
- W3033188311 hasRelatedWork W3119334387 @default.
- W3033188311 hasRelatedWork W3147053158 @default.
- W3033188311 hasRelatedWork W3159727696 @default.
- W3033188311 hasRelatedWork W3164045210 @default.
- W3033188311 hasRelatedWork W3175941285 @default.
- W3033188311 hasRelatedWork W3180037928 @default.
- W3033188311 hasRelatedWork W3189953165 @default.
- W3033188311 hasRelatedWork W3208397797 @default.
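
The abstract above describes compressing the sequence of hidden states to a shorter one and later recovering a per-token representation from the reduced sequence. The following is a minimal sketch of that idea only, not the authors' implementation: the abstract does not name the pooling operator or the recovery mechanism, so strided mean pooling and repetition-based upsampling are assumptions here, and the function names `mean_pool` and `upsample` are hypothetical.

```python
from typing import List

Vector = List[float]


def mean_pool(hidden: List[Vector], stride: int = 2) -> List[Vector]:
    """Shorten a sequence by averaging each window of `stride` vectors.

    Assumption: mean pooling stands in for whatever compression the
    model actually uses; the abstract does not specify it.
    """
    if stride < 1:
        raise ValueError("stride must be >= 1")
    pooled: List[Vector] = []
    for start in range(0, len(hidden), stride):
        window = hidden[start:start + stride]  # last window may be shorter
        dim = len(window[0])
        pooled.append([sum(v[d] for v in window) / len(window) for d in range(dim)])
    return pooled


def upsample(pooled: List[Vector], stride: int, length: int) -> List[Vector]:
    """Expand back to `length` positions by repeating each pooled vector.

    Assumption: plain repetition is a placeholder for the decoder the
    abstract mentions, which recovers token-level representations.
    """
    return [list(pooled[i // stride]) for i in range(length)]


# Five token vectors of dimension 2 (made-up values for illustration).
hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
pooled = mean_pool(hidden, stride=2)               # 5 positions -> 3 positions
restored = upsample(pooled, stride=2, length=len(hidden))  # back to 5 positions
```

With a stride of 2, every later layer operating on `pooled` sees roughly half as many positions, which is where the saved FLOPs mentioned in the abstract would come from.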