Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385571290> ?p ?o ?g. }
Showing items 1 to 61 of 61 with 100 items per page.
- W4385571290 abstract "The BERT model and its variants have made great achievements in many downstream natural language processing tasks. The achievements of these models, however, demand highly expensive pre-training computation cost. To address this pre-training efficiency issue, the ELECTRA model is proposed to use a discriminator to perform replaced token detection (RTD) task, that is, to classify whether each input token is original or replaced by a generator. The RTD task performed by the ELECTRA accelerates pre-training so substantially, such that it is very challenging to further improve the pre-training efficiency established by the ELECTRA by using or adding other pre-training tasks, as the recent comprehensive study of Bajaj et al. (2022) summarizes. To further advance this pre-training efficiency frontier, in this paper we propose to extend the RTD task into a task of ranking input tokens according to K different quality levels. Essentially, we generalize the binary classifier in the ELECTRA into a K-level ranker to undertake a more precise task with negligible additional computation cost. Our extensive experiments show that our proposed method is able to outperform the state-of-the-art pre-training efficient models including ELECTRA in downstream GLUE tasks given the same computation cost." @default.
- W4385571290 created "2023-08-05" @default.
- W4385571290 creator A5010446607 @default.
- W4385571290 creator A5047856952 @default.
- W4385571290 creator A5061710815 @default.
- W4385571290 creator A5062755510 @default.
- W4385571290 date "2023-01-01" @default.
- W4385571290 modified "2023-09-26" @default.
- W4385571290 title "PEER: Pre-training ELECTRA Extended by Ranking" @default.
- W4385571290 doi "https://doi.org/10.18653/v1/2023.findings-acl.405" @default.
- W4385571290 hasPublicationYear "2023" @default.
- W4385571290 type Work @default.
- W4385571290 citedByCount "0" @default.
- W4385571290 crossrefType "proceedings-article" @default.
- W4385571290 hasAuthorship W4385571290A5010446607 @default.
- W4385571290 hasAuthorship W4385571290A5047856952 @default.
- W4385571290 hasAuthorship W4385571290A5061710815 @default.
- W4385571290 hasAuthorship W4385571290A5062755510 @default.
- W4385571290 hasBestOaLocation W43855712901 @default.
- W4385571290 hasConcept C11413529 @default.
- W4385571290 hasConcept C119857082 @default.
- W4385571290 hasConcept C127413603 @default.
- W4385571290 hasConcept C154945302 @default.
- W4385571290 hasConcept C201995342 @default.
- W4385571290 hasConcept C2779803651 @default.
- W4385571290 hasConcept C2780451532 @default.
- W4385571290 hasConcept C38652104 @default.
- W4385571290 hasConcept C41008148 @default.
- W4385571290 hasConcept C45374587 @default.
- W4385571290 hasConcept C48145219 @default.
- W4385571290 hasConcept C76155785 @default.
- W4385571290 hasConcept C94915269 @default.
- W4385571290 hasConceptScore W4385571290C11413529 @default.
- W4385571290 hasConceptScore W4385571290C119857082 @default.
- W4385571290 hasConceptScore W4385571290C127413603 @default.
- W4385571290 hasConceptScore W4385571290C154945302 @default.
- W4385571290 hasConceptScore W4385571290C201995342 @default.
- W4385571290 hasConceptScore W4385571290C2779803651 @default.
- W4385571290 hasConceptScore W4385571290C2780451532 @default.
- W4385571290 hasConceptScore W4385571290C38652104 @default.
- W4385571290 hasConceptScore W4385571290C41008148 @default.
- W4385571290 hasConceptScore W4385571290C45374587 @default.
- W4385571290 hasConceptScore W4385571290C48145219 @default.
- W4385571290 hasConceptScore W4385571290C76155785 @default.
- W4385571290 hasConceptScore W4385571290C94915269 @default.
- W4385571290 hasLocation W43855712901 @default.
- W4385571290 hasOpenAccess W4385571290 @default.
- W4385571290 hasPrimaryLocation W43855712901 @default.
- W4385571290 hasRelatedWork W1985412924 @default.
- W4385571290 hasRelatedWork W2375389409 @default.
- W4385571290 hasRelatedWork W2488051804 @default.
- W4385571290 hasRelatedWork W2961085424 @default.
- W4385571290 hasRelatedWork W4280544492 @default.
- W4385571290 hasRelatedWork W4285260836 @default.
- W4385571290 hasRelatedWork W4286629047 @default.
- W4385571290 hasRelatedWork W4306321456 @default.
- W4385571290 hasRelatedWork W4306674287 @default.
- W4385571290 hasRelatedWork W4224009465 @default.
- W4385571290 isParatext "false" @default.
- W4385571290 isRetracted "false" @default.
- W4385571290 workType "article" @default.
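
The listing above follows the quad pattern from the query (subject, ?p, ?o, ?g), one match per line. As a minimal sketch, assuming every line has the shape "- subject predicate object @graph." exactly as printed here, the entries can be split into tuples and grouped by predicate. The names parse_listing_line and group_by_predicate are hypothetical helpers written for this illustration; they are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Assumed line shape (taken from the listing, not from any official spec):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_listing_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Quoted objects are literals; strip the surrounding double quotes.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_listing_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W4385571290 title "PEER: Pre-training ELECTRA Extended by Ranking" @default.',
    '- W4385571290 creator A5010446607 @default.',
    '- W4385571290 creator A5047856952 @default.',
    '- W4385571290 hasPublicationYear "2023" @default.',
]

records = group_by_predicate(sample)
print(records["title"])
print(records["creator"])
```

Grouping by predicate keeps repeated properties such as creator, hasConcept, and hasRelatedWork together as lists, which matches how they appear in the listing.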