Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313483387> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4313483387 abstract "Learning to predict masked tokens in a sequence has been shown to be a powerful pretraining objective for large language models. After training, such masked language models can provide distributions of tokens conditioned on bidirectional context. In this paper, we show that contrary to popular assumptions, such bidirectional conditionals often demonstrate considerable inconsistencies, i.e., they cannot be derived from a coherent joint distribution when considered together. We empirically quantify such inconsistencies in the simple scenario of bigram comparison for two common styles of masked language models: T5-style and BERT-style. For example, we show that T5 models often confuse their own preference regarding two similar bigrams. We show that inconsistencies exist ubiquitously in masked language models of diverse sizes and configurations, from RoBERTa-base to GLM-130B. As an initial attempt to address this issue during the inference phase, we propose Ensemble of Conditionals, a self-ensemble algorithm that jointly considers many inconsistent conditionals directly produced by the MLM to synthesize a distribution that is used as the model's final output. Such ensembling improves open-source SOTA results on LAMBADA." @default.
- W4313483387 created "2023-01-06" @default.
- W4313483387 creator A5029277113 @default.
- W4313483387 creator A5052080211 @default.
- W4313483387 date "2022-12-30" @default.
- W4313483387 modified "2023-10-12" @default.
- W4313483387 title "On the Inconsistencies of Conditionals Learned by Masked Language Models" @default.
- W4313483387 doi "https://doi.org/10.48550/arxiv.2301.00068" @default.
- W4313483387 hasPublicationYear "2022" @default.
- W4313483387 type Work @default.
- W4313483387 citedByCount "1" @default.
- W4313483387 countsByYear W43134833872023 @default.
- W4313483387 crossrefType "posted-content" @default.
- W4313483387 hasAuthorship W4313483387A5029277113 @default.
- W4313483387 hasAuthorship W4313483387A5052080211 @default.
- W4313483387 hasBestOaLocation W43134833871 @default.
- W4313483387 hasConcept C108757681 @default.
- W4313483387 hasConcept C111472728 @default.
- W4313483387 hasConcept C119857082 @default.
- W4313483387 hasConcept C137293760 @default.
- W4313483387 hasConcept C137546455 @default.
- W4313483387 hasConcept C138885662 @default.
- W4313483387 hasConcept C151730666 @default.
- W4313483387 hasConcept C154945302 @default.
- W4313483387 hasConcept C166957645 @default.
- W4313483387 hasConcept C204321447 @default.
- W4313483387 hasConcept C2776214188 @default.
- W4313483387 hasConcept C2776445246 @default.
- W4313483387 hasConcept C2778112365 @default.
- W4313483387 hasConcept C2779343474 @default.
- W4313483387 hasConcept C2780586882 @default.
- W4313483387 hasConcept C41008148 @default.
- W4313483387 hasConcept C54355233 @default.
- W4313483387 hasConcept C86803240 @default.
- W4313483387 hasConcept C95457728 @default.
- W4313483387 hasConceptScore W4313483387C108757681 @default.
- W4313483387 hasConceptScore W4313483387C111472728 @default.
- W4313483387 hasConceptScore W4313483387C119857082 @default.
- W4313483387 hasConceptScore W4313483387C137293760 @default.
- W4313483387 hasConceptScore W4313483387C137546455 @default.
- W4313483387 hasConceptScore W4313483387C138885662 @default.
- W4313483387 hasConceptScore W4313483387C151730666 @default.
- W4313483387 hasConceptScore W4313483387C154945302 @default.
- W4313483387 hasConceptScore W4313483387C166957645 @default.
- W4313483387 hasConceptScore W4313483387C204321447 @default.
- W4313483387 hasConceptScore W4313483387C2776214188 @default.
- W4313483387 hasConceptScore W4313483387C2776445246 @default.
- W4313483387 hasConceptScore W4313483387C2778112365 @default.
- W4313483387 hasConceptScore W4313483387C2779343474 @default.
- W4313483387 hasConceptScore W4313483387C2780586882 @default.
- W4313483387 hasConceptScore W4313483387C41008148 @default.
- W4313483387 hasConceptScore W4313483387C54355233 @default.
- W4313483387 hasConceptScore W4313483387C86803240 @default.
- W4313483387 hasConceptScore W4313483387C95457728 @default.
- W4313483387 hasLocation W43134833871 @default.
- W4313483387 hasLocation W43134833872 @default.
- W4313483387 hasOpenAccess W4313483387 @default.
- W4313483387 hasPrimaryLocation W43134833871 @default.
- W4313483387 hasRelatedWork W1500873938 @default.
- W4313483387 hasRelatedWork W1515897616 @default.
- W4313483387 hasRelatedWork W1700330385 @default.
- W4313483387 hasRelatedWork W2002221802 @default.
- W4313483387 hasRelatedWork W2020757772 @default.
- W4313483387 hasRelatedWork W2041167939 @default.
- W4313483387 hasRelatedWork W2105076537 @default.
- W4313483387 hasRelatedWork W2131111393 @default.
- W4313483387 hasRelatedWork W2250909759 @default.
- W4313483387 hasRelatedWork W2562995433 @default.
- W4313483387 isParatext "false" @default.
- W4313483387 isRetracted "false" @default.
- W4313483387 workType "article" @default.