Matches in SemOpenAlex for { <https://semopenalex.org/work/W4304185307> ?p ?o ?g. }
- W4304185307 endingPage "591" @default.
- W4304185307 startingPage "591" @default.
- W4304185307 abstract "We consider the problem of solving Natural Language Understanding (NLU) tasks characterized by domain-specific data. An effective approach consists of pre-training Transformer-based language models from scratch using domain-specific data before fine-tuning them on the task at hand. A low domain-specific data volume is problematic in this context, given that the performance of language models relies heavily on the abundance of data during pre-training. To study this problem, we create a benchmark replicating realistic field use of language models to classify aviation occurrences extracted from the Aviation Safety Reporting System (ASRS) corpus. We compare two language models on this new benchmark: ASRS-CMFS, a compact model inspired from RoBERTa, pre-trained from scratch using only little domain-specific data, and the regular RoBERTa model, with no domain-specific pre-training. The RoBERTa model benefits from its size advantage, while the ASRS-CMFS benefits from the pre-training from scratch strategy. We find no compelling statistical evidence that RoBERTa outperforms ASRS-CMFS, but we show that ASRS-CMFS is more compute-efficient than RoBERTa. We suggest that pre-training a compact model from scratch is a good strategy for solving domain-specific NLU tasks using Transformer-based language models in the context of domain-specific data scarcity." @default.
- W4304185307 created "2022-10-11" @default.
- W4304185307 creator A5018831676 @default.
- W4304185307 creator A5082223661 @default.
- W4304185307 creator A5083638199 @default.
- W4304185307 date "2022-10-11" @default.
- W4304185307 modified "2023-10-03" @default.
- W4304185307 title "ASRS-CMFS vs. RoBERTa: Comparing Two Pre-Trained Language Models to Predict Anomalies in Aviation Occurrence Reports with a Low Volume of In-Domain Data Available" @default.
- W4304185307 cites W1499582806 @default.
- W4304185307 cites W2120530657 @default.
- W4304185307 cites W2460755166 @default.
- W4304185307 cites W2781900029 @default.
- W4304185307 cites W2799054028 @default.
- W4304185307 cites W2965373594 @default.
- W4304185307 cites W2970442950 @default.
- W4304185307 cites W2972241794 @default.
- W4304185307 cites W2979826702 @default.
- W4304185307 cites W2999309192 @default.
- W4304185307 cites W3046375318 @default.
- W4304185307 cites W3105069964 @default.
- W4304185307 cites W3105424285 @default.
- W4304185307 cites W3113833985 @default.
- W4304185307 cites W3127899369 @default.
- W4304185307 cites W3166904074 @default.
- W4304185307 cites W3214897310 @default.
- W4304185307 cites W4244007374 @default.
- W4304185307 cites W4286850742 @default.
- W4304185307 cites W4287240854 @default.
- W4304185307 cites W4287643567 @default.
- W4304185307 cites W4287645952 @default.
- W4304185307 cites W4287824654 @default.
- W4304185307 cites W4287867774 @default.
- W4304185307 cites W4288089799 @default.
- W4304185307 cites W4385245566 @default.
- W4304185307 doi "https://doi.org/10.3390/aerospace9100591" @default.
- W4304185307 hasPublicationYear "2022" @default.
- W4304185307 type Work @default.
- W4304185307 citedByCount "3" @default.
- W4304185307 countsByYear W43041853072023 @default.
- W4304185307 crossrefType "journal-article" @default.
- W4304185307 hasAuthorship W4304185307A5018831676 @default.
- W4304185307 hasAuthorship W4304185307A5082223661 @default.
- W4304185307 hasAuthorship W4304185307A5083638199 @default.
- W4304185307 hasBestOaLocation W43041853071 @default.
- W4304185307 hasConcept C111919701 @default.
- W4304185307 hasConcept C119599485 @default.
- W4304185307 hasConcept C119857082 @default.
- W4304185307 hasConcept C124101348 @default.
- W4304185307 hasConcept C127413603 @default.
- W4304185307 hasConcept C13280743 @default.
- W4304185307 hasConcept C134306372 @default.
- W4304185307 hasConcept C137293760 @default.
- W4304185307 hasConcept C146978453 @default.
- W4304185307 hasConcept C151730666 @default.
- W4304185307 hasConcept C154945302 @default.
- W4304185307 hasConcept C165801399 @default.
- W4304185307 hasConcept C185798385 @default.
- W4304185307 hasConcept C204321447 @default.
- W4304185307 hasConcept C205649164 @default.
- W4304185307 hasConcept C2779343474 @default.
- W4304185307 hasConcept C2781235140 @default.
- W4304185307 hasConcept C33923547 @default.
- W4304185307 hasConcept C36503486 @default.
- W4304185307 hasConcept C41008148 @default.
- W4304185307 hasConcept C66322947 @default.
- W4304185307 hasConcept C74448152 @default.
- W4304185307 hasConcept C86803240 @default.
- W4304185307 hasConceptScore W4304185307C111919701 @default.
- W4304185307 hasConceptScore W4304185307C119599485 @default.
- W4304185307 hasConceptScore W4304185307C119857082 @default.
- W4304185307 hasConceptScore W4304185307C124101348 @default.
- W4304185307 hasConceptScore W4304185307C127413603 @default.
- W4304185307 hasConceptScore W4304185307C13280743 @default.
- W4304185307 hasConceptScore W4304185307C134306372 @default.
- W4304185307 hasConceptScore W4304185307C137293760 @default.
- W4304185307 hasConceptScore W4304185307C146978453 @default.
- W4304185307 hasConceptScore W4304185307C151730666 @default.
- W4304185307 hasConceptScore W4304185307C154945302 @default.
- W4304185307 hasConceptScore W4304185307C165801399 @default.
- W4304185307 hasConceptScore W4304185307C185798385 @default.
- W4304185307 hasConceptScore W4304185307C204321447 @default.
- W4304185307 hasConceptScore W4304185307C205649164 @default.
- W4304185307 hasConceptScore W4304185307C2779343474 @default.
- W4304185307 hasConceptScore W4304185307C2781235140 @default.
- W4304185307 hasConceptScore W4304185307C33923547 @default.
- W4304185307 hasConceptScore W4304185307C36503486 @default.
- W4304185307 hasConceptScore W4304185307C41008148 @default.
- W4304185307 hasConceptScore W4304185307C66322947 @default.
- W4304185307 hasConceptScore W4304185307C74448152 @default.
- W4304185307 hasConceptScore W4304185307C86803240 @default.
- W4304185307 hasIssue "10" @default.
- W4304185307 hasLocation W43041853071 @default.
- W4304185307 hasLocation W43041853072 @default.
- W4304185307 hasLocation W43041853073 @default.
- W4304185307 hasLocation W43041853074 @default.
- W4304185307 hasLocation W43041853075 @default.
- W4304185307 hasOpenAccess W4304185307 @default.
- W4304185307 hasPrimaryLocation W43041853071 @default.