Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313291246> ?p ?o ?g. }
- W4313291246 endingPage "838" @default.
- W4313291246 startingPage "825" @default.
- W4313291246 abstract "With the accelerating development of science and technology, the academic papers being published in various fields are increasing rapidly. Academic papers, especially in science and technology fields, are a crucial medium for researchers who develop new technologies by identifying knowledge regarding the latest technological trends and conduct derivative studies in science and technology. Therefore, the continual collection of extensive academic papers, structuring of metadata, and construction of databases are significant tasks. However, research on automatic metadata extraction from Korean papers is not being actively conducted currently owing to insufficient Korean training data. We automatically constructed the largest labeled corpus in South Korea to date from 315,320 PDF papers belonging to 503 Korean academic journals and this labeled corpus can be used for training the models of automatic extraction for 12 metadata types from PDF papers. This labeled corpus is available at https://doi.org/10.23057/48. Moreover, we developed an inspection process and guidelines for the automatically constructed data and performed a full inspection of the validation and testing data. The reliability of the inspected data was verified through the inter-annotator agreement measurement. Using our corpus, we trained and evaluated the BERT-based transfer learning model to verify its reliability. Furthermore, we proposed new training methods that can improve the metadata extraction performance of Korean papers, and through these methods, we developed KorSciBERT-ME-J and KorSciBERT-ME-J+C models. The KorSciBERT-ME-J showed the highest performance with an F1 score of 99.36%, as well as robust performance in automatic metadata extraction from Korean academic papers in various formats." @default.
- W4313291246 created "2023-01-06" @default.
- W4313291246 creator A5009196523 @default.
- W4313291246 creator A5018584807 @default.
- W4313291246 creator A5041026063 @default.
- W4313291246 creator A5042948850 @default.
- W4313291246 creator A5046389164 @default.
- W4313291246 creator A5060324897 @default.
- W4313291246 creator A5073681256 @default.
- W4313291246 date "2023-01-01" @default.
- W4313291246 modified "2023-09-30" @default.
- W4313291246 title "Annotated Open Corpus Construction and BERT-Based Approach for Automatic Metadata Extraction From Korean Academic Papers" @default.
- W4313291246 cites W1603719052 @default.
- W4313291246 cites W1762166282 @default.
- W4313291246 cites W1981791873 @default.
- W4313291246 cites W2000731427 @default.
- W4313291246 cites W2015814595 @default.
- W4313291246 cites W2053154970 @default.
- W4313291246 cites W2080928448 @default.
- W4313291246 cites W2107068745 @default.
- W4313291246 cites W2145657314 @default.
- W4313291246 cites W2158755884 @default.
- W4313291246 cites W2164777277 @default.
- W4313291246 cites W2626407095 @default.
- W4313291246 cites W2781934638 @default.
- W4313291246 cites W2911489562 @default.
- W4313291246 cites W2918555272 @default.
- W4313291246 cites W2937573492 @default.
- W4313291246 cites W2963469006 @default.
- W4313291246 cites W2970771982 @default.
- W4313291246 cites W2991170427 @default.
- W4313291246 cites W3102927624 @default.
- W4313291246 cites W3213294137 @default.
- W4313291246 cites W4225358744 @default.
- W4313291246 cites W4239510810 @default.
- W4313291246 cites W4253723135 @default.
- W4313291246 cites W4285222470 @default.
- W4313291246 cites W791527587 @default.
- W4313291246 doi "https://doi.org/10.1109/access.2022.3233228" @default.
- W4313291246 hasPublicationYear "2023" @default.
- W4313291246 type Work @default.
- W4313291246 citedByCount "0" @default.
- W4313291246 crossrefType "journal-article" @default.
- W4313291246 hasAuthorship W4313291246A5009196523 @default.
- W4313291246 hasAuthorship W4313291246A5018584807 @default.
- W4313291246 hasAuthorship W4313291246A5041026063 @default.
- W4313291246 hasAuthorship W4313291246A5042948850 @default.
- W4313291246 hasAuthorship W4313291246A5046389164 @default.
- W4313291246 hasAuthorship W4313291246A5060324897 @default.
- W4313291246 hasAuthorship W4313291246A5073681256 @default.
- W4313291246 hasBestOaLocation W43132912461 @default.
- W4313291246 hasConcept C111919701 @default.
- W4313291246 hasConcept C120567893 @default.
- W4313291246 hasConcept C121332964 @default.
- W4313291246 hasConcept C136764020 @default.
- W4313291246 hasConcept C154945302 @default.
- W4313291246 hasConcept C163258240 @default.
- W4313291246 hasConcept C17744445 @default.
- W4313291246 hasConcept C195807954 @default.
- W4313291246 hasConcept C199539241 @default.
- W4313291246 hasConcept C204321447 @default.
- W4313291246 hasConcept C23123220 @default.
- W4313291246 hasConcept C2777466982 @default.
- W4313291246 hasConcept C2779473830 @default.
- W4313291246 hasConcept C41008148 @default.
- W4313291246 hasConcept C43214815 @default.
- W4313291246 hasConcept C62520636 @default.
- W4313291246 hasConcept C93518851 @default.
- W4313291246 hasConcept C98045186 @default.
- W4313291246 hasConceptScore W4313291246C111919701 @default.
- W4313291246 hasConceptScore W4313291246C120567893 @default.
- W4313291246 hasConceptScore W4313291246C121332964 @default.
- W4313291246 hasConceptScore W4313291246C136764020 @default.
- W4313291246 hasConceptScore W4313291246C154945302 @default.
- W4313291246 hasConceptScore W4313291246C163258240 @default.
- W4313291246 hasConceptScore W4313291246C17744445 @default.
- W4313291246 hasConceptScore W4313291246C195807954 @default.
- W4313291246 hasConceptScore W4313291246C199539241 @default.
- W4313291246 hasConceptScore W4313291246C204321447 @default.
- W4313291246 hasConceptScore W4313291246C23123220 @default.
- W4313291246 hasConceptScore W4313291246C2777466982 @default.
- W4313291246 hasConceptScore W4313291246C2779473830 @default.
- W4313291246 hasConceptScore W4313291246C41008148 @default.
- W4313291246 hasConceptScore W4313291246C43214815 @default.
- W4313291246 hasConceptScore W4313291246C62520636 @default.
- W4313291246 hasConceptScore W4313291246C93518851 @default.
- W4313291246 hasConceptScore W4313291246C98045186 @default.
- W4313291246 hasFunder F4320322105 @default.
- W4313291246 hasFunder F4320328359 @default.
- W4313291246 hasLocation W43132912461 @default.
- W4313291246 hasOpenAccess W4313291246 @default.
- W4313291246 hasPrimaryLocation W43132912461 @default.
- W4313291246 hasRelatedWork W1590308178 @default.
- W4313291246 hasRelatedWork W159132833 @default.
- W4313291246 hasRelatedWork W1788528807 @default.
- W4313291246 hasRelatedWork W2048135704 @default.
- W4313291246 hasRelatedWork W2153799433 @default.
- W4313291246 hasRelatedWork W2393978999 @default.