Matches in SemOpenAlex for { <https://semopenalex.org/work/W2890881297> ?p ?o ?g. }
- W2890881297 abstract "Many important entity types in web documents, such as dates, times, email addresses, and course numbers, follow or closely resemble patterns that can be described by Regular Expressions (REs). Due to the vast diversity of web documents and the ways in which they are generated, even seemingly straightforward tasks such as identifying mentions of dates in a document become very challenging. It is reasonable to claim that it is impossible to create an RE that is capable of identifying such entities in web documents with perfect precision and recall. Rather than abandoning REs as a go-to approach for entity detection, this paper explores ways to combine the expressive power of REs, the ability of deep learning to learn from large data, and a human-in-the-loop approach into a new integrated framework for entity identification from web data. The framework starts by creating or collecting existing REs for a particular type of entity. Those REs are then used over a large document corpus to collect weak labels for the entity mentions, and a neural network is trained to predict those RE-generated weak labels. Finally, a human expert is asked to label a small set of documents and the neural network is fine-tuned on those documents. The experimental evaluation on several entity identification problems shows that the proposed framework achieves impressive accuracy, while requiring very modest human effort." @default.
- W2890881297 created "2018-09-27" @default.
- W2890881297 creator A5006021168 @default.
- W2890881297 creator A5046910090 @default.
- W2890881297 creator A5057346703 @default.
- W2890881297 creator A5059847153 @default.
- W2890881297 date "2018-01-01" @default.
- W2890881297 modified "2023-09-25" @default.
- W2890881297 title "Regular Expression Guided Entity Mention Mining from Noisy Web Data" @default.
- W2890881297 cites W1493490255 @default.
- W2890881297 cites W1508480967 @default.
- W2890881297 cites W1775135849 @default.
- W2890881297 cites W1832693441 @default.
- W2890881297 cites W1940872118 @default.
- W2890881297 cites W1964189668 @default.
- W2890881297 cites W1987538593 @default.
- W2890881297 cites W2020278455 @default.
- W2890881297 cites W2026213144 @default.
- W2890881297 cites W2038941723 @default.
- W2890881297 cites W2039532210 @default.
- W2890881297 cites W2047477415 @default.
- W2890881297 cites W2059383863 @default.
- W2890881297 cites W2080666934 @default.
- W2890881297 cites W2097998348 @default.
- W2890881297 cites W2106950427 @default.
- W2890881297 cites W2134150392 @default.
- W2890881297 cites W2138857742 @default.
- W2890881297 cites W2148540243 @default.
- W2890881297 cites W2155541015 @default.
- W2890881297 cites W2265846598 @default.
- W2890881297 cites W2275294428 @default.
- W2890881297 cites W2296283641 @default.
- W2890881297 cites W2311110368 @default.
- W2890881297 cites W2597655663 @default.
- W2890881297 cites W2602288119 @default.
- W2890881297 cites W2604184171 @default.
- W2890881297 cites W2799010330 @default.
- W2890881297 cites W2963625095 @default.
- W2890881297 cites W2963703197 @default.
- W2890881297 cites W2964284687 @default.
- W2890881297 cites W2964343412 @default.
- W2890881297 cites W3106003309 @default.
- W2890881297 cites W72959484 @default.
- W2890881297 doi "https://doi.org/10.18653/v1/d18-1224" @default.
- W2890881297 hasPublicationYear "2018" @default.
- W2890881297 type Work @default.
- W2890881297 sameAs 2890881297 @default.
- W2890881297 citedByCount "20" @default.
- W2890881297 countsByYear W28908812972019 @default.
- W2890881297 countsByYear W28908812972020 @default.
- W2890881297 countsByYear W28908812972021 @default.
- W2890881297 countsByYear W28908812972022 @default.
- W2890881297 countsByYear W28908812972023 @default.
- W2890881297 crossrefType "proceedings-article" @default.
- W2890881297 hasAuthorship W2890881297A5006021168 @default.
- W2890881297 hasAuthorship W2890881297A5046910090 @default.
- W2890881297 hasAuthorship W2890881297A5057346703 @default.
- W2890881297 hasAuthorship W2890881297A5059847153 @default.
- W2890881297 hasBestOaLocation W28908812971 @default.
- W2890881297 hasConcept C116834253 @default.
- W2890881297 hasConcept C121329065 @default.
- W2890881297 hasConcept C136764020 @default.
- W2890881297 hasConcept C154945302 @default.
- W2890881297 hasConcept C162324750 @default.
- W2890881297 hasConcept C177264268 @default.
- W2890881297 hasConcept C187736073 @default.
- W2890881297 hasConcept C199360897 @default.
- W2890881297 hasConcept C204321447 @default.
- W2890881297 hasConcept C23123220 @default.
- W2890881297 hasConcept C2779135771 @default.
- W2890881297 hasConcept C2780451532 @default.
- W2890881297 hasConcept C41008148 @default.
- W2890881297 hasConcept C4554734 @default.
- W2890881297 hasConcept C59822182 @default.
- W2890881297 hasConcept C81669768 @default.
- W2890881297 hasConcept C86803240 @default.
- W2890881297 hasConcept C96711827 @default.
- W2890881297 hasConceptScore W2890881297C116834253 @default.
- W2890881297 hasConceptScore W2890881297C121329065 @default.
- W2890881297 hasConceptScore W2890881297C136764020 @default.
- W2890881297 hasConceptScore W2890881297C154945302 @default.
- W2890881297 hasConceptScore W2890881297C162324750 @default.
- W2890881297 hasConceptScore W2890881297C177264268 @default.
- W2890881297 hasConceptScore W2890881297C187736073 @default.
- W2890881297 hasConceptScore W2890881297C199360897 @default.
- W2890881297 hasConceptScore W2890881297C204321447 @default.
- W2890881297 hasConceptScore W2890881297C23123220 @default.
- W2890881297 hasConceptScore W2890881297C2779135771 @default.
- W2890881297 hasConceptScore W2890881297C2780451532 @default.
- W2890881297 hasConceptScore W2890881297C41008148 @default.
- W2890881297 hasConceptScore W2890881297C4554734 @default.
- W2890881297 hasConceptScore W2890881297C59822182 @default.
- W2890881297 hasConceptScore W2890881297C81669768 @default.
- W2890881297 hasConceptScore W2890881297C86803240 @default.
- W2890881297 hasConceptScore W2890881297C96711827 @default.
- W2890881297 hasLocation W28908812971 @default.
- W2890881297 hasOpenAccess W2890881297 @default.
- W2890881297 hasPrimaryLocation W28908812971 @default.
- W2890881297 hasRelatedWork W1978990931 @default.
- W2890881297 hasRelatedWork W2045347339 @default.
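
The abstract of W2890881297 describes using REs over a corpus to collect weak labels for entity mentions. A minimal sketch of that first step only, under stated assumptions: the date pattern `DATE_RE`, the whitespace tokenization, and the helper `weak_labels` are hypothetical illustrations and are not taken from the paper, and the neural-network training and human fine-tuning steps are omitted.

```python
import re

# Hypothetical, deliberately simple date pattern (an assumption, not the paper's RE).
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b|\b\d{1,2}/\d{1,2}/\d{4}\b")


def weak_labels(text, patterns):
    """Return (token, label) pairs; label is 1 if the token overlaps any RE match."""
    # Character spans of every match from every pattern.
    spans = [m.span() for p in patterns for m in p.finditer(text)]
    labeled = []
    # Whitespace tokenization is an assumption made for brevity.
    for tok in re.finditer(r"\S+", text):
        start, end = tok.span()
        hit = any(start < e and s < end for s, e in spans)
        labeled.append((tok.group(), int(hit)))
    return labeled


sample = "Created 2018-09-27 and modified on 9/25/2023."
labels = weak_labels(sample, [DATE_RE])
```

Labels produced this way are noisy by construction, which is why the abstract describes them as weak labels to be refined by a trained model and a small amount of expert annotation.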