Matches in SemOpenAlex for { <https://semopenalex.org/work/W2017231497> ?p ?o ?g. }
Showing items 1 to 68 of 68 with 100 items per page.
- W2017231497 abstract "In this paper, we address the problem of author extraction (AE) from user generated content (UGC) pages. Most existing solutions for web information extraction, including AE, adopt supervised approaches, which require expensive manual annotation. We propose a novel unsupervised approach for automatically collecting and labeling training data based on two key observations of author names: (1) people tend to use a single name across sites if their preferred names are available; (2) people tend to create unique usernames to easily distinguish themselves from others, e.g. travelbug61. Our AE solution only requires features extracted from a single UGC page instead of relying on clues from multiple UGC pages. We conducted extensive experiments. (1) The evaluation of automatically labeled author field data shows 95.0% precision. (2) Our method achieves an F1 score of 96.1%, which significantly outperforms a state-of-the-art supervised approach with single page features (F1 score: 68.4%) and has a comparable performance to its multiple page solution (F1 score: 95.4%). (3) We also examine the robustness of our approach on various UGC pages from forums and review sites, and achieve promising results as well." @default.
- W2017231497 created "2016-06-24" @default.
- W2017231497 creator A5035282947 @default.
- W2017231497 creator A5040999303 @default.
- W2017231497 creator A5060313515 @default.
- W2017231497 creator A5090151187 @default.
- W2017231497 date "2012-10-29" @default.
- W2017231497 modified "2023-09-24" @default.
- W2017231497 title "An unsupervised method for author extraction from web pages containing user-generated content" @default.
- W2017231497 cites W1601308271 @default.
- W2017231497 cites W1973483159 @default.
- W2017231497 cites W2008733342 @default.
- W2017231497 cites W2021314079 @default.
- W2017231497 cites W2029331575 @default.
- W2017231497 cites W2083563309 @default.
- W2017231497 cites W2143309843 @default.
- W2017231497 cites W2153635508 @default.
- W2017231497 cites W2154445423 @default.
- W2017231497 cites W2158551114 @default.
- W2017231497 doi "https://doi.org/10.1145/2396761.2398647" @default.
- W2017231497 hasPublicationYear "2012" @default.
- W2017231497 type Work @default.
- W2017231497 sameAs 2017231497 @default.
- W2017231497 citedByCount "3" @default.
- W2017231497 countsByYear W20172314972013 @default.
- W2017231497 countsByYear W20172314972014 @default.
- W2017231497 crossrefType "proceedings-article" @default.
- W2017231497 hasAuthorship W2017231497A5035282947 @default.
- W2017231497 hasAuthorship W2017231497A5040999303 @default.
- W2017231497 hasAuthorship W2017231497A5060313515 @default.
- W2017231497 hasAuthorship W2017231497A5090151187 @default.
- W2017231497 hasConcept C134306372 @default.
- W2017231497 hasConcept C136764020 @default.
- W2017231497 hasConcept C185592680 @default.
- W2017231497 hasConcept C21959979 @default.
- W2017231497 hasConcept C23123220 @default.
- W2017231497 hasConcept C2778152352 @default.
- W2017231497 hasConcept C33923547 @default.
- W2017231497 hasConcept C41008148 @default.
- W2017231497 hasConcept C43617362 @default.
- W2017231497 hasConcept C4725764 @default.
- W2017231497 hasConceptScore W2017231497C134306372 @default.
- W2017231497 hasConceptScore W2017231497C136764020 @default.
- W2017231497 hasConceptScore W2017231497C185592680 @default.
- W2017231497 hasConceptScore W2017231497C21959979 @default.
- W2017231497 hasConceptScore W2017231497C23123220 @default.
- W2017231497 hasConceptScore W2017231497C2778152352 @default.
- W2017231497 hasConceptScore W2017231497C33923547 @default.
- W2017231497 hasConceptScore W2017231497C41008148 @default.
- W2017231497 hasConceptScore W2017231497C43617362 @default.
- W2017231497 hasConceptScore W2017231497C4725764 @default.
- W2017231497 hasLocation W20172314971 @default.
- W2017231497 hasOpenAccess W2017231497 @default.
- W2017231497 hasPrimaryLocation W20172314971 @default.
- W2017231497 hasRelatedWork W1545545132 @default.
- W2017231497 hasRelatedWork W1840312346 @default.
- W2017231497 hasRelatedWork W2044968286 @default.
- W2017231497 hasRelatedWork W2110357112 @default.
- W2017231497 hasRelatedWork W2144190808 @default.
- W2017231497 hasRelatedWork W240673130 @default.
- W2017231497 hasRelatedWork W2411679502 @default.
- W2017231497 hasRelatedWork W3216588747 @default.
- W2017231497 hasRelatedWork W2513545296 @default.
- W2017231497 hasRelatedWork W2592441986 @default.
- W2017231497 isParatext "false" @default.
- W2017231497 isRetracted "false" @default.
- W2017231497 magId "2017231497" @default.
- W2017231497 workType "article" @default.
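
The listing above follows a regular "- subject predicate object @graph." shape, so it can be grouped by predicate with a few lines of code. The following is a minimal sketch, not an official SemOpenAlex client: the regular expression, the function name `parse_listing`, and the choice to strip surrounding quotes from literal values are all assumptions made for illustration, and the sample lines are copied from the listing.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the listing, not from any spec):
#   "- <subject> <predicate> <object> @<graph>."
TRIPLE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\w+)\.$')


def parse_listing(lines):
    """Group object values by predicate; lines that do not match are skipped."""
    grouped = defaultdict(list)
    for line in lines:
        match = TRIPLE_RE.match(line.strip())
        if not match:
            continue
        _subject, predicate, obj, _graph = match.groups()
        # Literal values are quoted in the listing; drop the outer quotes.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        grouped[predicate].append(obj)
    return dict(grouped)


# Hypothetical usage on a few lines taken from the listing above.
sample = [
    '- W2017231497 date "2012-10-29" @default.',
    '- W2017231497 cites W1601308271 @default.',
    '- W2017231497 cites W1973483159 @default.',
    '- W2017231497 citedByCount "3" @default.',
]
props = parse_listing(sample)
print(props["cites"])
```

Grouping by predicate keeps repeated properties such as `cites` or `hasConcept` together as lists, while single-valued properties such as `date` end up as one-element lists.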