Matches in SemOpenAlex for { <https://semopenalex.org/work/W2076008912> ?p ?o ?g. }
Showing items 1 to 88 of 88 with 100 items per page.
- W2076008912 abstract "A major challenge in indexing unstructured hypertext databases is to automatically extract meta-data that enables structured search using topic taxonomies, circumvents keyword ambiguity, and improves the quality of search and profile-based routing and filtering. Therefore, an accurate classifier is an essential component of a hypertext database. Hyperlinks pose new problems not addressed in the extensive text classification literature. Links clearly contain high-quality semantic clues that are lost upon a purely term-based classifier, but exploiting link information is non-trivial because it is noisy. Naive use of terms in the link neighborhood of a document can even degrade accuracy. Our contribution is to propose robust statistical models and a relaxation labeling technique for better classification by exploiting link information in a small neighborhood around documents. Our technique also adapts gracefully to the fraction of neighboring documents having known topics. We experimented with pre-classified samples from Yahoo! and the US Patent Database. In previous work, we developed a text classifier that misclassified only 13% of the documents in the well-known Reuters benchmark; this was comparable to the best results ever obtained. This classifier misclassified 36% of the patents, indicating that classifying hypertext can be more difficult than classifying text. Naively using terms in neighboring documents increased error to 38%; our hypertext classifier reduced it to 21%. Results with the Yahoo! sample were more dramatic: the text classifier showed 68% error, whereas our hypertext classifier reduced this to only 21%." @default.
- W2076008912 created "2016-06-24" @default.
- W2076008912 creator A5008781655 @default.
- W2076008912 creator A5031458509 @default.
- W2076008912 creator A5076056716 @default.
- W2076008912 date "1998-06-01" @default.
- W2076008912 modified "2023-10-16" @default.
- W2076008912 title "Enhanced hypertext categorization using hyperlinks" @default.
- W2076008912 cites W1967322151 @default.
- W2076008912 cites W1973610950 @default.
- W2076008912 cites W1973939477 @default.
- W2076008912 cites W1989393439 @default.
- W2076008912 cites W1995252078 @default.
- W2076008912 cites W1999138184 @default.
- W2076008912 cites W2014901406 @default.
- W2076008912 cites W2024968650 @default.
- W2076008912 cites W2029373189 @default.
- W2076008912 cites W2060458867 @default.
- W2076008912 cites W2063735123 @default.
- W2076008912 cites W2074676569 @default.
- W2076008912 cites W2083244749 @default.
- W2076008912 cites W2085908709 @default.
- W2076008912 cites W2093161342 @default.
- W2076008912 cites W2094934653 @default.
- W2076008912 cites W2095150974 @default.
- W2076008912 cites W2100838228 @default.
- W2076008912 cites W2108835275 @default.
- W2076008912 cites W2196501509 @default.
- W2076008912 cites W2440833291 @default.
- W2076008912 doi "https://doi.org/10.1145/276304.276332" @default.
- W2076008912 hasPublicationYear "1998" @default.
- W2076008912 type Work @default.
- W2076008912 sameAs 2076008912 @default.
- W2076008912 citedByCount "634" @default.
- W2076008912 countsByYear W20760089122012 @default.
- W2076008912 countsByYear W20760089122013 @default.
- W2076008912 countsByYear W20760089122014 @default.
- W2076008912 countsByYear W20760089122015 @default.
- W2076008912 countsByYear W20760089122016 @default.
- W2076008912 countsByYear W20760089122017 @default.
- W2076008912 countsByYear W20760089122018 @default.
- W2076008912 countsByYear W20760089122019 @default.
- W2076008912 countsByYear W20760089122020 @default.
- W2076008912 countsByYear W20760089122021 @default.
- W2076008912 countsByYear W20760089122022 @default.
- W2076008912 countsByYear W20760089122023 @default.
- W2076008912 crossrefType "proceedings-article" @default.
- W2076008912 hasAuthorship W2076008912A5008781655 @default.
- W2076008912 hasAuthorship W2076008912A5031458509 @default.
- W2076008912 hasAuthorship W2076008912A5076056716 @default.
- W2076008912 hasBestOaLocation W20760089121 @default.
- W2076008912 hasConcept C136764020 @default.
- W2076008912 hasConcept C154945302 @default.
- W2076008912 hasConcept C162215914 @default.
- W2076008912 hasConcept C21959979 @default.
- W2076008912 hasConcept C23123220 @default.
- W2076008912 hasConcept C30088001 @default.
- W2076008912 hasConcept C41008148 @default.
- W2076008912 hasConcept C75165309 @default.
- W2076008912 hasConcept C94124525 @default.
- W2076008912 hasConcept C95623464 @default.
- W2076008912 hasConceptScore W2076008912C136764020 @default.
- W2076008912 hasConceptScore W2076008912C154945302 @default.
- W2076008912 hasConceptScore W2076008912C162215914 @default.
- W2076008912 hasConceptScore W2076008912C21959979 @default.
- W2076008912 hasConceptScore W2076008912C23123220 @default.
- W2076008912 hasConceptScore W2076008912C30088001 @default.
- W2076008912 hasConceptScore W2076008912C41008148 @default.
- W2076008912 hasConceptScore W2076008912C75165309 @default.
- W2076008912 hasConceptScore W2076008912C94124525 @default.
- W2076008912 hasConceptScore W2076008912C95623464 @default.
- W2076008912 hasLocation W20760089121 @default.
- W2076008912 hasOpenAccess W2076008912 @default.
- W2076008912 hasPrimaryLocation W20760089121 @default.
- W2076008912 hasRelatedWork W1593066723 @default.
- W2076008912 hasRelatedWork W1988192941 @default.
- W2076008912 hasRelatedWork W2051102072 @default.
- W2076008912 hasRelatedWork W2074301807 @default.
- W2076008912 hasRelatedWork W2275637146 @default.
- W2076008912 hasRelatedWork W2403512859 @default.
- W2076008912 hasRelatedWork W2566164015 @default.
- W2076008912 hasRelatedWork W2988234774 @default.
- W2076008912 hasRelatedWork W3158912095 @default.
- W2076008912 hasRelatedWork W392148851 @default.
- W2076008912 isParatext "false" @default.
- W2076008912 isRetracted "false" @default.
- W2076008912 magId "2076008912" @default.
- W2076008912 workType "article" @default.