Matches in SemOpenAlex for { <https://semopenalex.org/work/W2100958137> ?p ?o ?g. }
Showing items 1 to 65 of 65 with 100 items per page.
- W2100958137 abstract "Users of Web search engines are often forced to sift through the long ordered list of document snippets returned by the engines. The IR community has explored document clustering as an alternative method of organizing retrieval results, but clustering has yet to be deployed on the major search engines. The paper articulates the unique requirements of Web document clustering and reports on the first evaluation of clustering methods in this domain. A key requirement is that the methods create their clusters based on the short snippets returned by Web search engines. Surprisingly, we find that clusters based on snippets are almost as good as clusters created using the full text of Web documents. To satisfy the stringent requirements of the Web domain, we introduce an incremental, linear time (in the document collection size) algorithm called Suffix Tree Clustering (STC), which creates clusters based on phrases shared between documents. We show that STC is faster than standard clustering methods in this domain, and argue that Web document clustering via STC is both feasible and potentially beneficial." @default.
- W2100958137 created "2016-06-24" @default.
- W2100958137 creator A5069363205 @default.
- W2100958137 creator A5083075229 @default.
- W2100958137 date "1998-08-01" @default.
- W2100958137 modified "2023-10-16" @default.
- W2100958137 title "Web document clustering" @default.
- W2100958137 cites W1975152892 @default.
- W2100958137 cites W2008375984 @default.
- W2100958137 cites W2022828110 @default.
- W2100958137 cites W2040058125 @default.
- W2100958137 cites W2046878543 @default.
- W2100958137 cites W2059513841 @default.
- W2100958137 cites W2074449313 @default.
- W2100958137 cites W2145036943 @default.
- W2100958137 cites W2216446631 @default.
- W2100958137 cites W2533248932 @default.
- W2100958137 cites W4241122026 @default.
- W2100958137 doi "https://doi.org/10.1145/290941.290956" @default.
- W2100958137 hasPublicationYear "1998" @default.
- W2100958137 type Work @default.
- W2100958137 sameAs 2100958137 @default.
- W2100958137 citedByCount "958" @default.
- W2100958137 countsByYear W21009581372012 @default.
- W2100958137 countsByYear W21009581372013 @default.
- W2100958137 countsByYear W21009581372014 @default.
- W2100958137 countsByYear W21009581372015 @default.
- W2100958137 countsByYear W21009581372016 @default.
- W2100958137 countsByYear W21009581372017 @default.
- W2100958137 countsByYear W21009581372018 @default.
- W2100958137 countsByYear W21009581372019 @default.
- W2100958137 countsByYear W21009581372020 @default.
- W2100958137 countsByYear W21009581372021 @default.
- W2100958137 countsByYear W21009581372022 @default.
- W2100958137 countsByYear W21009581372023 @default.
- W2100958137 crossrefType "proceedings-article" @default.
- W2100958137 hasAuthorship W2100958137A5069363205 @default.
- W2100958137 hasAuthorship W2100958137A5083075229 @default.
- W2100958137 hasConcept C136764020 @default.
- W2100958137 hasConcept C154945302 @default.
- W2100958137 hasConcept C23123220 @default.
- W2100958137 hasConcept C41008148 @default.
- W2100958137 hasConcept C73555534 @default.
- W2100958137 hasConceptScore W2100958137C136764020 @default.
- W2100958137 hasConceptScore W2100958137C154945302 @default.
- W2100958137 hasConceptScore W2100958137C23123220 @default.
- W2100958137 hasConceptScore W2100958137C41008148 @default.
- W2100958137 hasConceptScore W2100958137C73555534 @default.
- W2100958137 hasLocation W21009581371 @default.
- W2100958137 hasOpenAccess W2100958137 @default.
- W2100958137 hasPrimaryLocation W21009581371 @default.
- W2100958137 hasRelatedWork W1999627569 @default.
- W2100958137 hasRelatedWork W2115485936 @default.
- W2100958137 hasRelatedWork W2144190808 @default.
- W2100958137 hasRelatedWork W2153015554 @default.
- W2100958137 hasRelatedWork W2357241418 @default.
- W2100958137 hasRelatedWork W2366644548 @default.
- W2100958137 hasRelatedWork W2376314740 @default.
- W2100958137 hasRelatedWork W2384888906 @default.
- W2100958137 hasRelatedWork W2748952813 @default.
- W2100958137 hasRelatedWork W763609066 @default.
- W2100958137 isParatext "false" @default.
- W2100958137 isRetracted "false" @default.
- W2100958137 magId "2100958137" @default.
- W2100958137 workType "article" @default.