Matches in SemOpenAlex for { <https://semopenalex.org/work/W2072489225> ?p ?o ?g. }
- W2072489225 abstract "The widespread use of templates on the Web is considered harmful for two main reasons. Not only do they compromise the relevance judgment of many web IR and web mining methods such as clustering and classification, but they also negatively impact the performance and resource usage of tools that process web pages. In this paper we present a new method that efficiently and accurately removes templates found in collections of web pages. Our method works in two steps. First, the costly process of template detection is performed over a small set of sample pages. Then, the derived template is removed from the remaining pages in the collection. This leads to substantial performance gains when compared to previous approaches that combine template detection and removal. We show, through an experimental evaluation, that our approach is effective for identifying terms occurring in templates - obtaining F-measure values around 0.9, and that it also boosts the accuracy of web page clustering and classification methods." @default.
- W2072489225 created "2016-06-24" @default.
- W2072489225 creator A5005439190 @default.
- W2072489225 creator A5006773757 @default.
- W2072489225 creator A5015763962 @default.
- W2072489225 creator A5015781017 @default.
- W2072489225 creator A5066214235 @default.
- W2072489225 creator A5078104555 @default.
- W2072489225 date "2006-01-01" @default.
- W2072489225 modified "2023-09-23" @default.
- W2072489225 title "A fast and robust method for web page template detection and removal" @default.
- W2072489225 cites W1628571627 @default.
- W2072489225 cites W1975009259 @default.
- W2072489225 cites W1989338554 @default.
- W2072489225 cites W2005124845 @default.
- W2072489225 cites W2006119904 @default.
- W2072489225 cites W2006997130 @default.
- W2072489225 cites W2009346361 @default.
- W2072489225 cites W2021791566 @default.
- W2072489225 cites W2040075907 @default.
- W2072489225 cites W2048001624 @default.
- W2072489225 cites W2049461910 @default.
- W2072489225 cites W2049781914 @default.
- W2072489225 cites W2057294899 @default.
- W2072489225 cites W2085533860 @default.
- W2072489225 cites W2087303323 @default.
- W2072489225 cites W2108584692 @default.
- W2072489225 cites W2117209866 @default.
- W2072489225 cites W2129595335 @default.
- W2072489225 cites W2161675403 @default.
- W2072489225 cites W2167859982 @default.
- W2072489225 cites W2169786955 @default.
- W2072489225 doi "https://doi.org/10.1145/1183614.1183654" @default.
- W2072489225 hasPublicationYear "2006" @default.
- W2072489225 type Work @default.
- W2072489225 sameAs 2072489225 @default.
- W2072489225 citedByCount "87" @default.
- W2072489225 countsByYear W20724892252012 @default.
- W2072489225 countsByYear W20724892252013 @default.
- W2072489225 countsByYear W20724892252014 @default.
- W2072489225 countsByYear W20724892252015 @default.
- W2072489225 countsByYear W20724892252016 @default.
- W2072489225 countsByYear W20724892252017 @default.
- W2072489225 countsByYear W20724892252018 @default.
- W2072489225 countsByYear W20724892252019 @default.
- W2072489225 countsByYear W20724892252020 @default.
- W2072489225 countsByYear W20724892252021 @default.
- W2072489225 countsByYear W20724892252022 @default.
- W2072489225 countsByYear W20724892252023 @default.
- W2072489225 crossrefType "proceedings-article" @default.
- W2072489225 hasAuthorship W2072489225A5005439190 @default.
- W2072489225 hasAuthorship W2072489225A5006773757 @default.
- W2072489225 hasAuthorship W2072489225A5015763962 @default.
- W2072489225 hasAuthorship W2072489225A5015781017 @default.
- W2072489225 hasAuthorship W2072489225A5066214235 @default.
- W2072489225 hasAuthorship W2072489225A5078104555 @default.
- W2072489225 hasConcept C111919701 @default.
- W2072489225 hasConcept C124101348 @default.
- W2072489225 hasConcept C136764020 @default.
- W2072489225 hasConcept C154945302 @default.
- W2072489225 hasConcept C158154518 @default.
- W2072489225 hasConcept C177264268 @default.
- W2072489225 hasConcept C17744445 @default.
- W2072489225 hasConcept C185592680 @default.
- W2072489225 hasConcept C198531522 @default.
- W2072489225 hasConcept C199360897 @default.
- W2072489225 hasConcept C199539241 @default.
- W2072489225 hasConcept C21959979 @default.
- W2072489225 hasConcept C23123220 @default.
- W2072489225 hasConcept C41008148 @default.
- W2072489225 hasConcept C43617362 @default.
- W2072489225 hasConcept C73555534 @default.
- W2072489225 hasConcept C82714645 @default.
- W2072489225 hasConcept C98045186 @default.
- W2072489225 hasConceptScore W2072489225C111919701 @default.
- W2072489225 hasConceptScore W2072489225C124101348 @default.
- W2072489225 hasConceptScore W2072489225C136764020 @default.
- W2072489225 hasConceptScore W2072489225C154945302 @default.
- W2072489225 hasConceptScore W2072489225C158154518 @default.
- W2072489225 hasConceptScore W2072489225C177264268 @default.
- W2072489225 hasConceptScore W2072489225C17744445 @default.
- W2072489225 hasConceptScore W2072489225C185592680 @default.
- W2072489225 hasConceptScore W2072489225C198531522 @default.
- W2072489225 hasConceptScore W2072489225C199360897 @default.
- W2072489225 hasConceptScore W2072489225C199539241 @default.
- W2072489225 hasConceptScore W2072489225C21959979 @default.
- W2072489225 hasConceptScore W2072489225C23123220 @default.
- W2072489225 hasConceptScore W2072489225C41008148 @default.
- W2072489225 hasConceptScore W2072489225C43617362 @default.
- W2072489225 hasConceptScore W2072489225C73555534 @default.
- W2072489225 hasConceptScore W2072489225C82714645 @default.
- W2072489225 hasConceptScore W2072489225C98045186 @default.
- W2072489225 hasLocation W20724892251 @default.
- W2072489225 hasOpenAccess W2072489225 @default.
- W2072489225 hasPrimaryLocation W20724892251 @default.
- W2072489225 hasRelatedWork W1509467138 @default.
- W2072489225 hasRelatedWork W1584628001 @default.
- W2072489225 hasRelatedWork W1834987204 @default.
- W2072489225 hasRelatedWork W1996655143 @default.
- W2072489225 hasRelatedWork W2036641180 @default.
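
The abstract above describes a two-step scheme: detect the template on a small sample of pages, then remove it from the rest of the collection. The sketch below illustrates that split only. The abstract does not say how detection works, so the document-frequency heuristic used here is a stand-in assumption, not the authors' method. The function names (`detect_template_terms`, `remove_template`, `f_measure`) and the `min_fraction` parameter are hypothetical, and pages are treated as whitespace-separated terms for simplicity.

```python
from collections import Counter


def detect_template_terms(sample_pages, min_fraction=0.8):
    """Step 1 (stand-in heuristic, an assumption): treat a term as part of
    the template if it occurs in at least `min_fraction` of the sample pages."""
    doc_freq = Counter()
    for page in sample_pages:
        # Count each term once per page (document frequency).
        doc_freq.update(set(page.split()))
    threshold = min_fraction * len(sample_pages)
    return {term for term, df in doc_freq.items() if df >= threshold}


def remove_template(page, template_terms):
    """Step 2: strip the previously derived template terms from one page."""
    return " ".join(t for t in page.split() if t not in template_terms)


def f_measure(predicted, actual):
    """Balanced F-measure (harmonic mean of precision and recall)
    between a predicted and a reference set of template terms."""
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    sample = [
        "Home About Contact cats purr",
        "Home About Contact dogs bark",
        "Home About Contact birds sing",
    ]
    template = detect_template_terms(sample)
    print(sorted(template))
    print(remove_template("Home About Contact fish swim", template))
```

The costly detection step runs once over the sample, while removal is a cheap per-page filter, which is the source of the performance gain the abstract claims over approaches that detect and remove in a single pass.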