Matches in SemOpenAlex for { <https://semopenalex.org/work/W2095851275> ?p ?o ?g. }
- W2095851275 abstract "With the proliferation of web spam and questionable content with virtually infinite auto-generated structure, large-scale web crawlers now require low-complexity ranking methods to effectively budget their limited resources and allocate the majority of bandwidth to reputable sites. To shed light on Internet-wide spam avoidance, we study the domain-level graph from a 6.3B-page web crawl and compare several agnostic topology-based ranking algorithms on this dataset. We first propose a new methodology for comparing the various rankings and then show that in-degree BFS-based techniques decisively outperform classic PageRank-style methods. However, since BFS requires several orders of magnitude higher overhead and is generally infeasible for real-time use, we propose a fast, accurate, and scalable estimation method that can achieve much better crawl prioritization in practice, especially in applications with limited hardware resources." @default.
- W2095851275 created "2016-06-24" @default.
- W2095851275 creator A5027609501 @default.
- W2095851275 creator A5076734500 @default.
- W2095851275 creator A5080290452 @default.
- W2095851275 date "2011-04-01" @default.
- W2095851275 modified "2023-09-27" @default.
- W2095851275 title "Agnostic topology-based spam avoidance in large-scale web crawls" @default.
- W2095851275 cites W110443600 @default.
- W2095851275 cites W135168798 @default.
- W2095851275 cites W1529464143 @default.
- W2095851275 cites W1674850363 @default.
- W2095851275 cites W1797803111 @default.
- W2095851275 cites W1845137714 @default.
- W2095851275 cites W1966912174 @default.
- W2095851275 cites W2000273502 @default.
- W2095851275 cites W2000333294 @default.
- W2095851275 cites W2007687650 @default.
- W2095851275 cites W2013531639 @default.
- W2095851275 cites W2013723863 @default.
- W2095851275 cites W202878612 @default.
- W2095851275 cites W2039976898 @default.
- W2095851275 cites W2051804774 @default.
- W2095851275 cites W2060697433 @default.
- W2095851275 cites W2066636486 @default.
- W2095851275 cites W2081470778 @default.
- W2095851275 cites W2087686687 @default.
- W2095851275 cites W2098660810 @default.
- W2095851275 cites W2099092406 @default.
- W2095851275 cites W2107428549 @default.
- W2095851275 cites W2114413252 @default.
- W2095851275 cites W2118942057 @default.
- W2095851275 cites W2127455097 @default.
- W2095851275 cites W2130242957 @default.
- W2095851275 cites W2130610812 @default.
- W2095851275 cites W2140279085 @default.
- W2095851275 cites W2143917438 @default.
- W2095851275 cites W2145990704 @default.
- W2095851275 cites W2164542999 @default.
- W2095851275 cites W2169263924 @default.
- W2095851275 cites W2169270715 @default.
- W2095851275 cites W75458008 @default.
- W2095851275 doi "https://doi.org/10.1109/infcom.2011.5935303" @default.
- W2095851275 hasPublicationYear "2011" @default.
- W2095851275 type Work @default.
- W2095851275 sameAs 2095851275 @default.
- W2095851275 citedByCount "1" @default.
- W2095851275 countsByYear W20958512752012 @default.
- W2095851275 crossrefType "proceedings-article" @default.
- W2095851275 hasAuthorship W2095851275A5027609501 @default.
- W2095851275 hasAuthorship W2095851275A5076734500 @default.
- W2095851275 hasAuthorship W2095851275A5080290452 @default.
- W2095851275 hasBestOaLocation W20958512752 @default.
- W2095851275 hasConcept C110875604 @default.
- W2095851275 hasConcept C111919701 @default.
- W2095851275 hasConcept C124101348 @default.
- W2095851275 hasConcept C132525143 @default.
- W2095851275 hasConcept C136764020 @default.
- W2095851275 hasConcept C13743948 @default.
- W2095851275 hasConcept C189430467 @default.
- W2095851275 hasConcept C23123220 @default.
- W2095851275 hasConcept C2776257435 @default.
- W2095851275 hasConcept C2779172887 @default.
- W2095851275 hasConcept C2779960059 @default.
- W2095851275 hasConcept C31258907 @default.
- W2095851275 hasConcept C41008148 @default.
- W2095851275 hasConcept C48044578 @default.
- W2095851275 hasConcept C77088390 @default.
- W2095851275 hasConcept C80444323 @default.
- W2095851275 hasConceptScore W2095851275C110875604 @default.
- W2095851275 hasConceptScore W2095851275C111919701 @default.
- W2095851275 hasConceptScore W2095851275C124101348 @default.
- W2095851275 hasConceptScore W2095851275C132525143 @default.
- W2095851275 hasConceptScore W2095851275C136764020 @default.
- W2095851275 hasConceptScore W2095851275C13743948 @default.
- W2095851275 hasConceptScore W2095851275C189430467 @default.
- W2095851275 hasConceptScore W2095851275C23123220 @default.
- W2095851275 hasConceptScore W2095851275C2776257435 @default.
- W2095851275 hasConceptScore W2095851275C2779172887 @default.
- W2095851275 hasConceptScore W2095851275C2779960059 @default.
- W2095851275 hasConceptScore W2095851275C31258907 @default.
- W2095851275 hasConceptScore W2095851275C41008148 @default.
- W2095851275 hasConceptScore W2095851275C48044578 @default.
- W2095851275 hasConceptScore W2095851275C77088390 @default.
- W2095851275 hasConceptScore W2095851275C80444323 @default.
- W2095851275 hasLocation W20958512751 @default.
- W2095851275 hasLocation W20958512752 @default.
- W2095851275 hasOpenAccess W2095851275 @default.
- W2095851275 hasPrimaryLocation W20958512751 @default.
- W2095851275 hasRelatedWork W1501466029 @default.
- W2095851275 hasRelatedWork W1602553487 @default.
- W2095851275 hasRelatedWork W2056064491 @default.
- W2095851275 hasRelatedWork W2130476896 @default.
- W2095851275 hasRelatedWork W2169033019 @default.
- W2095851275 hasRelatedWork W2286791290 @default.
- W2095851275 hasRelatedWork W2327244120 @default.
- W2095851275 hasRelatedWork W3120511008 @default.
- W2095851275 hasRelatedWork W3207515464 @default.
- W2095851275 hasRelatedWork W2499948410 @default.
- W2095851275 isParatext "false" @default.
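
The listing above can be read programmatically. The following is a minimal sketch, assuming each match line has the shape `- <subject> <predicate> <object> @<graph>.` as displayed here; that shape is inferred from this page, not from a documented SemOpenAlex export format. The names `parse_line` and `group_by_predicate` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the listing, not an official format):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Literal values are shown in double quotes; strip them for convenience.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, keeping the listing order."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W2095851275> ?p ?o ?g. }',
    '- W2095851275 title "Agnostic topology-based spam avoidance in large-scale web crawls" @default.',
    '- W2095851275 cites W110443600 @default.',
    '- W2095851275 cites W135168798 @default.',
    '- W2095851275 hasPublicationYear "2011" @default.',
]

record = group_by_predicate(sample)
print(record["cites"])
```

Grouping by predicate keeps repeated properties such as `cites` or `hasConcept` together as lists, while single-valued properties such as `title` become one-element lists. The header line does not start with `- ` and is skipped.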