Matches in SemOpenAlex for { <https://semopenalex.org/work/W2894811571> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W2894811571 abstract "The importance of an efficient and scalable document similarity detection system is undeniable nowadays. Search engines need batch text similarity measures to detect duplicated and near-duplicated web pages in their indexes in order to prevent indexing a web page multiple times. Furthermore, in the scoring phase, search engines need similarity measures to detect duplicated contents on web pages so as to increase the quality of their results. In this paper, a new approach to batch text similarity detection is proposed by combining some ideas from dimensionality reduction techniques and information gain theory. The new approach is focused on search engines' need to detect duplicated and near-duplicated web pages. The new approach is evaluated on the NEWS20 dataset and the results show that it outperforms the cosine text similarity algorithm in terms of speed and performance. On top of that, it is faster and more accurate than the other rival method, the Simhash similarity algorithm." @default.
- W2894811571 created "2018-10-12" @default.
- W2894811571 creator A5032645435 @default.
- W2894811571 creator A5051951445 @default.
- W2894811571 date "2018-10-07" @default.
- W2894811571 modified "2023-09-23" @default.
- W2894811571 title "Multi-reference Cosine: A New Approach to Text Similarity Measurement in Large Collections." @default.
- W2894811571 cites W1515859748 @default.
- W2894811571 cites W1971226872 @default.
- W2894811571 cites W1987365175 @default.
- W2894811571 cites W1991555183 @default.
- W2894811571 cites W2007842132 @default.
- W2894811571 cites W2013761541 @default.
- W2894811571 cites W2019076926 @default.
- W2894811571 cites W2084528001 @default.
- W2894811571 cites W2111549955 @default.
- W2894811571 cites W2143996849 @default.
- W2894811571 cites W2145349611 @default.
- W2894811571 cites W2152565070 @default.
- W2894811571 cites W2270234620 @default.
- W2894811571 cites W2537973009 @default.
- W2894811571 cites W2779237383 @default.
- W2894811571 cites W2886643495 @default.
- W2894811571 cites W8870360 @default.
- W2894811571 hasPublicationYear "2018" @default.
- W2894811571 type Work @default.
- W2894811571 sameAs 2894811571 @default.
- W2894811571 citedByCount "1" @default.
- W2894811571 countsByYear W28948115712018 @default.
- W2894811571 crossrefType "posted-content" @default.
- W2894811571 hasAuthorship W2894811571A5032645435 @default.
- W2894811571 hasAuthorship W2894811571A5051951445 @default.
- W2894811571 hasConcept C103278499 @default.
- W2894811571 hasConcept C115961682 @default.
- W2894811571 hasConcept C116738811 @default.
- W2894811571 hasConcept C124101348 @default.
- W2894811571 hasConcept C136764020 @default.
- W2894811571 hasConcept C153180895 @default.
- W2894811571 hasConcept C154945302 @default.
- W2894811571 hasConcept C178009071 @default.
- W2894811571 hasConcept C21959979 @default.
- W2894811571 hasConcept C23123220 @default.
- W2894811571 hasConcept C2524010 @default.
- W2894811571 hasConcept C2780762811 @default.
- W2894811571 hasConcept C33923547 @default.
- W2894811571 hasConcept C41008148 @default.
- W2894811571 hasConcept C48044578 @default.
- W2894811571 hasConcept C75165309 @default.
- W2894811571 hasConcept C77088390 @default.
- W2894811571 hasConcept C97854310 @default.
- W2894811571 hasConceptScore W2894811571C103278499 @default.
- W2894811571 hasConceptScore W2894811571C115961682 @default.
- W2894811571 hasConceptScore W2894811571C116738811 @default.
- W2894811571 hasConceptScore W2894811571C124101348 @default.
- W2894811571 hasConceptScore W2894811571C136764020 @default.
- W2894811571 hasConceptScore W2894811571C153180895 @default.
- W2894811571 hasConceptScore W2894811571C154945302 @default.
- W2894811571 hasConceptScore W2894811571C178009071 @default.
- W2894811571 hasConceptScore W2894811571C21959979 @default.
- W2894811571 hasConceptScore W2894811571C23123220 @default.
- W2894811571 hasConceptScore W2894811571C2524010 @default.
- W2894811571 hasConceptScore W2894811571C2780762811 @default.
- W2894811571 hasConceptScore W2894811571C33923547 @default.
- W2894811571 hasConceptScore W2894811571C41008148 @default.
- W2894811571 hasConceptScore W2894811571C48044578 @default.
- W2894811571 hasConceptScore W2894811571C75165309 @default.
- W2894811571 hasConceptScore W2894811571C77088390 @default.
- W2894811571 hasConceptScore W2894811571C97854310 @default.
- W2894811571 hasLocation W28948115711 @default.
- W2894811571 hasOpenAccess W2894811571 @default.
- W2894811571 hasPrimaryLocation W28948115711 @default.
- W2894811571 hasRelatedWork W1493309334 @default.
- W2894811571 hasRelatedWork W1571338321 @default.
- W2894811571 hasRelatedWork W1966234817 @default.
- W2894811571 hasRelatedWork W1979821252 @default.
- W2894811571 hasRelatedWork W2006373804 @default.
- W2894811571 hasRelatedWork W2010746350 @default.
- W2894811571 hasRelatedWork W2049677883 @default.
- W2894811571 hasRelatedWork W2100658491 @default.
- W2894811571 hasRelatedWork W2109803107 @default.
- W2894811571 hasRelatedWork W2121166119 @default.
- W2894811571 hasRelatedWork W2131815873 @default.
- W2894811571 hasRelatedWork W2295318420 @default.
- W2894811571 hasRelatedWork W2361034710 @default.
- W2894811571 hasRelatedWork W2795133481 @default.
- W2894811571 hasRelatedWork W2894561747 @default.
- W2894811571 hasRelatedWork W2908099798 @default.
- W2894811571 hasRelatedWork W3003963173 @default.
- W2894811571 hasRelatedWork W3089188193 @default.
- W2894811571 hasRelatedWork W3184394705 @default.
- W2894811571 hasRelatedWork W2559815012 @default.
- W2894811571 isParatext "false" @default.
- W2894811571 isRetracted "false" @default.
- W2894811571 magId "2894811571" @default.
- W2894811571 workType "article" @default.