Matches in SemOpenAlex for { <https://semopenalex.org/work/W1974927094> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
- W1974927094 abstract "We suggest a way for locating duplicates and plagiarisms in a text collection using an R-measure, which is the normalized sum of the lengths of all suffixes of the text repeated in other documents of the collection. The R-measure can be effectively computed using the suffix array data structure. Additionally, the computation procedure can be improved to locate the sets of duplicate or plagiarised documents. We applied the technique to several standard text collections and found that they contained a significant number of duplicate and plagiarised documents. Another reformulation of the method leads to an algorithm that can be applied to supervised multi-class categorization. We illustrate the approach using the recently available Reuters Corpus Volume 1 (RCV1). The results show that the method outperforms SVM at multi-class categorization, and interestingly, that results correlate strongly with compression-based methods." @default.
- W1974927094 created "2016-06-24" @default.
- W1974927094 creator A5025875274 @default.
- W1974927094 creator A5050167629 @default.
- W1974927094 date "2003-07-28" @default.
- W1974927094 modified "2023-09-27" @default.
- W1974927094 title "A repetition based measure for verification of text collections and for text categorization" @default.
- W1974927094 cites W1608648538 @default.
- W1974927094 cites W1646816143 @default.
- W1974927094 cites W1726445723 @default.
- W1974927094 cites W1995875735 @default.
- W1974927094 cites W2092371627 @default.
- W1974927094 cites W2098162425 @default.
- W1974927094 cites W2158874082 @default.
- W1974927094 doi "https://doi.org/10.1145/860435.860456" @default.
- W1974927094 hasPublicationYear "2003" @default.
- W1974927094 type Work @default.
- W1974927094 sameAs 1974927094 @default.
- W1974927094 citedByCount "68" @default.
- W1974927094 countsByYear W19749270942012 @default.
- W1974927094 countsByYear W19749270942013 @default.
- W1974927094 countsByYear W19749270942014 @default.
- W1974927094 countsByYear W19749270942015 @default.
- W1974927094 countsByYear W19749270942016 @default.
- W1974927094 countsByYear W19749270942017 @default.
- W1974927094 countsByYear W19749270942018 @default.
- W1974927094 countsByYear W19749270942019 @default.
- W1974927094 countsByYear W19749270942020 @default.
- W1974927094 crossrefType "proceedings-article" @default.
- W1974927094 hasAuthorship W1974927094A5025875274 @default.
- W1974927094 hasAuthorship W1974927094A5050167629 @default.
- W1974927094 hasConcept C124101348 @default.
- W1974927094 hasConcept C138885662 @default.
- W1974927094 hasConcept C154945302 @default.
- W1974927094 hasConcept C204321447 @default.
- W1974927094 hasConcept C23123220 @default.
- W1974927094 hasConcept C2776141515 @default.
- W1974927094 hasConcept C2780009758 @default.
- W1974927094 hasConcept C2986744138 @default.
- W1974927094 hasConcept C41008148 @default.
- W1974927094 hasConcept C41895202 @default.
- W1974927094 hasConcept C94124525 @default.
- W1974927094 hasConceptScore W1974927094C124101348 @default.
- W1974927094 hasConceptScore W1974927094C138885662 @default.
- W1974927094 hasConceptScore W1974927094C154945302 @default.
- W1974927094 hasConceptScore W1974927094C204321447 @default.
- W1974927094 hasConceptScore W1974927094C23123220 @default.
- W1974927094 hasConceptScore W1974927094C2776141515 @default.
- W1974927094 hasConceptScore W1974927094C2780009758 @default.
- W1974927094 hasConceptScore W1974927094C2986744138 @default.
- W1974927094 hasConceptScore W1974927094C41008148 @default.
- W1974927094 hasConceptScore W1974927094C41895202 @default.
- W1974927094 hasConceptScore W1974927094C94124525 @default.
- W1974927094 hasLocation W19749270941 @default.
- W1974927094 hasOpenAccess W1974927094 @default.
- W1974927094 hasPrimaryLocation W19749270941 @default.
- W1974927094 hasRelatedWork W120966433 @default.
- W1974927094 hasRelatedWork W2100661451 @default.
- W1974927094 hasRelatedWork W2262858430 @default.
- W1974927094 hasRelatedWork W2365213443 @default.
- W1974927094 hasRelatedWork W2366911255 @default.
- W1974927094 hasRelatedWork W2385170969 @default.
- W1974927094 hasRelatedWork W3013319096 @default.
- W1974927094 hasRelatedWork W3026652378 @default.
- W1974927094 hasRelatedWork W3107474891 @default.
- W1974927094 hasRelatedWork W1793353708 @default.
- W1974927094 isParatext "false" @default.
- W1974927094 isRetracted "false" @default.
- W1974927094 magId "1974927094" @default.
- W1974927094 workType "article" @default.
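
The abstract above describes the R-measure only in words: a normalized sum, over all suffixes of a text, of the lengths repeated in other documents of the collection. A naive sketch of that idea follows. It rests on assumptions that are not in this record: each suffix contributes the length of its longest prefix found in any other document, and the sum is divided by l(l+1)/2 (the largest possible sum for a text of length l), so a fully duplicated text scores 1. The function names `longest_repeated_prefix` and `r_measure` are hypothetical, and the code uses plain substring search for clarity rather than the suffix array the abstract mentions.

```python
def longest_repeated_prefix(suffix, others):
    """Length of the longest prefix of `suffix` occurring in any string in `others`."""
    best = 0
    for doc in others:
        # Prefix occurrence is monotone: if a longer prefix occurs in doc,
        # every shorter one does too, so we can resume from the current best.
        k = best
        while k < len(suffix) and suffix[:k + 1] in doc:
            k += 1
        best = max(best, k)
    return best


def r_measure(text, others):
    """Naive R-measure sketch (assumed normalization: l * (l + 1) / 2)."""
    n = len(text)
    if n == 0:
        return 0.0
    # Sum the repeated length contributed by every suffix text[i:].
    total = sum(longest_repeated_prefix(text[i:], others) for i in range(n))
    return total / (n * (n + 1) / 2)


if __name__ == "__main__":
    print(r_measure("abc", ["xxabcxx"]))  # text fully contained in another document
    print(r_measure("abc", ["xyz"]))      # no shared characters
    print(r_measure("abcd", ["ab"]))      # partial overlap
```

This version is far too slow for real collections; it is meant only to make the definition concrete on toy inputs.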