Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226366628> ?p ?o ?g. }
Showing items 1 to 99 of 99 with 100 items per page.
- W4226366628 abstract "Comparative analysis of Clostridioides difficile whole-genome sequencing (WGS) data enables fine scaled investigation of transmission and is increasingly becoming part of routine surveillance. However, these analyses are constrained by the computational requirements of the large volumes of data involved. By decomposing WGS reads or assemblies into k-mers and using the dimensionality reduction technique MinHash, it is possible to rapidly approximate genomic distances without alignment. Here we assessed the performance of MinHash, as implemented by sourmash, in predicting single nucleotide differences between genomes (SNPs) and C. difficile ribotypes (RTs). For a set of 1905 diverse C. difficile genomes (differing by 0-168 519 SNPs), using sourmash to screen for closely related genomes, at a sensitivity of 100 % for pairs ≤10 SNPs, sourmash reduced the number of pairs from 1 813 560 overall to 161 934, i.e. by 91 %, with a positive predictive value of 32 % to correctly identify pairs ≤10 SNPs (maximum SNP distance 4144). At a sensitivity of 95 %, pairs were reduced by 94 % to 108 266 and PPV increased to 45 % (maximum SNP distance 1009). Increasing the MinHash sketch size above 2000 produced minimal performance improvement. We also explored a MinHash similarity-based ribotype prediction method. Genomes with known ribotypes (n=3937) were split into a training set (2937) and test set (1000) randomly. The training set was used to construct a sourmash index against which genomes from the test set were compared. If the closest five genomes in the index had the same ribotype this was taken to predict the searched genome's ribotype. Using our MinHash ribotype index, predicted ribotypes were correct in 780/1000 (78 %) genomes, incorrect in 20 (2 %), and indeterminant in 200 (20 %). Relaxing the classifier to 4/5 closest matches with the same RT improved the correct predictions to 87 %. Using MinHash it is possible to subsample C. difficile genome k-mer hashes and use them to approximate small genomic differences within minutes, significantly reducing the search space for further analysis." @default.
- W4226366628 created "2022-05-05" @default.
- W4226366628 creator A5012297229 @default.
- W4226366628 creator A5027928189 @default.
- W4226366628 creator A5044933203 @default.
- W4226366628 creator A5064800233 @default.
- W4226366628 date "2022-04-06" @default.
- W4226366628 modified "2023-09-26" @default.
- W4226366628 title "K-mer based prediction of Clostridioides difficile relatedness and ribotypes" @default.
- W4226366628 cites W1954100204 @default.
- W4226366628 cites W2007890464 @default.
- W4226366628 cites W2036897871 @default.
- W4226366628 cites W2092229850 @default.
- W4226366628 cites W2104491546 @default.
- W4226366628 cites W2107772251 @default.
- W4226366628 cites W2107853781 @default.
- W4226366628 cites W2122673596 @default.
- W4226366628 cites W2125343055 @default.
- W4226366628 cites W2149753281 @default.
- W4226366628 cites W2157539385 @default.
- W4226366628 cites W2160969485 @default.
- W4226366628 cites W2394763400 @default.
- W4226366628 cites W2519890620 @default.
- W4226366628 cites W2528399578 @default.
- W4226366628 cites W2551091376 @default.
- W4226366628 cites W2574133781 @default.
- W4226366628 cites W2618169018 @default.
- W4226366628 cites W2745326949 @default.
- W4226366628 cites W2796063798 @default.
- W4226366628 cites W282286282 @default.
- W4226366628 cites W2837231096 @default.
- W4226366628 cites W2871836482 @default.
- W4226366628 cites W2884834061 @default.
- W4226366628 cites W2895137000 @default.
- W4226366628 cites W2911654744 @default.
- W4226366628 cites W2922101311 @default.
- W4226366628 cites W2949236085 @default.
- W4226366628 cites W2950150251 @default.
- W4226366628 cites W2950964142 @default.
- W4226366628 cites W2951426855 @default.
- W4226366628 cites W2953600701 @default.
- W4226366628 cites W2987650093 @default.
- W4226366628 cites W2989052591 @default.
- W4226366628 cites W2992400060 @default.
- W4226366628 cites W3002979846 @default.
- W4226366628 cites W3045684808 @default.
- W4226366628 cites W3089125281 @default.
- W4226366628 doi "https://doi.org/10.1099/mgen.0.000804" @default.
- W4226366628 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/35384833" @default.
- W4226366628 hasPublicationYear "2022" @default.
- W4226366628 type Work @default.
- W4226366628 citedByCount "0" @default.
- W4226366628 crossrefType "journal-article" @default.
- W4226366628 hasAuthorship W4226366628A5012297229 @default.
- W4226366628 hasAuthorship W4226366628A5027928189 @default.
- W4226366628 hasAuthorship W4226366628A5044933203 @default.
- W4226366628 hasAuthorship W4226366628A5064800233 @default.
- W4226366628 hasBestOaLocation W42263666281 @default.
- W4226366628 hasConcept C104317684 @default.
- W4226366628 hasConcept C135763542 @default.
- W4226366628 hasConcept C139275648 @default.
- W4226366628 hasConcept C141231307 @default.
- W4226366628 hasConcept C153209595 @default.
- W4226366628 hasConcept C54355233 @default.
- W4226366628 hasConcept C70721500 @default.
- W4226366628 hasConcept C86803240 @default.
- W4226366628 hasConceptScore W4226366628C104317684 @default.
- W4226366628 hasConceptScore W4226366628C135763542 @default.
- W4226366628 hasConceptScore W4226366628C139275648 @default.
- W4226366628 hasConceptScore W4226366628C141231307 @default.
- W4226366628 hasConceptScore W4226366628C153209595 @default.
- W4226366628 hasConceptScore W4226366628C54355233 @default.
- W4226366628 hasConceptScore W4226366628C70721500 @default.
- W4226366628 hasConceptScore W4226366628C86803240 @default.
- W4226366628 hasFunder F4320319990 @default.
- W4226366628 hasIssue "4" @default.
- W4226366628 hasLocation W42263666281 @default.
- W4226366628 hasLocation W42263666282 @default.
- W4226366628 hasLocation W42263666283 @default.
- W4226366628 hasLocation W42263666284 @default.
- W4226366628 hasLocation W42263666285 @default.
- W4226366628 hasLocation W42263666286 @default.
- W4226366628 hasLocation W42263666287 @default.
- W4226366628 hasOpenAccess W4226366628 @default.
- W4226366628 hasPrimaryLocation W42263666281 @default.
- W4226366628 hasRelatedWork W1199774223 @default.
- W4226366628 hasRelatedWork W1517693310 @default.
- W4226366628 hasRelatedWork W1901150897 @default.
- W4226366628 hasRelatedWork W1966708416 @default.
- W4226366628 hasRelatedWork W2084659082 @default.
- W4226366628 hasRelatedWork W2107417504 @default.
- W4226366628 hasRelatedWork W2145432896 @default.
- W4226366628 hasRelatedWork W2327017833 @default.
- W4226366628 hasRelatedWork W2379353909 @default.
- W4226366628 hasRelatedWork W860817473 @default.
- W4226366628 hasVolume "8" @default.
- W4226366628 isParatext "false" @default.
- W4226366628 isRetracted "false" @default.
- W4226366628 workType "article" @default.
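
The abstract in this record describes two steps that may be easier to follow as code: estimating genome similarity from a subsample of k-mer hashes (MinHash), and predicting a ribotype only when the closest five indexed genomes agree. The following is a minimal sketch of both ideas, not sourmash's implementation or the authors' pipeline. The function names, the BLAKE2 hash, and the k-mer length of 21 are assumptions made for illustration; only the sketch size of 2000 and the 5/5 and 4/5 agreement rules come from the abstract.

```python
import hashlib
from collections import Counter


def kmers(seq, k):
    """Yield every overlapping k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def sketch(seq, k=21, size=2000):
    """Bottom-k MinHash sketch: the `size` smallest distinct k-mer hashes.

    k=21 is an assumed value; BLAKE2 stands in for whatever hash a real tool uses.
    """
    hashes = {
        int.from_bytes(hashlib.blake2b(km.encode(), digest_size=8).digest(), "big")
        for km in kmers(seq, k)
    }
    return sorted(hashes)[:size]


def jaccard_estimate(a, b, size=2000):
    """Estimate Jaccard similarity from two bottom-k sketches."""
    set_a, set_b = set(a), set(b)
    # Take the `size` smallest hashes of the union, then count how many
    # of them are present in both sketches.
    merged = sorted(set_a | set_b)[:size]
    if not merged:
        return 0.0
    shared = sum(1 for h in merged if h in set_a and h in set_b)
    return shared / len(merged)


def predict_ribotype(query, index, n_closest=5, min_agree=5, size=2000):
    """Return the ribotype shared by the closest matches, or None if indeterminate.

    `index` is a list of (sketch, ribotype) pairs (a hypothetical in-memory
    stand-in for a search index). min_agree=5 is the strict rule from the
    abstract; min_agree=4 is the relaxed 4/5 rule.
    """
    ranked = sorted(
        index,
        key=lambda entry: jaccard_estimate(query, entry[0], size),
        reverse=True,
    )
    top = [ribotype for _, ribotype in ranked[:n_closest]]
    if not top:
        return None
    ribotype, count = Counter(top).most_common(1)[0]
    return ribotype if count >= min_agree else None
```

Calling `predict_ribotype(query_sketch, index, min_agree=4)` applies the relaxed rule. A return value of `None` corresponds to the "indeterminant" outcome reported in the abstract.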