Matches in SemOpenAlex for { <https://semopenalex.org/work/W4381489623> ?p ?o ?g. }
- W4381489623 abstract "Sketching methods offer computational biologists scalable techniques to analyze data sets that continue to grow in size. MinHash is one such technique to estimate set similarity that has enjoyed recent broad application. However, traditional MinHash has previously been shown to perform poorly when applied to sets of very dissimilar sizes. FracMinHash was recently introduced as a modification of MinHash to compensate for this lack of performance when set sizes differ. This approach has been successfully applied to metagenomic taxonomic profiling in the widely used tool sourmash gather. Although experimental evidence has been encouraging, FracMinHash has not yet been analyzed from a theoretical perspective. In this paper, we perform such an analysis to derive various statistics of FracMinHash, and prove that although FracMinHash is not unbiased (in the sense that its expected value is not equal to the quantity it attempts to estimate), this bias is easily corrected for both the containment and Jaccard index versions. Next, we show how FracMinHash can be used to compute point estimates as well as confidence intervals for evolutionary mutation distance between a pair of sequences by assuming a simple mutation model. We also investigate edge cases in which these analyses may fail, so as to effectively warn users of FracMinHash by indicating the likelihood of such cases. Our analyses show that FracMinHash estimates the containment of a genome in a large metagenome more accurately and more precisely compared with traditional MinHash, and the point estimates and confidence intervals perform significantly better in estimating mutation distances." @default.
- W4381489623 created "2023-06-22" @default.
- W4381489623 creator A5002668669 @default.
- W4381489623 creator A5050472000 @default.
- W4381489623 creator A5073893845 @default.
- W4381489623 date "2023-06-21" @default.
- W4381489623 modified "2023-10-17" @default.
- W4381489623 title "Deriving confidence intervals for mutation rates across a wide range of evolutionary distances using FracMinHash" @default.
- W4381489623 cites W1954100204 @default.
- W4381489623 cites W2046299964 @default.
- W4381489623 cites W2112174618 @default.
- W4381489623 cites W2112491476 @default.
- W4381489623 cites W2115546424 @default.
- W4381489623 cites W2127768708 @default.
- W4381489623 cites W2161692256 @default.
- W4381489623 cites W2206071891 @default.
- W4381489623 cites W2755302678 @default.
- W4381489623 cites W2789843538 @default.
- W4381489623 cites W2950150251 @default.
- W4381489623 cites W2950883119 @default.
- W4381489623 cites W2950964375 @default.
- W4381489623 cites W2951254987 @default.
- W4381489623 cites W2955167895 @default.
- W4381489623 cites W3083852019 @default.
- W4381489623 cites W3119188679 @default.
- W4381489623 cites W3200103613 @default.
- W4381489623 cites W3200242814 @default.
- W4381489623 cites W4221104280 @default.
- W4381489623 cites W4225999358 @default.
- W4381489623 doi "https://doi.org/10.1101/gr.277651.123" @default.
- W4381489623 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/37344105" @default.
- W4381489623 hasPublicationYear "2023" @default.
- W4381489623 type Work @default.
- W4381489623 citedByCount "4" @default.
- W4381489623 countsByYear W43814896232023 @default.
- W4381489623 crossrefType "journal-article" @default.
- W4381489623 hasAuthorship W4381489623A5002668669 @default.
- W4381489623 hasAuthorship W4381489623A5050472000 @default.
- W4381489623 hasAuthorship W4381489623A5073893845 @default.
- W4381489623 hasBestOaLocation W43814896231 @default.
- W4381489623 hasConcept C104317684 @default.
- W4381489623 hasConcept C105795698 @default.
- W4381489623 hasConcept C119857082 @default.
- W4381489623 hasConcept C124101348 @default.
- W4381489623 hasConcept C15151743 @default.
- W4381489623 hasConcept C159985019 @default.
- W4381489623 hasConcept C177264268 @default.
- W4381489623 hasConcept C192562407 @default.
- W4381489623 hasConcept C199360897 @default.
- W4381489623 hasConcept C203519979 @default.
- W4381489623 hasConcept C204323151 @default.
- W4381489623 hasConcept C33923547 @default.
- W4381489623 hasConcept C41008148 @default.
- W4381489623 hasConcept C44249647 @default.
- W4381489623 hasConcept C54355233 @default.
- W4381489623 hasConcept C70721500 @default.
- W4381489623 hasConcept C73555534 @default.
- W4381489623 hasConcept C86803240 @default.
- W4381489623 hasConceptScore W4381489623C104317684 @default.
- W4381489623 hasConceptScore W4381489623C105795698 @default.
- W4381489623 hasConceptScore W4381489623C119857082 @default.
- W4381489623 hasConceptScore W4381489623C124101348 @default.
- W4381489623 hasConceptScore W4381489623C15151743 @default.
- W4381489623 hasConceptScore W4381489623C159985019 @default.
- W4381489623 hasConceptScore W4381489623C177264268 @default.
- W4381489623 hasConceptScore W4381489623C192562407 @default.
- W4381489623 hasConceptScore W4381489623C199360897 @default.
- W4381489623 hasConceptScore W4381489623C203519979 @default.
- W4381489623 hasConceptScore W4381489623C204323151 @default.
- W4381489623 hasConceptScore W4381489623C33923547 @default.
- W4381489623 hasConceptScore W4381489623C41008148 @default.
- W4381489623 hasConceptScore W4381489623C44249647 @default.
- W4381489623 hasConceptScore W4381489623C54355233 @default.
- W4381489623 hasConceptScore W4381489623C70721500 @default.
- W4381489623 hasConceptScore W4381489623C73555534 @default.
- W4381489623 hasConceptScore W4381489623C86803240 @default.
- W4381489623 hasFunder F4320306076 @default.
- W4381489623 hasFunder F4320332161 @default.
- W4381489623 hasLocation W43814896231 @default.
- W4381489623 hasLocation W43814896232 @default.
- W4381489623 hasOpenAccess W4381489623 @default.
- W4381489623 hasPrimaryLocation W43814896231 @default.
- W4381489623 hasRelatedWork W1910109602 @default.
- W4381489623 hasRelatedWork W2124986934 @default.
- W4381489623 hasRelatedWork W2182035395 @default.
- W4381489623 hasRelatedWork W2210738511 @default.
- W4381489623 hasRelatedWork W2211814384 @default.
- W4381489623 hasRelatedWork W2474302342 @default.
- W4381489623 hasRelatedWork W2620201621 @default.
- W4381489623 hasRelatedWork W2792591449 @default.
- W4381489623 hasRelatedWork W2971474170 @default.
- W4381489623 hasRelatedWork W73805934 @default.
- W4381489623 isParatext "false" @default.
- W4381489623 isRetracted "false" @default.
- W4381489623 workType "article" @default.