Matches in SemOpenAlex for { <https://semopenalex.org/work/W2913663886> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W2913663886 abstract "Motivation: Variant discovery is crucial in medical and clinical research, especially in the setting of personalized medicine. As such, precision in variant identification is paramount. However, variants identified by current genomic analysis pipelines contain many false positives (i.e., incorrectly called variants). These can potentially be eliminated by applying state-of-the-art filtering tools, such as the Variant Quality Score Recalibration (VQSR) or the Hard Filtering (HF), both proposed by GATK. However, these methods are very user-dependent and fail to run in some cases. We propose VEF, a variant filtering tool based on ensemble methods that overcomes the main drawbacks of VQSR and HF. Contrary to these methods, we treat filtering as a supervised learning problem. This is possible by training on variant call data for which the set of “true” variants is known, i.e., a gold standard exists. Hence, we can classify each variant in the training VCF file as true or false using the gold standard, and further use the annotations of each variant as features for the classification problem. Once trained, VEF can be directly applied to filter the variants contained in a given VCF file. Analysis of several ensemble methods revealed random forest as offering the best performance, and hence VEF uses a random forest for the classification task. Results: After training VEF on a Whole Genome Sequencing (WGS) Human dataset of sample NA12878, we tested its performance on a WGS Human dataset of sample NA24385. For these two samples, the set of high-confidence variants has been produced and made available. Results show that the proposed filtering tool VEF consistently outperforms VQSR and HF. In addition, we show that VEF generalizes well even when some features have missing values, and when the training and testing datasets differ either in coverage or in the sequencing machine that was used to generate the data. Finally, since the training needs to be performed only once, there is a significant saving in running time when compared to VQSR (approximately 50 minutes versus 4 minutes for filtering the SNPs of WGS Human sample NA24385). Code and scripts available at: github.com/ChuanyiZ/vef." @default.
- W2913663886 created "2019-02-21" @default.
- W2913663886 creator A5010005244 @default.
- W2913663886 creator A5045294377 @default.
- W2913663886 date "2019-02-05" @default.
- W2913663886 modified "2023-09-23" @default.
- W2913663886 title "VEF: a Variant Filtering tool based on Ensemble methods" @default.
- W2913663886 cites W1678356000 @default.
- W2913663886 cites W1988790447 @default.
- W2913663886 cites W2058401000 @default.
- W2913663886 cites W2104549677 @default.
- W2913663886 cites W2108234281 @default.
- W2913663886 cites W2147733682 @default.
- W2913663886 cites W2149992227 @default.
- W2913663886 cites W2168133698 @default.
- W2913663886 cites W2791837701 @default.
- W2913663886 cites W2792286521 @default.
- W2913663886 cites W2867868648 @default.
- W2913663886 doi "https://doi.org/10.1101/540286" @default.
- W2913663886 hasPublicationYear "2019" @default.
- W2913663886 type Work @default.
- W2913663886 sameAs 2913663886 @default.
- W2913663886 citedByCount "1" @default.
- W2913663886 countsByYear W29136638862019 @default.
- W2913663886 crossrefType "posted-content" @default.
- W2913663886 hasAuthorship W2913663886A5010005244 @default.
- W2913663886 hasAuthorship W2913663886A5045294377 @default.
- W2913663886 hasBestOaLocation W29136638861 @default.
- W2913663886 hasConcept C105795698 @default.
- W2913663886 hasConcept C106131492 @default.
- W2913663886 hasConcept C116834253 @default.
- W2913663886 hasConcept C119857082 @default.
- W2913663886 hasConcept C124101348 @default.
- W2913663886 hasConcept C153180895 @default.
- W2913663886 hasConcept C154945302 @default.
- W2913663886 hasConcept C162324750 @default.
- W2913663886 hasConcept C169258074 @default.
- W2913663886 hasConcept C177264268 @default.
- W2913663886 hasConcept C185592680 @default.
- W2913663886 hasConcept C187736073 @default.
- W2913663886 hasConcept C198531522 @default.
- W2913663886 hasConcept C199360897 @default.
- W2913663886 hasConcept C2780451532 @default.
- W2913663886 hasConcept C2989486834 @default.
- W2913663886 hasConcept C31972630 @default.
- W2913663886 hasConcept C33923547 @default.
- W2913663886 hasConcept C40993552 @default.
- W2913663886 hasConcept C41008148 @default.
- W2913663886 hasConcept C43617362 @default.
- W2913663886 hasConcept C45942800 @default.
- W2913663886 hasConcept C59822182 @default.
- W2913663886 hasConcept C64869954 @default.
- W2913663886 hasConcept C86803240 @default.
- W2913663886 hasConceptScore W2913663886C105795698 @default.
- W2913663886 hasConceptScore W2913663886C106131492 @default.
- W2913663886 hasConceptScore W2913663886C116834253 @default.
- W2913663886 hasConceptScore W2913663886C119857082 @default.
- W2913663886 hasConceptScore W2913663886C124101348 @default.
- W2913663886 hasConceptScore W2913663886C153180895 @default.
- W2913663886 hasConceptScore W2913663886C154945302 @default.
- W2913663886 hasConceptScore W2913663886C162324750 @default.
- W2913663886 hasConceptScore W2913663886C169258074 @default.
- W2913663886 hasConceptScore W2913663886C177264268 @default.
- W2913663886 hasConceptScore W2913663886C185592680 @default.
- W2913663886 hasConceptScore W2913663886C187736073 @default.
- W2913663886 hasConceptScore W2913663886C198531522 @default.
- W2913663886 hasConceptScore W2913663886C199360897 @default.
- W2913663886 hasConceptScore W2913663886C2780451532 @default.
- W2913663886 hasConceptScore W2913663886C2989486834 @default.
- W2913663886 hasConceptScore W2913663886C31972630 @default.
- W2913663886 hasConceptScore W2913663886C33923547 @default.
- W2913663886 hasConceptScore W2913663886C40993552 @default.
- W2913663886 hasConceptScore W2913663886C41008148 @default.
- W2913663886 hasConceptScore W2913663886C43617362 @default.
- W2913663886 hasConceptScore W2913663886C45942800 @default.
- W2913663886 hasConceptScore W2913663886C59822182 @default.
- W2913663886 hasConceptScore W2913663886C64869954 @default.
- W2913663886 hasConceptScore W2913663886C86803240 @default.
- W2913663886 hasLocation W29136638861 @default.
- W2913663886 hasOpenAccess W2913663886 @default.
- W2913663886 hasPrimaryLocation W29136638861 @default.
- W2913663886 hasRelatedWork W2883828728 @default.
- W2913663886 hasRelatedWork W2913663886 @default.
- W2913663886 hasRelatedWork W2973799232 @default.
- W2913663886 hasRelatedWork W3005055299 @default.
- W2913663886 hasRelatedWork W3126015411 @default.
- W2913663886 hasRelatedWork W3170784702 @default.
- W2913663886 hasRelatedWork W4281560664 @default.
- W2913663886 hasRelatedWork W4282839226 @default.
- W2913663886 hasRelatedWork W4283016678 @default.
- W2913663886 hasRelatedWork W4297107051 @default.
- W2913663886 isParatext "false" @default.
- W2913663886 isRetracted "false" @default.
- W2913663886 magId "2913663886" @default.
- W2913663886 workType "article" @default.
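
The abstract above describes labeling each variant in a training VCF file as true or false against a gold standard, and using the variant's annotations as features. A minimal sketch of that labeling step follows. It is not taken from the VEF code base: the function names `parse_vcf_line` and `label_variants`, the choice of (chromosome, position, REF, ALT) as the matching key, and the decision to keep only numeric single-valued INFO annotations are all assumptions made for illustration. The random forest training itself is omitted.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical matching key: (chromosome, position, REF allele, ALT allele).
VariantKey = Tuple[str, int, str, str]


def parse_vcf_line(line: str) -> Tuple[VariantKey, Dict[str, float]]:
    """Split one VCF data line into a variant key and its numeric INFO annotations."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, _filter, info = fields[:8]
    features: Dict[str, float] = {}
    for entry in info.split(";"):
        name, sep, value = entry.partition("=")
        if not sep:
            continue  # flag annotations (no "=") carry no value
        try:
            features[name] = float(value)
        except ValueError:
            continue  # assumption: skip non-numeric or multi-valued annotations
    return (chrom, int(pos), ref, alt), features


def label_variants(
    vcf_lines: Iterable[str], gold: Set[VariantKey]
) -> List[Tuple[Dict[str, float], bool]]:
    """Pair each variant's features with True if it appears in the gold standard."""
    rows: List[Tuple[Dict[str, float], bool]] = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        key, features = parse_vcf_line(line)
        rows.append((features, key in gold))
    return rows
```

The resulting (features, label) pairs are the kind of input a supervised classifier would be trained on. An annotation that is absent from a record is simply left out of its feature dictionary, so how missing values are handled downstream remains an open choice in this sketch.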