Matches in SemOpenAlex for { <https://semopenalex.org/work/W2969569897> ?p ?o ?g. }
- W2969569897 endingPage "1094" @default.
- W2969569897 startingPage "1079" @default.
- W2969569897 abstract "Both multiplexing and target-enrichment technologies are key to reducing the cost of genetic testing using next-generation sequencing (NGS). Many diagnostic laboratories routinely handle thousands of targeted resequencing samples, leading to an increased risk of accidental sample mix-ups and cross contamination. Herein, we present a short DNA fragment that can be spiked into the original genomic DNA (gDNA) or whole blood sample and tracked through to the final targeted resequencing data.
This DNA fragment comprises a 15-bp unique index sequence assembled with a 120-bp fixed sequence designed for recovery in a hybridization capture reaction. In a pilot study, the yield of the recovered probe was examined in a step-by-step genetic testing procedure, involving gDNA isolation from whole blood, library preparation for NGS, and capture hybridization. On the basis of the results, 10 fmol (6 × 10⁹ molecules) and 10 amol (6 × 10⁶ molecules) of the spike-in probe were estimated to be suitable for DNA and RNA probe–based library preparation and target enrichment from 200 ng (6.5 × 10⁴ copies) gDNA, respectively. In fact, the number of NGS reads corresponding to the spike-in probe was almost equal to that corresponding to the genomic target regions and was sufficient for evaluating sample identification and cross-contamination events. Hence, this method may be useful for enhancing quality assurance in clinical genetic testing. Next-generation sequencing (NGS) is a powerful technology for interrogating the human genome and efficiently determining the molecular diagnosis of various inherited diseases. [1] Gonzaga-Jauregui C. Lupski J.R. Gibbs R.A. Human genome sequencing in health and disease. Annu Rev Med. 2012; 63: 35-61. [2] Gullapalli R.R. Desai K.V. Santana-Santos L. Kant J.A. Becich M.J. Next generation sequencing in clinical medicine: challenges and lessons for pathology and biomedical informatics. J Pathol Inform. 2012; 3: 40. [3] Metzker M.L. Sequencing technologies: the next generation. Nat Rev Genet. 2010; 11: 31-46. NGS provides applications for examining the entire human genome as well as more targeted regions. As for targeted resequencing, whole-exome sequencing is suitable for molecular diagnosis of heterogeneous diseases and discovery of novel disease-related genes. [4] Bamshad M.J. Ng S.B. Bigham A.W. Tabor H.K. Emond M.J. Nickerson D.A.
Shendure J. Exome sequencing as a tool for Mendelian disease gene discovery. Nat Rev Genet. 2011; 12: 745-755. Nevertheless, gene panel sequencing (GPS) offers several advantages over whole-exome sequencing. [5] Jamuar S.S. Tan E.C. Clinical application of next-generation sequencing for Mendelian diseases. Hum Genomics. 2015; 9: 10. [6] Kalia S.S. Adelman K. Bale S.J. Chung W.K. Eng C. Evans J.P. Herman G.E. Hufnagel S.B. Klein T.E. Korf B.R. McKelvey K.D. Ormond K.E. Richards C.S. Vlangos C.N. Watson M. Martin C.L. Miller D.T. Recommendations for reporting of secondary findings in clinical exome and genome sequencing, 2016 update (ACMG SF v2.0): a policy statement of the American College of Medical Genetics and Genomics. Genet Med. 2017; 19: 249-255. Despite the decreasing costs of NGS, whole-exome sequencing remains expensive for diagnostic purposes. In addition, whole-exome sequencing often raises ethical issues by increasing the risk of incidental findings. Thus, GPS, which is individually customized by choosing a combination of coding exons to evaluate, has become the standard in clinical practice for evaluating monogenic disorders. [7] Lee T. Misaki M. Shimomura H. Tanaka Y. Yoshida S. Murayama K. Nakamura K. Fujiki R. Ohara O. Sasai H. Fukao T. Takeshima Y. Late-onset ornithine transcarbamylase deficiency caused by a somatic mosaic mutation. Hum Genome Var. 2018; 5: 22. [8] Maeda A. Yoshida A. Kawai K. Arai Y. Akiba R. Inaba A. Takagi S. Fujiki R. Hirami Y. Kurimoto Y. Ohara O. Takahashi M. Development of a molecular diagnostic test for retinitis pigmentosa in the Japanese population. Jpn J Ophthalmol. 2018; 62: 451-457. [9] Sasai H. Aoyama Y. Otsuka H. Abdelkreem E. Naiki Y. Kubota M. Sekine Y. Itoh M. Nakama M. Ohnishi H. Fujiki R. Ohara O.
Fukao T. Heterozygous carriers of succinyl-CoA:3-oxoacid CoA transferase deficiency can develop severe ketoacidosis. J Inherit Metab Dis. 2017; 40: 845-852. [10] Tajima G. Hara K. Tsumura M. Kagawa R. Okada S. Sakura N. Maruyama S. Noguchi A. Awaya T. Ishige M. Ishige N. Musha I. Ajihara S. Ohtake A. Naito E. Hamada Y. Kono T. Asada T. Sasai H. Fukao T. Fujiki R. Ohara O. Bo R. Yamada K. Kobayashi H. Hasegawa Y. Yamaguchi S. Takayanagi M. Hata I. Shigematsu Y. Kobayashi M. Newborn screening for carnitine palmitoyltransferase II deficiency using (C16+C18:1)/C2: evaluation of additional indices for adequate sensitivity and lower false-positivity. Mol Genet Metab. 2017; 122: 67-75. [11] Takano C. Ishige M. Ogawa E. Usui H. Kagawa R. Tajima G. Fujiki R. Fukao T. Mizuta K. Fuchigami T. Takahashi S. A case of classical maple syrup urine disease that was successfully managed by living donor liver transplantation. Pediatr Transplant. 2017; 21 (e12948). [12] Wada Y. Kikuchi A. Arai-Ichinoi N. Sakamoto O. Takezawa Y. Iwasawa S. Niihori T. Nyuzuki H. Nakajima Y. Ogawa E. Ishige M. Hirai H. Sasai H. Fujiki R. Shirota M. Funayama R. Yamamoto M. Ito T. Ohara O. Nakayama K. Aoki Y. Koshiba S. Fukao T. Kure S. Biallelic GALM pathogenic variants cause a novel type of galactosemia. Genet Med. 2018; 21: 1286-1294. Currently, many diagnostic laboratories handle thousands of GPS samples. Sample mix-ups have become a serious problem, impeding appropriate clinical decisions. Although laboratories prepare elaborate tracking systems involving barcoding and automated sample handling, it is difficult to prevent accidental human errors completely, such as sample tube and plate swapping. In fact, reconfirmatory analyses of GenBank data suggest that erroneous registrations are not rare. [13] Shen Y.Y. Chen X. Murphy R.W.
Assessing DNA barcoding as a tool for species identification and data quality control. PLoS One. 2013; 8: e57125. To prevent these critical errors, many diagnostic laboratories perform confirmatory routine procedures, such as Sanger sequencing of NGS results. [14] Matthijs G. Souche E. Alders M. Corveleyn A. Eck S. Feenstra I. Race V. Sistermans E. Sturm M. Weiss M. Yntema H. Bakker E. Scheffer H. Bauer P. EuroGentest; European Society of Human Genetics. Guidelines for diagnostic next-generation sequencing. Eur J Hum Genet. 2016; 24: 2-5. [15] Rehm H.L. Bale S.J. Bayrak-Toydemir P. Berg J.S. Brown K.K. Deignan J.L. Friez M.J. Funke B.H. Hegde M.R. Lyon E. Working Group of the American College of Medical Genetics and Genomics Laboratory Quality Assurance Committee. ACMG clinical laboratory standards for next-generation sequencing. Genet Med. 2013; 15: 733-747. This approach, however, is redundant and expensive, and the extra work often severely delays results. Furthermore, such procedures cannot catch variants missed because of accidental sample swapping, because confirmation is in many cases restricted to clinically actionable variants. In addition to target enrichment, multiplexing is another key to reducing the costs and workload of clinical GPS. [16] Craig D.W. Pearson J.V. Szelinger S. Sekar A. Redman M. Corneveaux J.J. Pawlowski T.L. Laub T. Nunn G. Stephan D.A. Homer N. Huentelman M.J. Identification of genetic variants using bar-coded multiplexed sequencing. Nat Methods. 2008; 5: 887-893. Hybridization capture probes are the most expensive of the reagents used to prepare target-enriched NGS libraries. Therefore, library pooling before hybridization enables a reduction in costs in proportion to the number of pooled libraries. [17] Mamanova L. Coffey A.J.
Scott C.E. Kozarewa I. Turner E.H. Kumar A. Howard E. Shendure J. Turner D.J. Target-enrichment strategies for next-generation sequencing. Nat Methods. 2010; 7: 111-118. [18] Meyer M. Stenzel U. Myles S. Prufer K. Hofreiter M. Targeted high-throughput sequencing of tagged nucleic acid samples. Nucleic Acids Res. 2007; 35: e97. However, multiplexing in a target-enrichment procedure often leads to an overlooked bias, as the multiplexed libraries are prone to incorporation of different index sequences from the other samples. [19] Larsson A.J.M. Stanley G. Sinha R. Weissman I.L. Sandberg R. Computational correction of index switching in multiplexed sequencing libraries. Nat Methods. 2018; 15: 305-307. [20] Ros-Freixedes R. Battagin M. Johnsson M. Gorjanc G. Mileham A.J. Rounsley S.D. Hickey J.M. Impact of index hopping and bias towards the reference allele on accuracy of genotype calls from low-coverage sequencing. Genet Sel Evol. 2018; 50: 64. Consequently, index cross contamination can lead to an influx of unexpected reads from the incorrect libraries. This unavoidable phenomenon is referred to as index hopping (switching). Index hopping arises principally as a PCR artifact caused by interference from incomplete NGS libraries. Given a small population of free indexed adapters, incomplete elongation of fragments appended to partial indexes of one or both termini that are annealed to each other can result in amplifying incorrect hybrid fragments by PCR. Sanger sequencing reveals only the overall properties of a template mixture and, therefore, cannot detect these small cross-contamination events. So far, indexing both terminal adapters (dual-index system) allows detection of index hopping, on the criterion that the output reads carry unexpected index combinations. [21] Kircher M. Sawyer S. Meyer M.
Double indexing overcomes inaccuracies in multiplex sequencing on the Illumina platform. Nucleic Acids Res. 2012; 40: e3. The impact of index hopping remains largely unclear, but the rate of index hopping using Illumina's patterned flow cell technology is reported to be approximately 2% of the total output reads obtained. [19] Larsson A.J.M. Stanley G. Sinha R. Weissman I.L. Sandberg R. Computational correction of index switching in multiplexed sequencing libraries. Nat Methods. 2018; 15: 305-307. Of note, a significant increase in the capacity of NGS platforms accelerates the interest in sequencing multiple samples in parallel and often saturates the index variation used in each NGS run. If every detected index combination already exists in the original experimental design, index hopping can no longer be observed. As a consequence, the dual-index system becomes useless in highly multiplexed experiments. Thus, a comprehensive solution for quality assurance in GPS data is urgently needed. Herein, we designed a short DNA fragment that can be spiked into original whole blood samples to detect accidental sample tube or plate mix-ups or even a small degree of cross contamination in the obtained GPS data. This was basically conceived from sample assurance spike-in sequencing, whereby a uniquely barcoded DNA fragment is spiked into the original sample. [22] Quail M.A. Smith M. Jackson D. Leonard S. Skelly T. Swerdlow H.P. Gu Y. Ellis P. SASI-Seq: sample assurance spike-ins, and highly differentiating 384 barcoding for Illumina sequencing. BMC Genomics. 2014; 15: 110. The index sequence is obtained at the same time that the sample is sequenced, thereby allowing unambiguous identification of the sample by the index information. However, it is still unclear whether this fragment is compatible with targeted resequencing procedures.
To extend the application of sample assurance spike-in sequencing technology to GPS, 192 variations of unique index probes (15 bp) were designed. The 15-bp variable index sequence was combined with a 120-bp fixed sequence for assembly into a single fragment for recovery in a hybridization capture reaction. Next, to obtain sufficient GPS reads to detect low-level cross contamination, the amount of spike-in probe to add to whole blood samples was optimized. Finally, an analytical pipeline was designed to efficiently extract the index sequences corresponding to the spike-in probe from large NGS data sets, identify the sample source, and quantitatively detect the rate of cross contamination among the multiplexed samples. We hope this quality assurance method will help prevent clinical errors caused by human mishandling. Whole blood was collected in BD Vacutainer blood collection tubes containing K2EDTA (BD, Franklin Lakes, NJ). Human genomic DNA (gDNA) was isolated from whole blood using the Maxwell RCS Whole Blood DNA Kit (Promega, Madison, WI) and Maxwell RCS Instrument (Promega). Two single-donor samples of anonymous human gDNA and the gDNA from seven specimens registered in the 1000 Genomes Project (NA12878, NA24385, NA24149, NA24143, NA24631, NA24694, and NA24695) were purchased from the BioChain Institute (Newark, CA) and the Coriell Institute (Camden, NJ), respectively. [23] Abecasis G.R. Altshuler D. Auton A. Brooks L.D. Durbin R.M. Gibbs R.A. Hurles M.E. McVean G.A. 1000 Genomes Project Consortium. A map of human genome variation from population-scale sequencing. Nature. 2010; 467: 1061-1073. [24] Genomes Project C. Auton A. Brooks L.D. Durbin R.M. Garrison E.P. Kang H.M. Korbel J.O. Marchini J.L. McCarthy S. McVean G.A. Abecasis G.R. A global reference for human genetic variation. Nature. 2015; 526: 68-74.
Before library preparation, gDNA was purified by isopropanol precipitation and resuspended in 10 mmol/L Tris-HCl (pH 8.0). gDNA quality was evaluated using the BioSpec-nano (Shimadzu, Kyoto, Japan) with the following quality metrics: A260/280 = 1.8–2.0 and A260/230 > 1.6. Furthermore, RNA contamination in the gDNA sample was assessed using the QuantiFluor One dsDNA system (Promega) and Quantus Fluorometer (Promega). Primer3PLUS software version 2.4.2 (https://primer3plus.com, last accessed January 23, 2017) was used to design PCR primer pairs specific to the PhiX genome. Primers designed to target genomic positions 4055 and 4174 were appended to the M13 forward (phiX4055-M13f; 5′-GTAAAACGACGGCCAGTTGCTATTGAGGCTTGTGGCA-3′) and reverse complementary T7 terminal (phiX4174-T7tRC; 5′-CCGCTGAGCAATAACTAGCATACGCCCTGCATACGAAAAGA-3′) universal primer sequences, respectively. These primers, together with PrimeSTAR GXL Polymerase (Takara Bio, Shiga, Japan), were used to amplify a 120-bp fragment from the PhiX genome (Takara Bio). The fragment was joined to one of a series of synthesized oligonucleotides [5′-TAATACGACTCACTATAGG (15-nt index sequence) GTAAAACGACGGCCAGT-3′] (IDT, Coralville, IA), and individual full-length probes were amplified by PCR using the primer pair T7p (5′-TAATACGCTCACTATAGG-3′) and T7tRC (5′-CCGCTGAGCAATAACTAGCAT-3′). The amplified probes were independently purified using the DNA Clean & Concentrator kit (Zymo Research, Irvine, CA) and quantified using the KAPA SYBR FAST qPCR Kit (Kapa Biosystems, Wilmington, MA) with the T7p and T7tRC primers, on the Applied Biosystems 7500 real-time PCR system (Thermo Fisher Scientific, Waltham, MA). The probes are also commercially available through NIPPON Genetics Co, Ltd (Tokyo, Japan) or NIPPON Genetics Europe (Duren, Germany). First, variable concentrations of spike-in probes were mixed with gDNA.
The gDNA and the mixed spike-in probes were subjected to one of two fragmentation methods: KAPA Frag Enzyme (30 minutes at 37°C; Kapa Biosystems) or Covaris S2 [peak incident power = 175, duty factor = 10, cycles/burst = 200, duration = 360 seconds (150-bp protocol); peak incident power = 140, duty factor = 10, cycles/burst = 200, duration = 80 seconds (300-bp protocol); Covaris, Woburn, MA]. The fragmented gDNA was subjected to an NGS library preparation procedure using the KAPA HyperPlus Library Preparation Kit (Kapa Biosystems). Short library fragments were removed in the postligation [0.8× solid phase reversible immobilization (SPRI)] and post-PCR (1.0× SPRI) cleanup steps, as described in the manufacturer's instructions. The spike-in capture probe was synthesized as either a 120-mer biotinylated oligodeoxynucleotide (IDT), 5′-/biotin/TGCTATTGAGGCTTGTGGCATTTCTACTCTTTCTCAATCCCCAATGCTTGGCTTCCATAAGCAGATGGATAACCGCATCAAGCTCTTGGAAGAGATTCTGTCTTTTCGTATGCAGGGCGT-3′, or as a 120-mer biotinylated oligonucleotide (IDT), 5′-/biotin/UGCUAUUGAGGCUUGUGGCAUUUCUACUCUUUCUCAAUCCCCAAUGCUUGGCUUCCAUAAGCAGAUGGAUAACCGCAUCAAGCUCUUGGAAGAGAUUCUGUCUUUUCGUAUGCAGGGCGU-3′. T-based DNA oligonucleotide probes were used for DNA-based capture methods, whereas U-based RNA oligonucleotides were used for RNA-based capture methods. The spike-in probe was recovered using the xGen hybridization capture method, which was performed using the biotinylated DNA capture probe (400 fmol) and the xGen Hybridization and Wash Kit (IDT), according to the manufacturer's instructions. Alternatively, the spike-in probe was recovered using the SureSelect XT or SureSelect XT-HS method, which was performed using the biotinylated RNA capture probe (400 fmol) and the SureSelectXT Reagent or SureSelectXT HS Reagent kit, respectively (Agilent Technologies, Santa Clara, CA). In the conventional method, [25] Gnirke A. Melnikov A. Maguire J. Rogov P. LeProust E.M. Brockman W. Fennell T. Giannoukos G. Fisher S. Russ C.
Gabriel S. Jaffe D.B. Lander E.S. Nusbaum C. Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotechnol. 2009; 27: 182-189, the spike-in probes were recovered by the biotinylated DNA or RNA capture probe (400 fmol) using reagents conventionally used for liquid hybridization capture. In brief, 10 amol spike-in probe, 750 ng NGS library, 2.5 μg Cot1 DNA (Thermo Fisher Scientific), 2.5 μg salmon sperm DNA (Thermo Fisher Scientific), and synthesized blocking oligonucleotides [5′-CAAGCAGAAGACGGCATACGAGAT (8-nt index sequence) GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3′ and 5′-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3′] were combined, and the solution was concentrated to 7 μL. The spike-in probe and NGS library were denatured for 5 minutes at 95°C, incubated for 5 minutes at 65°C, and mixed with 6 μL probe mixture [400 fmol DNA/RNA capture probe and 20 U RNase inhibitor (if required)] and 13 μL 2 × hybridization buffer [10 × saline–sodium phosphate–EDTA (SSPE), 10 × Denhardt's Solution, 10 mmol/L EDTA (pH 8.0), and 0.2% SDS]. After 16 hours at 65°C, the hybridization mixture was mixed with 500 μg Dynabeads MyOne Streptavidin T1 (Thermo Fisher Scientific), which was suspended in 200 μL wash buffer [1 mol/L NaCl, 10 mmol/L Tris-HCl (pH 7.5), and 1 mmol/L EDTA]. After vortex mixing for 30 minutes at 25°C, the beads were washed with 0.1% SDS/1 × standard saline citrate, followed by three 10-minute washes at 65°C with 0.1% SDS/0.1 × standard saline citrate. The spike-in probes and libraries enriched on the beads were amplified by 16 cycles of PCR using the KAPA Library Amplification Kit (Kapa Biosystems). Only the enriched spike-in probes, not the other targeted genomic regions, were quantified using KAPA SYBR FAST qPCR Kits (Kapa Biosystems) with the T7p and T7tRC primers.
To evaluate enrichment of the targeted genomic regions, the primer sets for quantitative PCR (qPCR) analysis were designed by ExonPrimer software (https://ihg.helmholtz-muenchen.de/ihg/ExonPrimer.html, last accessed January 23, 2017) (Table 1). qPCR amplification was performed using the Applied Biosystems 7500 real-time PCR system (Thermo Fisher Scientific). The cycling conditions for qPCR were as follows: denaturation for 5 seconds at 95°C, followed by annealing and elongation for 30 seconds at 60°C for 35 cycles.
Table 1. The Primer Sequences Used for Assessing Target Enrichment
Region name | Forward primer sequence | Reverse primer sequence | PCR product size, bp
ACSF3 E7 | 5′-AGCCTCCCCTTCAGTGTTTC-3′ | 5′-GTGGATGGTGTAGGAGCAGG-3′ | 119
ACSF3 I7 | 5′-ATCTGGGAGTTTGAGGCTGC-3′ | 5′-CGAGCAGGACACACACTCTT-3′ | 82
ACSF3 E8 | 5′-GACCCTCCGTGTTTCGAG-3′ | 5′-CTCGGGGTCTCCTCCACTC-3′ | 119
ACSF3 I8 | 5′-GGCGCTGTTCTTATCTTGCG-3′ | 5′-AGCGAGAACTTGTCTGTGGG-3′ | 84
KLHL7 E6 | 5′-ACGATGAACCTAATCGCCAG-3′ | 5′-TCTTGAATAAGTGGTTCAGCTTG-3′ | 115
KLHL7 I6 | 5′-CAGGAATTTTTGGGCCAGGC-3′ | 5′-CTCTGTCACTTGGGCTGGAG-3′ | 144
KLHL7 I9 | 5′-AATCTCACTACGTGCAGGGC-3′ | 5′-ACGCTCATCCAGGTGAAGTG-3′ | 90
KLHL7 E10 | 5′-AGCCAGGAAGAATCATGGG-3′ | 5′-CAGCAAACTCATCAGGAAAGTG-3′ | 117
E, exon; I, intron.
The size distributions of the NGS libraries were analyzed with an MCE-202 MultiNA system (Shimadzu, Kyoto, Japan) and characterized with KAPA library quantification kits (Kapa Biosystems) and Applied Biosystems 7500 real-time PCR systems (Thermo Fisher Scientific). The libraries were applied to the NextSeq 500 Mid-Output flow cell (Illumina, San Diego, CA) and run in 2 × 75 bp paired-read mode. The index sequences of the spike-in probes were extracted using the cutadapt tool, version 1.7.1. [26] Chen S. Zhou Y. Chen Y. Gu J. fastp: An ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018; 34: i884-i890.
The 5′ and 3′ sequences flanking the index [ie, TAATACGACTCACTATAGG (T7p) and GTAAAACGACGGCCAGT (M13f)] were sequentially applied to the demultiplexed FASTQ files. To filter out improper index sequences with >1-base insertions or deletions, only sequences within the range of 14 to 16 bases were collected. After bases with a phred-scaled base quality < 20 were masked as N using the fastq_masker tool of the FASTX toolkit version 0.0.13 (http://hannonlab.cshl.edu/fastx_toolkit, last accessed April 9, 2019), the Levenshtein distances of the sequenced strings from the original 192 indexes were calculated using a general algorithm. [27] Buschmann T. Bystrykh L.V. Levenshtein error-correcting barcodes for multiplexed DNA sequencing. BMC Bioinformatics. 2013; 14: 272. Mathematically, for a sequenced string (a) and a given designed string (b) with lengths i and j, respectively, the Levenshtein distance D(i,j) was calculated as follows: $$D(i,j)=\max(i,j) \quad \text{if } \min(i,j)=0 \tag{1}$$ $$D(i,j)=\min\left[D(i-1,j)+1,\; D(i,j-1)+1,\; D(i-1,j-1)+\mathrm{cost}\right] \quad \text{if } \min(i,j)\neq 0 \tag{2}$$ where cost = 0 if $a_i = b_j$, and 1 otherwise. To remove improper index sequences, only sequences with a distance of less than three were retained. Finally, on the basis of the determined distances, each sequenced index was assigned to the nearest of the known designed indexes. In parallel with the analysis of the spike-in probe index sequences, variants were detected with an analytical pipeline, as previously reported. [28] Fujiki R. Ikeda M. Yoshida A. Akiko M. Yao Y. Nishimura M. Matsushita K. Ichikawa T. Tanaka T. Morisaki H. Morisaki T. Ohara O. Assessing the accuracy of variant detection in cost-effective gene panel testing by next-generation sequencing. J Mol Diagn. 2018; 20: 572-582. In brief, reads were aligned to the reference human genome (hg38) using the Burrows-Wheeler Aligner MEM algorithm (BWA-MEM version 0.7.5a) using default parameters. [29] Li H.
Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009; 25: 1754-1760. The resultant sequence alignment/MAP (SAM) files were converted to a binary alignment/MAP and sorted by genomic locations using SAMtools version 0.1.18. [30] Li H. Handsaker B. Wysoker A. Fennell T. Ruan J. Homer N. Marth G. Abecasis G. Durbin R. 1000 Genome Project Data Processing Subgroup: The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009; 25: 2078-2079. Duplicated and misaligned reads were filtered out using Picard-1.84's MarkDuplicates and FilterSamReads on the basis of the soft-clipped Compact Idiosyncratic Gapped Alignment Report (CIGAR), respectively. Afterward, quality scores were recalibrated using the Genome Analysis Toolkit-3.6.0 using the BaseRecalibrator and PrintReads commands under the default parameters. [31] Van der Auwera G.A. Carneiro M.O. Hartl C. Poplin R. Del Angel G. Levy-Moonshine A. Jordan T. Shakir K. Roazen D. Thibault J. Banks E. Garimella K.V. Altshuler D. Gabriel S. DePristo M.A. From FastQ data to high confidence variant calls: the Genome Analysis Toolkit best practices pipeline. Curr Protoc Bioinformatics. 2013; 43 (1.10.1-33). Finally, variants were detected from filtered binary alignment/MAP files using VarScan2 version 2.3.3 and Genome Analysis Toolkit-3.6.0 HaplotypeCaller. [31] [32] Koboldt D.C. Larson D.E. Wilson R.K.
Using VarScan 2 for germline variant calling and somatic mutation detection. Curr Protoc Bioinformatics. 2013; 44 (15.4.1-17). To expand the application of sample assurance spike-in sequencing to targeted resequencing, a short DNA fragment composed of two functional segments was designed: a unique index sequence and a sequence enabling recovery of the probe by hybridization capture (Figure 1A). [22] Quail M.A. Smith M. Jackson D. Leonard S. Skelly T. Swerdlow H.P. Gu Y. Ellis P. SASI-Seq: sample assurance spike-ins, and highly differentiating 384 barcoding for Illumina sequencing. BMC Genomics. 2014; 15: 110. The segment hybridizing to the capture probe was designed first. As the phiX174 DNA virus has a well-defined and unbiased genome sequence, it is commonly used as a quality control in NGS. [33] Krueger F. Andrews S.R. Osborne C.S. Large scale loss of data in low-diversity Illumina sequencing libraries can be recovered by deferred cluster calling. PLoS One. 2011; 6: e16607. Therefore, 120-bp distinct segments were extracted from the PhiX genome. The Primer3 program was applied against the PhiX reference genome sequence from GenBank (https://www.ncbi.nlm.nih.gov/genbank; accession number NC_001422.1) to design primer pairs that amplify fragments of the appropriate lengths. Among the candidates, a primer pair targeting positions 4055 and 4174 within the PhiX genome, amplifying a 120-bp fragment with a GC content of 45.8%, was selected. Next, to handle the highly multiplexed NGS reads, preparation of a sufficient number of unique indexes was required to secure the address of any sequence within the pool of reads and trace it back to the original sample. The maximum number of multiplexing designs generally depends on the number of variations among the NGS adapters used.
Because most laboratories prepare 96 libraries in a 96-well plate, two sets of 96 indexes would be sufficient for diagnostic purposes. Furthermore, considering the inaccuracies within sequenced reads obtained from any NGS platform, a few minimum requirements were imposed on these designs. The sequences should be substantially different from one another to prevent mixing up sequenced reads, even if a few sequencing errors exist. Furthermore, in terms of GC content and the presence of homopolymers or palindromes, moderate flexibility of the index sequences is required for enzymatic amplification and high-fidelity sequencing. Thus, 192 (96 × 2 series) coordinates were rando" @default.
- W2969569897 created "2019-08-29" @default.
- W2969569897 creator A5056350708 @default.
- W2969569897 creator A5066057500 @default.
- W2969569897 creator A5070984697 @default.
- W2969569897 date "2019-11-01" @default.
- W2969569897 modified "2023-09-27" @default.
- W2969569897 title "Short DNA Probes Developed for Sample Tracking and Quality Assurance in Gene Panel Testing" @default.
- W2969569897 cites W1590169480 @default.
- W2969569897 cites W1591228931 @default.
- W2969569897 cites W1919257374 @default.
- W2969569897 cites W1986474467 @default.
- W2969569897 cites W2027589602 @default.
- W2969569897 cites W2029493835 @default.
- W2969569897 cites W2057152418 @default.
- W2969569897 cites W2061596350 @default.
- W2969569897 cites W2074735241 @default.
- W2969569897 cites W2095901358 @default.
- W2969569897 cites W2103441770 @default.
- W2969569897 cites W2104549677 @default.
- W2969569897 cites W2108234281 @default.
- W2969569897 cites W2118526609 @default.
- W2969569897 cites W2119157396 @default.
- W2969569897 cites W2121531763 @default.
- W2969569897 cites W2133980123 @default.
- W2969569897 cites W2141599418 @default.
- W2969569897 cites W2163553106 @default.
- W2969569897 cites W2169766923 @default.
- W2969569897 cites W2171777347 @default.
- W2969569897 cites W2554756829 @default.
- W2969569897 cites W2625292567 @default.
- W2969569897 cites W2735633330 @default.
- W2969569897 cites W2741154205 @default.
- W2969569897 cites W2802329058 @default.
- W2969569897 cites W2804126079 @default.
- W2969569897 cites W2809827370 @default.
- W2969569897 cites W2885979843 @default.
- W2969569897 cites W2897908095 @default.
- W2969569897 cites W2950425965 @default.
- W2969569897 cites W2951912016 @default.
- W2969569897 cites W2953354146 @default.
- W2969569897 cites W4239818044 @default.
- W2969569897 doi "https://doi.org/10.1016/j.jmoldx.2019.07.003" @default.
- W2969569897 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/31445212" @default.
- W2969569897 hasPublicationYear "2019" @default.
- W2969569897 type Work @default.
- W2969569897 sameAs 2969569897 @default.
- W2969569897 citedByCount "3" @default.
- W2969569897 countsByYear W29695698972021 @default.
- W2969569897 countsByYear W29695698972022 @default.
- W2969569897 countsByYear W29695698972023 @default.
- W2969569897 crossrefType "journal-article" @default.
- W2969569897 hasAuthorship W2969569897A5056350708 @default.
- W2969569897 hasAuthorship W2969569897A5066057500 @default.
- W2969569897 hasAuthorship W2969569897A5070984697 @default.
- W2969569897 hasBestOaLocation W29695698971 @default.
- W2969569897 hasConcept C104317684 @default.
- W2969569897 hasConcept C106436119 @default.
- W2969569897 hasConcept C142724271 @default.
- W2969569897 hasConcept C15744967 @default.
- W2969569897 hasConcept C185592680 @default.
- W2969569897 hasConcept C19417346 @default.
- W2969569897 hasConcept C198531522 @default.
- W2969569897 hasConcept C2775936607 @default.
- W2969569897 hasConcept C2778618615 @default.
- W2969569897 hasConcept C2994273702 @default.
- W2969569897 hasConcept C41008148 @default.
- W2969569897 hasConcept C43617362 @default.
- W2969569897 hasConcept C54355233 @default.
- W2969569897 hasConcept C552990157 @default.
- W2969569897 hasConcept C70721500 @default.
- W2969569897 hasConcept C71924100 @default.
- W2969569897 hasConcept C86803240 @default.
- W2969569897 hasConceptScore W2969569897C104317684 @default.
- W2969569897 hasConceptScore W2969569897C106436119 @default.
- W2969569897 hasConceptScore W2969569897C142724271 @default.
- W2969569897 hasConceptScore W2969569897C15744967 @default.
- W2969569897 hasConceptScore W2969569897C185592680 @default.
- W2969569897 hasConceptScore W2969569897C19417346 @default.
- W2969569897 hasConceptScore W2969569897C198531522 @default.
- W2969569897 hasConceptScore W2969569897C2775936607 @default.
- W2969569897 hasConceptScore W2969569897C2778618615 @default.
- W2969569897 hasConceptScore W2969569897C2994273702 @default.
- W2969569897 hasConceptScore W2969569897C41008148 @default.
- W2969569897 hasConceptScore W2969569897C43617362 @default.
- W2969569897 hasConceptScore W2969569897C54355233 @default.
- W2969569897 hasConceptScore W2969569897C552990157 @default.
- W2969569897 hasConceptScore W2969569897C70721500 @default.
- W2969569897 hasConceptScore W2969569897C71924100 @default.
- W2969569897 hasConceptScore W2969569897C86803240 @default.
- W2969569897 hasIssue "6" @default.
- W2969569897 hasLocation W29695698971 @default.
- W2969569897 hasLocation W29695698972 @default.
- W2969569897 hasOpenAccess W2969569897 @default.
- W2969569897 hasPrimaryLocation W29695698971 @default.
- W2969569897 hasRelatedWork W1991523530 @default.
- W2969569897 hasRelatedWork W2002128513 @default.
- W2969569897 hasRelatedWork W2009966535 @default.
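
The index-assignment step described in the abstract text of W2969569897 (keep only extracted index strings of 14 to 16 bases, compute the Levenshtein distance to each designed index, keep distances below three, and assign the read to the nearest designed index) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the two example 15-nt indexes, and the tie-breaking rule (first designed index with the smallest distance wins) are hypothetical, and the quality masking of bases to N is not modeled.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b using the standard row-by-row recurrence."""
    prev = list(range(len(b) + 1))          # D(0, j) = j
    for i, char_a in enumerate(a, start=1):
        cur = [i]                           # D(i, 0) = i
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            cur.append(min(prev[j] + 1,         # deletion
                           cur[j - 1] + 1,      # insertion
                           prev[j - 1] + cost)) # match or substitution
        prev = cur
    return prev[-1]


def assign_index(read_index, designed, max_distance=2, min_len=14, max_len=16):
    """Return (nearest designed index, distance), or None if the read is rejected.

    Assumption: ties are resolved by taking the first designed index with the
    smallest distance; the source does not say how ties are handled.
    """
    if not min_len <= len(read_index) <= max_len:
        return None                         # length filter: 14 to 16 bases
    best = min(designed, key=lambda d: levenshtein(read_index, d))
    dist = levenshtein(read_index, best)
    return (best, dist) if dist <= max_distance else None


# Hypothetical 15-nt designed indexes, for illustration only.
DESIGNED = ["ACGTACGTACGTACG", "TTGCAATGCCGATAG"]

if __name__ == "__main__":
    print(assign_index("ACGTACGTACGTACC", DESIGNED))  # one substitution
    print(assign_index("GGGGGGGGGGGGGGG", DESIGNED))  # too far from any index
```

Counting the assigned indexes per demultiplexed sample would then give the expected index (sample identity) and the fraction of reads carrying any other index (a cross-contamination estimate); that counting step is left out here.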