Matches in SemOpenAlex for { <https://semopenalex.org/work/W2073569341> ?p ?o ?g. }
- W2073569341 endingPage "520" @default.
- W2073569341 startingPage "509" @default.
- W2073569341 abstract "Rare-variant association studies in common, complex diseases are customarily conducted under an additive risk model in both single-variant and burden testing. Here, we describe a method to improve detection of rare recessive variants in complex diseases termed RAFT (recessive-allele-frequency-based test). We found that RAFT outperforms existing approaches when the variant influences disease risk in a recessive manner on simulated data. We then applied our method to 1,791 Finnish individuals with type 2 diabetes (T2D) and 2,657 matched control subjects. In BBS10, we discovered a rare variant (c.1189A>G [p.Ile397Val]; rs202042386) that confers risk of T2D in a recessive state (p = 1.38 × 10⁻⁶) and would be missed by conventional methods. Testing of this variant in an established in vivo zebrafish model confirmed the variant to be pathogenic. Taken together, these data suggest that RAFT can effectively reveal rare recessive contributions to complex diseases overlooked by conventional association tests.
Genome-wide association studies (GWASs) have identified numerous additive genomic variants associated with complex disorders (ref. 1: Welter D. MacArthur J. Morales J. Burdett T. Hall P. Junkins H. Klemm A. Flicek P. Manolio T. Hindorff L. Parkinson H. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014; 42: D1001-D1006), which appear to impact liability to disease (or affect a normally distributed trait) linearly with an effect size of a when heterozygous and 2a when homozygous. Most of these variants are common (≥5% frequency) and have only small effect sizes (odds ratio [OR] < 1.2) on complex disease classifications. However, such variants do not explain the full heritability of disease. As rare variants have been comparatively less well studied, attention has shifted to exome- or genome-sequence-based approaches to identifying additional risk factors. Another less-well-explored hypothesis is that there might be variants that influence disease susceptibility in a nonadditive fashion, for instance, if the variants influence disease risk recessively, that is, only when the minor allele is found in a homozygous state, or in trans with another minor allele to form compound heterozygotes. However, most widely used tests for disease association, such as allele-based regression tests or burden tests for aggregation of rare alleles, assume an additive model. Such approaches are well powered to detect common alleles even if they influence disease risk in a nonadditive fashion. Common variants in genes such as FUT2 (MIM 182100) and TYRP1 (MIM 115501) have also been demonstrated to influence complex diseases and traits in a recessive manner (ref. 2: Lindesmith L. Moe C. Marionneau S. Ruvoen N. Jiang X. Lindblad L. Stewart P. LePendu J. Baric R. Human susceptibility and resistance to Norwalk virus infection. Nat. Med. 2003; 9: 548-553; ref. 3: Kenny E.E. Timpson N.J. Sikora M. Yee M.C. Moreno-Estrada A. Eng C. Huntsman S. Burchard E.G. Stoneking M. Bustamante C.D. Myles S. Melanesian blond hair is caused by an amino acid change in TYRP1. Science. 2012; 336: 554); these loci were detected by the typical additive approaches because there are many homozygotes observed for these common variants and therefore strong signal exists even under the additive model (ref. 4: Vukcevic D. Hechter E. Spencer C. Donnelly P. Disease model distortion in association studies. Genet. Epidemiol. 2011 (published online March 17, 2011). https://doi.org/10.1002/gepi.20576). In contrast, the power of the additive model to detect recessive alleles diminishes greatly at lower frequencies, since the numbers of homozygotes observed are far fewer. Consequently, if there are rare variants that confer significant risk of the phenotype in a recessive manner, conventional tests are underpowered to detect such variants in complex diseases. However, extensive runs of homozygosity have been associated with complex diseases and traits such as height (ref. 5: McQuillan R. Eklund N. Pirastu N. Kuningas M. McEvoy B.P. Esko T. Corre T. Davies G. Kaakinen M. Lyytikäinen L.P. et al.; ROHgen Consortium. Evidence of inbreeding depression on human height. PLoS Genet. 2012; 8: e1002655) (MIM 606255), schizophrenia (ref. 6: Keller M.C. Simonson M.A. Ripke S. Neale B.M. Gejman P.V. Howrigan D.P. Lee S.H. Lencz T. Levinson D.F. Sullivan P.F.; Schizophrenia Psychiatric Genome-Wide Association Study Consortium. Runs of homozygosity implicate autozygosity as a schizophrenia risk factor. PLoS Genet. 2012; 8: e1002656) (MIM 181500), and autism spectrum disorders (ref. 7: Gamsiz E.D. Viscidi E.W. Frederick A.M. Nagpal S. Sanders S.J. Murtha M.T. Schmidt M. Triche E.W. Geschwind D.H. State M.W. et al.; Simons Simplex Collection Genetics Consortium. Intellectual disability is associated with increased runs of homozygosity in simplex autism. Am. J. Hum. Genet. 2013; 93: 103-109; ref. 8: Chahrour M.H. Yu T.W. Lim E.T. Ataman B. Coulter M.E. Hill R.S. Stevens C.R. Schubert C.R. Greenberg M.E. Gabriel S.B. Walsh C.A.; ARRA Autism Sequencing Collaboration. Whole-exome sequencing and homozygosity analysis implicate depolarization-regulated neuronal genes in autism. PLoS Genet. 2012; 8: e1002635) (MIM 209850), suggesting that rare recessive variants in these regions might be driving disease risk. Specifically for autism spectrum disorders, further studies have shown that there is a strong signal from recessively acting rare variants involved in the disease etiology (ref. 9: Yu T.W. Chahrour M.H. Coulter M.E. Jiralerspong S. Okamura-Ikeda K. Ataman B. Schmitz-Abe K. Harmin D.A. Adli M. Malik A.N. et al. Using whole-exome sequencing to identify inherited causes of autism. Neuron. 2013; 77: 259-273; ref. 10: Lim E.T. Raychaudhuri S. Sanders S.J. Stevens C. Sabo A. MacArthur D.G. Neale B.M. Kirby A. Ruderfer D.M. Fromer M. et al.; NHLBI Exome Sequencing Project. Rare complete knockouts in humans: population distribution and significant role in autism spectrum disorders. Neuron. 2013; 77: 235-242). Moreover, since a high proportion of rare Mendelian disease alleles confer risk in a strictly dominant or recessive fashion, testing for nonadditive effects of rare variants in common, complex diseases might well yield fruitful associations missed by conventional association tests.
We describe a statistical methodology (termed RAFT for recessive allele frequency-based test) designed specifically for detecting variants with recessive effects on a dichotomous phenotype. We demonstrate that RAFT has considerably more power to detect disease-associated rare variants than conventional approaches. In applying RAFT to an exome chip data set comprising 4,448 individuals, 1,791 of whom are case subjects with type 2 diabetes (T2D [MIM 125853]), we detected a rare variant in Bardet-Biedl syndrome 10 (BBS10 [MIM 610148]) that confers significant risk of T2D in a recessive manner (p = 1.38 × 10⁻⁶). Functional testing of this allele in vivo confirmed its effect on protein function, providing further support for the genetic association. Thus, increasing power for specific modes of inheritance offers the potential to discover novel associations in common complex diseases (as it clearly has in rare diseases), particularly for rare recessive variants for which conventional statistical methods are bereft of power.
The DNA samples were obtained from the Botnia and Direva Studies. For genotyping, the samples were sent to the Broad Institute and prepared for genetic analysis with two quality-control measures. First, DNA quantity was measured by Picogreen, and then all samples with sufficient total DNA and minimum concentrations for downstream activities were genotyped for a set of 24 SNPs with the Sequenom iPLEX Assay. These 24 validated markers include 1 gender assay and 23 SNPs located across the autosomes. The genotypes for these SNPs were used as a quality filter to advance samples, as well as a technical fingerprint validation (when applicable) for array genotypes. This study was approved by the MGH/Broad/HMS institutional review board, and appropriate informed consent was obtained from human subjects. All genotyping was performed at the Broad Institute Genetic Analysis Platform.
DNA samples were placed on 96-well plates and genotyped with the Illumina HumanExome v.1.1 SNP array. Genotypes were assigned with GenomeStudio v.2010.3 and the calling algorithm/genotyping module v.1.8.4 with the custom cluster file HumanExomev1_1_CEPH_A.egt. Subsequent processing of genotype calling was done by zCall (ref. 11: Goldstein J.I. Crenshaw A. Carey J. Grant G.B. Maguire J. Fromer M. O’Dushlaine C. Moran J.L. Chambert K. Stevens C. et al.; Swedish Schizophrenia Consortium; ARRA Autism Sequencing Consortium. zCall: a rare variant caller for array-based genotyping: genetics and population analysis. Bioinformatics. 2012; 28: 2543-2545). Samples with two or more discordant fingerprint genotypes and/or call rates below 97% were excluded from data analysis. Individuals with call rates below 99% were excluded from analysis, as were SNVs with call rates below 80% or with extreme deviations (p < 1 × 10⁻⁸) from Hardy-Weinberg equilibrium (HWE). The accuracy of genotypes for the BBS10 c.1189A>G variant was assessed via visual inspection of intensity plots (Figure S1, available online). Principal-component analysis was performed on a set of linkage-disequilibrium-independent SNPs with PLINK (ref. 12: Purcell S. Neale B. Todd-Brown K. Thomas L. Ferreira M.A. Bender D. Maller J. Sklar P. de Bakker P.I. Daly M.J. Sham P.C. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007; 81: 559-575) and EIGENSTRAT (ref. 13: Price A.L. Patterson N.J. Plenge R.M. Weinblatt M.E. Shadick N.A. Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 2006; 38: 904-909). In addition, to reduce heterogeneity between the case and control subjects, we performed one-to-one matching for each case-control pair by using the first ten principal components and selecting the closest matched control for each case. Case-control pairs with a sum of the absolute differences in the first ten principal components of >0.06 were removed from the analyses. In addition, cases with excessive relatedness (identity by descent > 0.2) were removed from the analyses. In addition, all variants with heterozygous counts of ≤3 (allele frequency of ∼0.025%) in case and control subjects were removed to reduce the number of variants that were miscalled as rare homozygotes instead of heterozygotes. To further reduce genotyping errors, variants with HWE p values of ≤1 × 10⁻³ in the controls were removed. The variants were then annotated with the Variant Effect Predictor.
The key factor in any homozygosity analysis is the probability of minor allele (a) being homozygous P(aa), which in theory is simply $P(a)^2$ and is estimated directly from the observed allele counts. Two scenarios exist when this expectation may not be met and can be readily accommodated up front in this analysis framework. Substructure or consanguinity in a population can cause global departure from HWE, resulting in systematic genome-wide excess homozygosity. In such cases, we can fit a regression to all variants in the data set to estimate a substructure or homozygosity by descent factor F by using the standard model $P(aa_{\text{corrected}}) = F\,P(a) + (1 - F)\,P(a)^2$ and can then utilize this corrected value as P(aa) below. This might be important for analysis of populations with high inbreeding. Additionally, in actual genotyping and sequencing data, we encounter occasions where local departures from HWE in control subjects (as well as case subjects) are observed as a result of hemizygous deletions or systematic genotyping errors. 
While such local genotyping artifacts can be detected and avoided in many ways, in cases where the observed rate of homozygous genotypes in control subjects exceeds the expected corrected P(aa), we can conservatively set the P(aa) used below to the higher observed rate in control subjects, thereby ensuring that the probability of homozygosity that the test statistic relies on is insulated from both global and local sources of inflation. Under the alternative hypothesis described below, the maximum-likelihood estimate of P(aa) is altered slightly (since excess homozygosity in cases is presumed to arise from case ascertainment), but this difference is generally negligible and not tested as a parameter in the association test below.
Under the alternative hypothesis of recessive association, the probability of the minor allele being homozygous in a case, P(aa | case, γ), given a genotypic relative risk to homozygotes (γ), is
$P(aa \mid \text{case}, \gamma) = \frac{\gamma P(aa)}{1 - P(aa) + \gamma P(aa)}$.
The probability of not observing a homozygote for the minor allele in a case subject is
$P(\overline{aa} \mid \text{case}, \gamma) = 1 - P(aa \mid \text{case}, \gamma)$.
The alternative calculation for homozygotes for the minor allele in control subjects, P(aa | control, γ), is similar. However, for selected control subjects, the formulation includes the estimated prevalence of the disease (ϕ), which we have fixed as a constant to 1 in 100,000. For convenience, we use the variable x to represent the prevalence of disease in control individuals, where $x = \phi / (1 - P(aa) + \gamma P(aa))$, which is the prevalence under the null. As such, $P(aa \mid \text{control}, \gamma)$ and $P(\overline{aa} \mid \text{control}, \gamma)$ can be calculated by
$P(aa \mid \text{control}, \gamma) = \frac{(1 - \gamma x)\,P(aa)}{1 - \phi}$
and
$P(\overline{aa} \mid \text{control}, \gamma) = 1 - P(aa \mid \text{control}, \gamma)$.
The resulting likelihood of observing $n_{\text{case}}^{aa}$ alternate homozygotes in the case subjects, $n_{\text{case}}^{\overline{aa}}$ heterozygotes and reference homozygotes in the case subjects, $n_{\text{control}}^{aa}$ alternate homozygotes in the control subjects, and $n_{\text{control}}^{\overline{aa}}$ heterozygotes and reference homozygotes in the control subjects is proportional to
$L(n_{\text{case}}^{aa}, n_{\text{case}}^{\overline{aa}}, n_{\text{control}}^{aa}, n_{\text{control}}^{\overline{aa}}) \propto P(aa \mid \text{case}, \gamma)^{n_{\text{case}}^{aa}}\,[P(\overline{aa} \mid \text{case}, \gamma)]^{n_{\text{case}} - n_{\text{case}}^{aa}} \times P(aa \mid \text{control}, \gamma)^{n_{\text{control}}^{aa}}\,[P(\overline{aa} \mid \text{control}, \gamma)]^{n_{\text{control}} - n_{\text{control}}^{aa}}$
or, equivalently,
$L(n_{\text{case}}^{aa}, n_{\text{case}}^{\overline{aa}}, n_{\text{control}}^{aa}, n_{\text{control}}^{\overline{aa}}) \propto P(aa \mid \text{case}, \gamma)^{n_{\text{case}}^{aa}}\,[1 - P(aa \mid \text{case}, \gamma)]^{n_{\text{case}} - n_{\text{case}}^{aa}} \times P(aa \mid \text{control}, \gamma)^{n_{\text{control}}^{aa}}\,[1 - P(aa \mid \text{control}, \gamma)]^{n_{\text{control}} - n_{\text{control}}^{aa}}$,
where $n_{\text{case}}$ is the total number of case subjects and $n_{\text{control}}$ is the total number of control subjects. We next performed an expectation maximization (EM) step to maximize the log likelihood ratios between the alternate model $L_1$ and the null model $L_0$, to obtain the following RAFT statistic:
$\text{RAFT} = \log_{10}\left[\frac{L_1(n_{\text{case}}^{aa}, n_{\text{case}}^{\overline{aa}}, n_{\text{control}}^{aa}, n_{\text{control}}^{\overline{aa}})}{L_0(n_{\text{case}}^{aa}, n_{\text{case}}^{\overline{aa}}, n_{\text{control}}^{aa}, n_{\text{control}}^{\overline{aa}})}\right]$
or
$\text{RAFT} = \log_{10}\left[\frac{P(aa \mid \text{case}, \gamma = \gamma_{\text{case}})^{n_{\text{case}}^{aa}}\,[1 - P(aa \mid \text{case}, \gamma = \gamma_{\text{case}})]^{n_{\text{case}} - n_{\text{case}}^{aa}}}{P(aa \mid \text{case}, \gamma = 1)^{n_{\text{case}}^{aa}}\,[1 - P(aa \mid \text{case}, \gamma = 1)]^{n_{\text{case}} - n_{\text{case}}^{aa}}} \times \frac{P(aa \mid \text{control}, \gamma = \gamma_{\text{case}})^{n_{\text{control}}^{aa}}\,[1 - P(aa \mid \text{control}, \gamma = \gamma_{\text{case}})]^{n_{\text{control}} - n_{\text{control}}^{aa}}}{P(aa \mid \text{control}, \gamma = 1)^{n_{\text{control}}^{aa}}\,[1 - P(aa \mid \text{control}, \gamma = 1)]^{n_{\text{control}} - n_{\text{control}}^{aa}}}\right]$,
where $\gamma_{\text{case}}$ is equivalent to the OR of the alternate homozygotes for the variant. To calculate a p value, we can approximate the distribution of twice the natural-log likelihood ratio (the same ratio as in the RAFT statistic) by a chi-square distribution with 1 degree of freedom:
$2\ln\left[\frac{L_1(n_{\text{case}}^{aa}, n_{\text{case}}^{\overline{aa}}, n_{\text{control}}^{aa}, n_{\text{control}}^{\overline{aa}})}{L_0(n_{\text{case}}^{aa}, n_{\text{case}}^{\overline{aa}}, n_{\text{control}}^{aa}, n_{\text{control}}^{\overline{aa}})}\right] \sim \chi^2_{df=1}$.
We used two different populations, Finns and non-Finnish Europeans (NFEs), to simulate a scenario where the samples used in the association are not homogeneously from a single population even if case and control subjects are well matched. 
Using whole-exome sequencing data available for Finns and NFEs (ref. 14: Lim E.T. Würtz P. Havulinna A.S. Palta P. Tukiainen T. Rehnström K. Esko T. Mägi R. Inouye M. Lappalainen T. et al.; Sequencing Initiative Suomi (SISu) Project. Distribution and medical impact of loss-of-function variants in the Finnish founder population. PLoS Genet. 2014; 10: e1004494), we randomly sampled genotypes from 2,500 Finns and 2,500 NFEs and assigned them as case subjects. Similarly, we randomly sampled genotypes for another 2,500 Finns and 2,500 NFEs and assigned them as control subjects. For each variant, we calculated the number of individuals that were heterozygous or homozygous for the minor allele in both case and control subjects. To test the RAFT statistic in a simulated data set without population substructure, we simulated genotypes of 5,000 Finns as case subjects and 5,000 Finns as control subjects. These simulations were repeated 100 times to obtain the 95% confidence interval for the distribution of p values calculated with the RAFT statistic.
Translation blocker (TB) morpholinos against zebrafish bbs10 were synthesized by Gene Tools (5′-CGTTAAACCTCTTCTGTGAACCAGC-3′). Human BBS10 mRNA was in vitro transcribed with mMESSAGE mMACHINE SP6 Kit (Ambion). A volume of 0.5 nl mixture of 5 ng bbs10 TB and/or 75 pg mRNA was microinjected into the yolk of embryos at the 1- to 8-cell stage. At the 9- to 10-somite stage, embryos were analyzed for convergent extension defects according to previously established phenotypic criteria (ref. 15: Stoetzel C. Laurier V. Davis E.E. Muller J. Rix S. Badano J.L. Leitch C.C. Salem N. Chouery E. Corbani S. et al. BBS10 encodes a vertebrate-specific chaperonin-like protein and is a major BBS locus. Nat. Genet. 2006; 38: 521-524; ref. 16: Gerdes J.M. Liu Y. Zaghloul N.A. Leitch C.C. Lawson S.S. Kato M. Beachy P.A. Beales P.L. DeMartino G.N. Fisher S. et al. Disruption of the basal body compromises proteasomal function and perturbs intracellular Wnt response. Nat. Genet. 2007; 39: 1350-1360; ref. 17: Zaghloul N.A. Liu Y. Gerdes J.M. Gascue C. Oh E.C. Leitch C.C. Bromberg Y. Binkley J. Leibel R.L. Sidow A. et al. Functional analyses of variants reveal a significant role for dominant negative and common alleles in oligogenic Bardet-Biedl syndrome. Proc. Natl. Acad. Sci. USA. 2010; 107: 10602-10607). At 1 day postfertilization (dpf), zebrafish embryos were dechorionated and transferred into egg water (recipe in Westerfield [ref. 18: Westerfield M. THE ZEBRAFISH BOOK: A Guide for the Laboratory Use of Zebrafish (Danio rerio). Fifth Edition. University of Oregon Press, Eugene, 2007]) with 1% D-glucose. At 4 dpf, total RNA was extracted from zebrafish embryos with Trizol (Invitrogen) and then reverse transcribed into cDNA with SuperScript III (Invitrogen), to provide template for real-time qPCR. The expression levels of pdk2 and runx1, as well as gapdh for internal control, were examined with the following primers: dr-pdk2-qF, 5′-CGAATTAGCCAATAAACCAACAAA-3′; dr-pdk2-qR, 5′-CACACTTCACCTGCATTTCCA-3′; dr-runx1-qF, 5′-CGTCTTCACAAACCCTCCTCAA-3′; dr-runx1-qR, 5′-GCTTTACTGCTTCATCCGGCT-3′; dr-gapdh-qF, 5′-TTGTAAGCAATGCCTCCTGC-3′; dr-gapdh-qR, 5′-CTGTGTTGCTGTGATGGCAT-3′.
Instead of directly comparing homozygous counts in case and control subjects, the RAFT statistic evaluates the likelihood of observing the number of homozygotes in the cases (Ncases) compared to the expected number of homozygotes given the allele frequency of the variant and normalizes this by the same statistic for observing Ncontrols, the number of homozygotes in the controls compared to the expected number of homozygotes. 
For instance, if there is a variant (variant A) with 0.5% allele frequency, we would expect to observe 0.05 homozygotes in 2,000 individuals (0.005 × 0.005 × 2,000) for this variant. Similarly, if there is another variant (variant B) with 5% allele frequency, we would expect to observe 5 homozygotes in 2,000 individuals (0.05 × 0.05 × 2,000). However, if we observed five homozygotes who are case subjects and zero homozygotes who are control subjects for both variants A and B, the RAFT statistic will assign a higher LOD score to variant A than to variant B because it is more unusual to observe five homozygotes in the case group for a 0.5% variant than for a 5% variant. The normalization of the log likelihood ratio with the observation in the control subjects brings in the case-control comparison and, more importantly, ensures that the results are adjusted for regions where there is excessive homozygosity in the control subjects (such as hemizygous copy-number polymorphic regions, regions with genotyping errors or arising from unusual substructure) that might not be truly associated with disease risk. A conventional test such as Fisher’s exact test will assign the same probabilities for both variants based on the observed numbers of homozygotes. However, the RAFT statistic essentially compares the observed five homozygotes in the case subjects with the expected number of homozygotes (0.05 for variant A), resulting in increased power. To compare RAFT to existing approaches, we performed simulations to evaluate the power for detecting variants with allele frequency ranging from 0.1% to 20% that confer a modest risk (OR = 10) of disease under a recessive mode of inheritance (Figure 1). 
We observed that there might be greater than 80% power to detect common recessive variants (allele frequency > 10%) by using the standard additive tests (chi-square and transmission disequilibrium test [TDT]) and that RAFT or a conventional recessive test (Fisher’s exact test on the homozygous counts [Hom-FET]) did not confer much advantage over these additive tests (Figure 2). In contrast, RAFT provides significantly more power to detect lower-frequency recessive variants (allele frequency ≤ 5%) than do all the other tests.
Figure 2. Power Calculations for Various Tests in Detecting Recessive Variants with OR = 10 and Disease Prevalence of 1% in 5,000 Case Subjects and 5,000 Control Subjects. (A) Power calculations for detecting recessive variants across different allele frequencies by using two allelic tests (chi-square and TDT), as well as two recessive tests (Fisher’s exact test on the homozygotes [Hom-FET] and RAFT). (B) Graphical representation for the power calculations for chi-square, TDT, Hom-FET, and RAFT across various allele-frequency ranges from 0.1% to 20% with the same parameters. (C) Power calculations for the same scenario but a variant with 2% allele frequency across various recessive ORs from 1.2 to 2,000.
We applied the RAFT statistic on a previously published instance where 2 out of 17 independent families affected by Bardet-Biedl syndrome (BBS [MIM 209900]) were reported to harbor the same homozygous c.280delT frameshift in MKKS (ref. 19: Katsanis N. Beales P.L. Woods M.O. Lewis R.A. Green J.S. Parfrey P.S. Ansley S.J. Davidson W.S. Lupski J.R. Mutations in MKKS cause obesity, retinal dystrophy and renal malformations associated with Bardet-Biedl syndrome. Nat. Genet. 2000; 26: 67-70) (MIM 604896). The c.280delT frameshift was shown to be pathogenic and causal for BBS in these two families. However, the p value for this observation calculated by a conventional recessive test such as Hom-FET is 1.46 × 10⁻⁵, which is shy of the exome-wide significance threshold of 2.5 × 10⁻⁶ after correcting for 20,000 genes. Instead, by incorporating the allele frequency of the variant (the c.280delT variant was found to be heterozygous in 1 out of 4,300 individuals of European ancestry from the NHLBI Exome Variant Server and was found to be heterozygous in another 2 out of the 17 families affected by BBS), the p value calculated with RAFT for this observation is highly significant at 2.26 × 10⁻⁹. In an exome-wide screen involving several more families and many more variants, it is thus possible that a conventional test might miss the association between the MKKS c.280delT variant and BBS while the RAFT statistic flags this variant as highly significant.
To evaluate the null behavior of our test statistic, we performed RAFT analysis on simulated data by using allele frequencies from a whole-exome sequencing data set composed of 3,000 individuals of Finnish ancestry and 3,000 NFEs (ref. 14: Lim E.T. et al. PLoS Genet. 2014; 10: e1004494) and consisting of 590,003 coding region variants. For each variant, we randomly simulated 5,000 case subjects and 5,000 control subjects by using the Finnish allele frequencies and applied the RAFT analysis. We found that across all allele frequency bins (common ≥ 5%, low-frequency 1%–5%, or rare < 1%), we obtained similar numbers of observed and expected variants in the various p value bins (Table S1), suggesting that the RAFT statistic conforms to the null hypothesis when the individuals are drawn from a homogeneous population and there is no underlying signal of any type. Given that RAFT is a method designed to test for an excess of homozygous variants given the allele frequency, artifacts in the data that cause deviation from HWE could inflate the test statistics calculated by RAFT. One such factor is population stratification. However, similar to existing genome-wide association studies, this can be detected by existing methods such as genomic control to identify whether the case subjects are well-matched to the control subjects in terms of ancestry. Another more subtle and pernicious factor for RAFT is population substructure, where the case subjects and control subjects may be sampled equally from a heterogeneous population that consists of two or more distinct population ancestries. For instance, if the case subjects and control subjects are derived from a mixture of Finns and NFEs, this might result in the Wahlund effect where there is excessive homozygosity and reduced heterozygosity. Unlike direct case-control comparisons, where balanced mixtures would not inflate allele frequency comparisons, such underlying heterogeneity can inflate the RAFT statistic. 
To evaluate further the effect of population substructure on RAFT (where the case and control subjects are well matched for ancestry but contain ethnically different subpopulations in both case and control subjects), we randomly generated 5,000 case subjects and 5,000 control subjects by using equal proportions of Finns and NFEs in the cases and controls. When we ran the test statistic on the simulated data, we indeed observed inflation among the common variants (Table S2). When we performed additional simulations on just the Finns alone, we did not observe any significant amount of inflation (Table S3). Such inflation, if observed, is fully managed by the inclusion of the F term in the determination of the background probability of homozygosity as described in the Material and Methods. We next asked whether we can use RAFT to discover disease-associated variants by applying existing tests to a set of exome chip genotyping data for 1,791 T2D case subjects and 2,657 control subjects of Finnish ancestry matched by principal-component analysis (Figure S2). A series of quality control checks were performed to test whether there is any major population stratification between the case and control subjects, which will result in global inflation when running an additive test on the low-frequency and common variants (≥1% allele frequency). However, Fisher’s exact test on the allele counts (FET) did not show any evidence for inflation (genomic control λ = 1.06, Figure S3). We note that even though the current sample size is underpowered to detect the previously published T2D common variants with genome-wide significance, the top hit is an int" @default.
- W2073569341 created "2016-06-24" @default.
- W2073569341 creator A5001552586 @default.
- W2073569341 creator A5011512882 @default.
- W2073569341 creator A5014862057 @default.
- W2073569341 creator A5019646038 @default.
- W2073569341 creator A5031661454 @default.
- W2073569341 creator A5039668365 @default.
- W2073569341 creator A5061973111 @default.
- W2073569341 creator A5070413566 @default.
- W2073569341 creator A5074771810 @default.
- W2073569341 creator A5081414100 @default.
- W2073569341 creator A5081489856 @default.
- W2073569341 creator A5085118026 @default.
- W2073569341 creator A5085297112 @default.
- W2073569341 date "2014-11-01" @default.
- W2073569341 modified "2023-09-25" @default.
- W2073569341 title "A Novel Test for Recessive Contributions to Complex Diseases Implicates Bardet-Biedl Syndrome Gene BBS10 in Idiopathic Type 2 Diabetes and Obesity" @default.
- W2073569341 cites W1518382324 @default.
- W2073569341 cites W1801308772 @default.
- W2073569341 cites W1890472258 @default.
- W2073569341 cites W1973211018 @default.
- W2073569341 cites W1994975332 @default.
- W2073569341 cites W2031964961 @default.
- W2073569341 cites W2040964350 @default.
- W2073569341 cites W2050040558 @default.
- W2073569341 cites W2055285987 @default.
- W2073569341 cites W2056574347 @default.
- W2073569341 cites W2056696934 @default.
- W2073569341 cites W2059145105 @default.
- W2073569341 cites W2068115215 @default.
- W2073569341 cites W2083272036 @default.
- W2073569341 cites W2095920230 @default.
- W2073569341 cites W2098001104 @default.
- W2073569341 cites W2098482403 @default.
- W2073569341 cites W2098510019 @default.
- W2073569341 cites W2105026715 @default.
- W2073569341 cites W2109616976 @default.
- W2073569341 cites W2116868464 @default.
- W2073569341 cites W2117782292 @default.
- W2073569341 cites W2124341714 @default.
- W2073569341 cites W2124490243 @default.
- W2073569341 cites W2127003470 @default.
- W2073569341 cites W2128613449 @default.
- W2073569341 cites W2130292538 @default.
- W2073569341 cites W2135786049 @default.
- W2073569341 cites W2136280523 @default.
- W2073569341 cites W2137467646 @default.
- W2073569341 cites W2157752701 @default.
- W2073569341 cites W2158501174 @default.
- W2073569341 cites W2160438116 @default.
- W2073569341 cites W2161633633 @default.
- W2073569341 cites W2161978970 @default.
- W2073569341 cites W2164004777 @default.
- W2073569341 cites W2316411244 @default.
- W2073569341 cites W2887960255 @default.
- W2073569341 doi "https://doi.org/10.1016/j.ajhg.2014.09.015" @default.
- W2073569341 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/4225638" @default.
- W2073569341 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/25439097" @default.
- W2073569341 hasPublicationYear "2014" @default.
- W2073569341 type Work @default.
- W2073569341 sameAs 2073569341 @default.
- W2073569341 citedByCount "29" @default.
- W2073569341 countsByYear W20735693412014 @default.
- W2073569341 countsByYear W20735693412015 @default.
- W2073569341 countsByYear W20735693412016 @default.
- W2073569341 countsByYear W20735693412017 @default.
- W2073569341 countsByYear W20735693412018 @default.
- W2073569341 countsByYear W20735693412019 @default.
- W2073569341 countsByYear W20735693412020 @default.
- W2073569341 countsByYear W20735693412021 @default.
- W2073569341 countsByYear W20735693412022 @default.
- W2073569341 crossrefType "journal-article" @default.
- W2073569341 hasAuthorship W2073569341A5001552586 @default.
- W2073569341 hasAuthorship W2073569341A5011512882 @default.
- W2073569341 hasAuthorship W2073569341A5014862057 @default.
- W2073569341 hasAuthorship W2073569341A5019646038 @default.
- W2073569341 hasAuthorship W2073569341A5031661454 @default.
- W2073569341 hasAuthorship W2073569341A5039668365 @default.
- W2073569341 hasAuthorship W2073569341A5061973111 @default.
- W2073569341 hasAuthorship W2073569341A5070413566 @default.
- W2073569341 hasAuthorship W2073569341A5074771810 @default.
- W2073569341 hasAuthorship W2073569341A5081414100 @default.
- W2073569341 hasAuthorship W2073569341A5081489856 @default.
- W2073569341 hasAuthorship W2073569341A5085118026 @default.
- W2073569341 hasAuthorship W2073569341A5085297112 @default.
- W2073569341 hasBestOaLocation W20735693411 @default.
- W2073569341 hasConcept C104317684 @default.
- W2073569341 hasConcept C126322002 @default.
- W2073569341 hasConcept C127716648 @default.
- W2073569341 hasConcept C134018914 @default.
- W2073569341 hasConcept C2777180221 @default.
- W2073569341 hasConcept C2781232497 @default.
- W2073569341 hasConcept C511355011 @default.
- W2073569341 hasConcept C54355233 @default.
- W2073569341 hasConcept C555293320 @default.
- W2073569341 hasConcept C60644358 @default.
- W2073569341 hasConcept C71924100 @default.