Matches in SemOpenAlex for { <https://semopenalex.org/work/W3113178668> ?p ?o ?g. }
- W3113178668 abstract "Method, 8 December 2020, Open Access, Transparent process. The GENDULF algorithm: mining transcriptomics to uncover modifier genes for monogenic diseases. Author Information: Noam Auslander1,2 (orcid.org/0000-0002-1923-8735), Daniel M Ramos3, Ivette Zelaya4, Hiren Karathia5 (orcid.org/0000-0003-3607-6552), Thomas O. Crawford6,7, Alejandro A Schäffer1 (orcid.org/0000-0002-2147-8033), Charlotte J Sumner3,7 (orcid.org/0000-0001-5088-6012) and Eytan Ruppin*,1 (orcid.org/0000-0002-7862-3940). 1Cancer Data Science Laboratory (CDSL), National Cancer Institute, National Institutes of Health, Bethesda, MD, USA 2National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA 3Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD, USA 4Interdepartmental Program in Bioinformatics, University of California Los Angeles, Los Angeles, CA, USA 5Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, MD, USA 6Department of Pediatrics, Johns Hopkins University School of Medicine, Baltimore, MD, USA 7Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA. Noam Auslander and Daniel M Ramos contributed equally to this work as first and second authors. Charlotte J Sumner and Eytan Ruppin contributed equally to this work as last authors. This article has been contributed to by US Government employees (Noam Auslander, Hiren Karathia, Alejandro A Schäffer, Eytan Ruppin) and their work is in the public domain in the USA. *Corresponding author. Tel: +1 240 858 3169; E-mail: [email protected] Molecular Systems Biology (2020) 16: e9701. https://doi.org/10.15252/msb.20209701
Abstract Modifier genes are believed to account for the clinical variability observed in many Mendelian disorders, but their identification remains challenging due to the limited availability of genomics data from large patient cohorts. Here, we present GENDULF (GENetic moDULators identiFication), one of the first methods to facilitate prediction of disease modifiers using healthy and diseased tissue gene expression data. GENDULF is designed for monogenic diseases in which the mechanism is loss of function leading to reduced expression of the mutated gene. When applied to cystic fibrosis, GENDULF successfully identifies multiple, previously established disease modifiers, including EHF, SLC6A14, and CLCA1. It is then utilized in spinal muscular atrophy (SMA) and predicts U2AF1 as a modifier whose low expression correlates with higher SMN2 pre-mRNA exon 7 retention. Indeed, knockdown of U2AF1 in SMA patient-derived cells leads to increased full-length SMN2 transcript and SMN protein expression. Taking advantage of the increasing availability of transcriptomic data, GENDULF is a novel addition to existing strategies for prediction of genetic disease modifiers, providing insights into disease pathogenesis and uncovering novel therapeutic targets. SYNOPSIS GENDULF predicts modifiers of loss-of-function monogenetic diseases using healthy and disease gene expression data. Application to cystic fibrosis (CF) and spinal muscular atrophy (SMA) identifies established CF modifiers and a new putative modifier of SMA, U2AF1. 
GENDULF is a novel algorithm that identifies genetic modifiers for monogenetic diseases from healthy and disease gene expression data, by detecting patterns of co-expression that are uniquely observed in healthy tissues. GENDULF may be used to provide a list of candidates for large-scale analysis or may be incorporated with other approaches or a knowledge-based step to yield a small list of candidates for small-scale experimental evaluation. Different applications are demonstrated for CF, where the performance is estimated against previously established modifiers, and for SMA where it is used to uncover a new modifier, U2AF1. Introduction Phenotypic heterogeneity is observed in many Mendelian diseases such that patients with the same mutation may develop a severe form of disease, a mild one, or show no symptoms at all. Among the factors that account for these differences are modifier genes (Nadeau, 2001), whose activity influences disease severity. Identifying such genes has major implications for disease prognostication and development of novel therapeutics (Antonarakis & Beckmann, 2006). However, due to the low frequency of the mutations causing most Mendelian disorders and the scarcity of large relevant patient cohorts, only a few modifier genes have been identified thus far (Génin et al, 2008; Kousi & Katsanis, 2015) leaving the mechanisms underlying clinical variability of most Mendelian disorders poorly understood. 
Existing strategies for studying the role of genetic factors in determining phenotypic presentation are often classified into three categories depending on the type of data analyzed (Génin et al, 2008; Kousi & Katsanis, 2015): (i) genome-wide association studies (GWAS), which compare the distribution of marker genotypes in patients with different disease phenotypes, (ii) genetic linkage studies, using tools such as Superlink (Fishelson & Geiger, 2002) and GENHUNTER-TWOLOCUS (Dietter et al, 2004), which assess the inheritance pattern and content of alleles shared between phenotypically concordant and discordant relatives, and (iii) systematic genome-wide exome sequencing projects, which identify individuals who are resilient to otherwise phenotype-causing mutations. A recent example of the third approach is the Resilience Project that analyzes genomes to ascertain subjects who are healthy despite harboring disease-causing mutations (Chen et al, 2016). Each of these existing strategies as well as disease-specific wet laboratory functional screens may yield a long list of candidate modifier genes, emphasizing the need for complementary methods to narrow in on a smaller set of candidate modifiers to validate. Here, we present a new approach for the genome-wide identification of genetic modifiers of monogenic disorders, termed GENetic moDULators identiFication (GENDULF). We use the term “monogenic disease” for disorders in which mutations in one gene determine who is affected with high penetrance, but variations in that gene alone may not fully explain the variable phenotypes seen in different patients. The GENDULF method is intended for monogenic diseases in which the mechanism is loss-of-function or reduced function. For many of these diseases with known gene modifiers (Gazzo et al, 2016), mutations of the gene causal of disease (herein termed GCD) often result in its reduced expression. 
We therefore reasoned that some healthy individuals who have reduced expression of a GCD are protected from disease by differential expression of other genes (the modifiers). We do not assume that every disease-causing mutation in the GCD leads to reduced gene expression, but rather that some of them do. Under this assumption, reduced expression of the GCD can be deleterious, even if it does not occur in the context of a disease-causing mutation. The interactions we seek between the GCD and modifiers are akin to some definitions of the genetic term “epistasis”, but we avoid using this term because it is sometimes associated with formal measures of overall organism fitness, which we do not compute. GENDULF operates by mining gene expression data of healthy tissues, available most prominently from the Genotype Tissue Expression project (GTEx; https://gtexportal.org), and disease-vs.-control tissues to identify expression patterns of genes that may modify disease severity. The GENDULF approach is feasible because gene expression in healthy individuals can vary significantly due to genetic and non-genetic reasons (Curtis et al, 2012). Variation in the expression of a GCD may be explained, at least in part, by regulation of other genes that compensate for low expression levels of the GCD in healthy individuals particularly in the tissues most relevant to the disease etiology. Because GTEx contains tissue-specific gene expression and for some diseases, the most affected tissues are known, GENDULF can examine gene expression in the disease-relevant tissues of healthy subjects. By identifying unaffected GTEx individuals with very low tissue-specific expression levels of the GCD, GENDULF can predict potential modifiers that may compensate for these low GCD levels. 
The expression levels of these potential modifier genes are then examined in disease-relevant tissues from affected and unaffected individuals to evaluate the association of candidate modifiers with the disease phenotype. Since we are interested in identifying ‘actionable’ modifiers, which could most readily be targeted by drugs to inhibit their activity, we focus on negative modifiers—genes whose inactivation could alleviate the disease phenotype. Nevertheless, GENDULF could be readily modified to identify targets whose increased expression may alleviate disease phenotypes. We first tested the ability of GENDULF to identify tissue-specific modifiers of cystic fibrosis (CF) because it is the most common recessive Mendelian disease and has variable severity. CF is caused by biallelic mutations of the cystic fibrosis transmembrane conductance regulator gene [CFTR (Rommens et al, 1989)], resulting in disrupted epithelial fluid transport in lungs, pancreas, colon, and other organs (Cutting, 2015). A twin study suggested that 50% of the variability in CF lung function is due to genetic factors (Collaco et al, 2010). Several CF modifiers have been previously discovered in both lung and colon tissues (Wright et al, 2011; Gallati, 2014; Corvol et al, 2015) providing an opportunity to evaluate the GENDULF approach for a disease in which there are established results. To determine whether GENDULF can be utilized in other diseases, we then applied it to spinal muscular atrophy (SMA), a neuromuscular disorder of variable severity caused by biallelic mutations of the survival motor neuron 1 gene (SMN1) (Lefebvre et al, 1995) and retention of the paralog gene SMN2. A cytosine to thymine nucleotide change in exon 7 of SMN2 leads to frequent exclusion of exon 7 during splicing of SMN2 pre-mRNAs and thus less functional SMN protein (Monani et al, 1999). 
We applied GENDULF to SMA to search for candidates that may influence SMN2 exon 7 pre-mRNA splicing, as this is an important determinant of disease severity (Prior et al, 2009; Hua et al, 2011; Wu et al, 2017). Together, our findings support the utility of GENDULF in prioritizing disease modifiers in CF and SMA. Particularly, when used in conjunction with available GWAS or relevant biological insights, and when transcriptomic data are available, GENDULF could facilitate identification of modifier genes for other loss-of-function Mendelian diseases. Results The GENDULF approach We provide an overview of GENDULF here and refer the reader to a full description in Materials and Methods below. GENDULF consists of two steps at its core (Fig 1, Materials and Methods). [Step 1] The aim of the first step is to find genes that are downregulated when the GCD is downregulated in healthy individuals and particularly in the tissues that are relevant to the disease in question. We reason that in healthy individuals with very low tissue-specific expression of the GCD, compensatory downregulation of some other genes may in part help maintain the observed unaffected phenotype. We term the candidate genes Potential Modifiers (PMs). [Step 2] The aim of the second step is to find genes that are not downregulated when the GCD is downregulated in disease-relevant tissues of individuals affected with the disease, thus testifying that their co-expression with the GCD is specific in healthy individuals and may have compensating effect. In step 2, we examine expression levels of the PMs in data sets that include both disease and control samples. We define disease-associated PMs (DPMs) as those PM genes that lose the association with the GCD (found in the tissue of unaffected individuals) in the relevant tissues of affected individuals; that is, they are not significantly downregulated in disease samples in which the GCD is, by definition, inactive. 
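The two steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation: the function names, the 10% cutoff used as a default, the overlap threshold in step 1 and the mean comparison in step 2 are all hypothetical stand-ins for the statistical tests given in Materials and Methods, and the expression values are made up.

```python
from statistics import mean

def low_set(values, frac=0.10):
    # Indices of the samples whose value falls in the lowest `frac` of the list.
    k = max(1, int(len(values) * frac))
    order = sorted(range(len(values)), key=lambda i: values[i])
    return set(order[:k])

def step1_potential_modifiers(healthy, gcd, frac=0.10, min_overlap=0.5):
    # healthy: dict mapping gene -> expression values, one per healthy sample.
    # A gene is flagged as a PM when it is also low in at least `min_overlap`
    # of the low-GCD samples (assumed stand-in for the paper's significance test).
    low_gcd = low_set(healthy[gcd], frac)
    pms = []
    for gene, values in healthy.items():
        if gene == gcd:
            continue
        shared = len(low_gcd & low_set(values, frac))
        if shared / len(low_gcd) >= min_overlap:
            pms.append(gene)
    return pms

def step2_disease_associated(pms, disease, control, tolerance=0.0):
    # Keep PMs whose mean expression in disease samples is not below the
    # control mean (assumed stand-in for the paper's permutation test).
    return [g for g in pms if mean(disease[g]) >= mean(control[g]) - tolerance]

# Toy data (made up): 20 healthy samples with GCD expression rising from 1 to 20.
healthy = {
    'GCD': list(range(1, 21)),
    'GENE_A': list(range(1, 21)),      # low exactly where the GCD is low
    'GENE_B': list(range(20, 0, -1)),  # low where the GCD is high
    'GENE_C': list(range(1, 21)),      # low where the GCD is low
}
pms = step1_potential_modifiers(healthy, 'GCD')

disease = {'GENE_A': [10, 11, 12], 'GENE_C': [2, 3, 4]}
control = {'GENE_A': [10, 10, 11], 'GENE_C': [10, 11, 12]}
dpms = step2_disease_associated(pms, disease, control)
```

In this toy case GENE_A and GENE_C pass step 1, but only GENE_A survives step 2, because GENE_C is also reduced in the disease samples and so behaves like a gene that is simply co-regulated with the GCD.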
The downregulation of a PM in the healthy controls and the absence of downregulation of the same PM in patients rules out the possibilities that either the PM is generally co-regulated with the GCD or downregulated in a signaling pathway downstream of the GCD. As demonstrated in the analysis of SMA, when the disease phenotype is influenced by expression of a known specific modifying transcript, a third step may be introduced, as described later in that case. Figure 1. An overview of the GENDULF computational approach. The two steps of GENDULF: (Step 1) Mine transcriptomics of healthy disease-relevant tissues to identify PMs, which are genes that are differentially under-expressed when the GCD is lowly expressed. GENDULF does not compute a correlation across the whole range of expression values, but specifically searches only for a significant association at the lower level range, as shown in the region boxed in the left scatter plot, where the dots are in blue and the y-axis is labeled ‘PM’ for ‘potential modifier’. The scatter plot on the upper right depicts an example of a relationship between expression of the GCD and another gene in which GENDULF is not expected to find the other gene as a modifier, and hence the y-axis is labeled non-PM expression. (Step 2) Evaluate the expression of PMs identified in step 1 in transcriptomic data sets containing both diseased and control samples to find a subset of PMs that we label as DPMs (left graphs): PM genes that are not downregulated in the disease tissues. Applying GENDULF to identify gene modifiers of CF. To evaluate the performance of GENDULF, we first applied it to a monogenic disorder in which several genetic modifiers have been previously identified (Drumm et al, 2005; Cutting, 2010; Wright et al, 2011). CF is caused by biallelic, loss-of-function mutations of CFTR and affects 60,000 individuals worldwide (Kerem et al, 1990). 
It is clinically characterized by mucus retention in the lungs, pancreas, colon, and other organs, and repeated lung infections result in significant morbidity and mortality (O’Sullivan & Freedman, 2009). It is a good model for the identification of disease modifiers because it has a relatively high prevalence leading to availability of data on many accessible patients for performing detailed phenotypic analyses. Furthermore, while a large fraction of patients are heterozygous or homozygous for the ΔF508 mutation, these patients can exhibit quite divergent phenotypes (Drumm et al, 2005). Several other mutations cause disease by lowering CFTR expression, consistent with the premise of GENDULF (Kerem et al, 1997; Ramalho et al, 2002). In our first test of GENDULF, we focused on CF patients with either lung or intestinal disease, to mitigate the effects of allelic heterogeneity, as was done in other studies seeking CF modifiers (Drumm et al, 2005; Stanke et al, 2010). To identify modifiers of lung disease in CF, we applied Step 1 of GENDULF to RNA sequencing data derived from 320 healthy lung samples deposited in the GTEx database, which includes transcriptomics data from 53 tissues and 544 human subjects (The GTEx Consortium, 2013). This step pointed to 366 PM genes that were found to be significantly downregulated when CFTR was downregulated in healthy lung tissues (Dataset EV1). Applying Step 2 of GENDULF, we examined the expression of each of these PM genes in nasal brushings of the inferior turbinates of mild and severe CF patients (identical homozygote ΔF508) and healthy controls (Wright et al, 2006). The expression of most PMs was decreased in CF patient tissue compared with healthy tissue in a similar pattern to CFTR, indicating that their expression may simply be correlated with CFTR activity. 
However, 131 (36%) passed GENDULF step 2 and were found not to be similarly downregulated when CFTR expression was impaired in CF tissues (Materials and Methods, Dataset EV1) testifying to their potential compensatory role as DPMs. Examining all reported CF genetic modifiers of the lung phenotype we collected from the literature (Dataset EV2), we find that the GENDULF-predicted DPMs are highly enriched with previously verified modifiers of CF manifestations in the lung (with eight overlapping genes, P-value = 5.4109e-15 from a hypergeometric test (Johnson & Kotz, 1977), Dataset EV2). We found that 8 of the 10 previously published CF lung modifiers that passed GENDULF step 1 also passed GENDULF step 2; the two exceptions (SFTPA1, SLC26A9) had no expression measurements in the case–control data and hence could not possibly be detected at step 2. All four previously published colon CF modifiers that passed GENDULF step 1 were eligible at step 2 and also passed step 2. We conclude that GENDULF is effective at finding those known PMs that fit the expected pattern of expression in GTEx (Fig 1, upper) and are measured in the case–control gene expression data. A recent review of CF lung phenotype modifiers found evidence in favor of two additional predicted DPM genes (Dataset EV2), namely KRT8 and MUC1 (Shanthikumar et al, 2019). Some of these known modifiers are ranked within the top DPMs, including EHF, SFTPA2, and SLC6A14, all previously identified modifiers of lung disease severity in CF (Choi et al, 2006; Tagaram et al, 2007; Wright et al, 2011; Li et al, 2014) (Fig 2A–C). The number of samples in this dataset with mild lung manifestations is too small to evaluate whether modifiers are differentially expressed between CF patients with mild lung disease and severe lung disease. 
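The enrichment claim above rests on a hypergeometric tail probability, which can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and argument names are hypothetical, and in the example call only the 131 predicted DPMs and the eight overlapping genes come from the text, while the background size of 20,000 genes and the count of 20 known modifiers are made-up assumptions, so the result is not expected to reproduce the reported P-value.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    # P(X >= k) when n genes are drawn without replacement from a background
    # of N genes, K of which are known modifiers.
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total

# Hypothetical call: N and K are assumptions, n and k are taken from the text.
p_overlap = hypergeom_sf(k=8, N=20000, K=20, n=131)
```

Exact integer arithmetic via math.comb avoids the rounding problems that appear when very small tail probabilities are computed from floating-point binomial coefficients.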
Importantly, we also find that these four CF predicted modifiers (CLCA1, FABP1, MUC2, SLC4A4) tend to be co-downregulated in those GTEx lung samples in which CFTR expression is low (all pairwise hypergeometric P-values evaluating the overlap between pairs of these modifiers are < 0.05, Fig 2D). Figure 2. CF modifiers identified by GENDULF A–C. Upper panels: Scatter plots associating the expression of the GCD (CFTR) vs. identified PMs (A) EHF, (B) SLC6A14, and (C) SFTPA2 in healthy lung tissues from GTEx. Bottom panels: Boxplots associating the expression of the identified DPMs in case–control studies; the expression of (A) EHF, (B) SLC6A14, and (C) SFTPA2 in severe and mild CF and in healthy controls. For all boxplots, center lines indicate medians, box edges represent the interquartile range, whiskers extend to the most extreme data points not considered outliers, and the outliers are plotted individually. Points are defined as outliers if they are greater than q3 + w × (q3 − q1) or < q1 − w × (q3 − q1), where w is the maximum whisker length, and q1 and q3 are the 25th and 75th percentiles of the sample data, respectively. There are five severe (red), four mild (orange), and eleven control (blue) biological replicates. D. Map of coincidence of the low expression (lowest 10% of expression levels) of the 7 top DPMs with the low expression of CFTR in GTEx lung samples (lowest 10% of expression levels, when symptoms would be present in CF patients Ramalho et al, 2002; Kerem et al, 1997); each small rectangle inside the big rectangle represents one individual; all presented samples are those with low CFTR expression. Dark blue rectangles indicate samples with low expression of the listed DPM). E, F. Upper panels: scatter plot associating the expression of the GCD (CFTR) vs. identified PMs in healthy colon GTEx tissue; the expression of CFTR (x-axis) vs. that of (E) CLCA1 and (F) SLC4A4, respectively, in healthy colon tissues. 
Bottom panels: Boxplots associating the expression of the identified DPMs in case–control studies; the expression of (E) CLCA1 and (F) SLC4A4 in colon tissues from CF and healthy controls. Empirical P-value significance is indicated qualitatively for two thresholds. There are sixteen CF (red) and thirteen control (blue) biological replicates. G. The P-values assigned by GENDULF to genes within chr11p12-p13, chr6p21.3, chr3q29, and chrXq22-q23 chromosomal segments, ordered by their location. The lower dashed line represents a significance threshold corrected for the number of genes evaluated, and the upper dashed line represents a significance threshold corrected for all genes and transcripts in GTEx, with alpha = 0.05. Data information: **P-value < 0.01 and ***P-value < 0.001, using the permutation test defined in the Materials and Methods section. The P-values in panels (A, B, C, E, F, and G) are for the hypergeometric enrichment test. To estimate the robustness of these results, we apply a sensitivity, specificity, and positive predictive value (PPV) analysis of the GENDULF-predicted modifiers against these previously verified modifiers, when setting different thresholds for GENDULF step 1 (See Materials and Methods for details, Appendix Fig S1). The specificity is very high and close to the perfect 1.0, but this is partially driven by the small number of known modifiers (positives). The sensitivity is not very high, but still over two orders of magnitude higher than would be expected by chance, providing a manageable rate of modifiers that are predicted with GENDULF. The PPV is also substantially higher than would be expected by chance. To estimate the number of case–control samples that are required for GENDULF step 2 given GENDULF step 1 results, we provide a power calculation capability in the software (see Materials and Methods). 
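The sensitivity, specificity and PPV analysis mentioned above can be illustrated with a small sketch that compares a predicted gene set against a set of known modifiers. The function name, the gene labels and the set sizes are hypothetical and chosen only to show the arithmetic; they are not the values behind Appendix Fig S1.

```python
def confusion_metrics(predicted, known, universe):
    # predicted: genes called modifiers; known: previously verified modifiers;
    # universe: every gene that was evaluated.
    predicted, known, universe = set(predicted), set(known), set(universe)
    tp = len(predicted & known)             # known modifiers that were predicted
    fp = len(predicted - known)             # predictions not among known modifiers
    fn = len(known - predicted)             # known modifiers that were missed
    tn = len(universe - predicted - known)  # genes correctly left out
    return {
        'sensitivity': tp / (tp + fn),
        'specificity': tn / (tn + fp),
        'ppv': tp / (tp + fp),
    }

# Toy example (made up): 100 genes, 10 known modifiers, 10 predictions.
universe = ['g%d' % i for i in range(100)]
known = universe[:10]
predicted = universe[:4] + universe[50:56]
metrics = confusion_metrics(predicted, known, universe)
```

Because the known modifiers are few relative to the universe, the true-negative count dominates and specificity stays near 1.0 even with several false positives, which matches the caveat in the text about the small number of positives.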
We evaluated the power for the GENDULF step 1 modifiers obtained for CF lung disease and find, for example, that seven cases and seven controls are expected to be sufficient to have 80% power to detect a modifier (See Materials and Methods for details and Appendix Fig S2). To identify modifiers of intestinal disease in CF, we applied GENDULF to analyze 345 non-CF colon samples from the GTEx database and examined the expression of these genes in rectal mucosal epithelia from CF patients (bearing ΔF508 mutation) and healthy controls (Stanke et al, 2014). From 344 PM candidates identified in Step 1 in the healthy colon tissue, 123 (35%) are predicted to be DPMs in step 2, i.e., their expression is not significantly downregulated when CFTR was mutated in tissues from CF patients (Dataset EV1). Examining reported CF genetic modifiers of colon disease collected from the literature (Dataset EV2), the top GENDULF-predicted modifiers are highly enriched with those previously reported (with four overlapping genes, hypergeometric test P-value = 3.9311e-08, Dataset EV2). These include CLCA1, whose locus has been reported to modulate gastrointestinal defect in CF (Ritzka et al, 2004; Van Der Doef et al, 2010) (Fig 2E) and SLC4A4, which was found to modify the intestinal phenotype in CF (Dorfman et al, 2009) (Fig 2F). Furthermore, the genes identified by GENDULF are highly enriched with genes located on chromosome 19q13 (hypergeometric enrichment P-value = 9e-04, Dataset EV2), a locus associated with variability in the CF colon phenotype (meconium ileus) (Zielenski et al, 1999). Some studies have suggested that the linkage signal in 19q13 is best explained by the functional candidate gene TGFB1 (Bremer et al, 2008; Corvol et al, 2008), but a fine mapping study provided evidence in favor of a tight cluster of three other immune-related genes in 19q13: CEACAM5, CEACAM3, CEACAM6 (Stanke et al, 2010). 
One of the candidates found by GENDULF in this band is CEACAM5, supporting the latter hypothesis. Combining GENDULF with association or linkage studies A particular strength of GENDULF is its ability to be used in conjunction with prior genomic data to help guide discovery of modifier genes. We studied the incorporation of GENDULF with data from association studies and its application to a given list of genes in a locus. In one study, five independent genomic loci were shown to have significant associations with variation in the clinical severity of CF lung disease (Corvol et al, 2015). Yet, as such association loci typically include numerous genes, the identification of the individual genes within the loci that correlate to disease variation can be challenging (Génin et al, 2008). We applied GENDULF to evaluate each of the genes contained within these five loci. To this end, for each gene in a given locus, we evaluated the level by which its expression is significantly d" @default.
- W3113178668 created "2020-12-21" @default.
- W3113178668 creator A5000317216 @default.
- W3113178668 creator A5020874644 @default.
- W3113178668 creator A5023561602 @default.
- W3113178668 creator A5023911818 @default.
- W3113178668 creator A5063492133 @default.
- W3113178668 creator A5075104004 @default.
- W3113178668 creator A5087008106 @default.
- W3113178668 creator A5090408123 @default.
- W3113178668 date "2020-12-01" @default.
- W3113178668 modified "2023-10-14" @default.
- W3113178668 title "The GENDULF algorithm: mining transcriptomics to uncover modifier genes for monogenic diseases" @default.
- W3113178668 cites W1496719976 @default.
- W3113178668 cites W1533942137 @default.
- W3113178668 cites W1577577364 @default.
- W3113178668 cites W1874092228 @default.
- W3113178668 cites W1969697265 @default.
- W3113178668 cites W1971655161 @default.
- W3113178668 cites W1975560789 @default.
- W3113178668 cites W1979451256 @default.
- W3113178668 cites W1980477780 @default.
- W3113178668 cites W1996824414 @default.
- W3113178668 cites W1998671407 @default.
- W3113178668 cites W2000725203 @default.
- W3113178668 cites W2001314764 @default.
- W3113178668 cites W2005895632 @default.
- W3113178668 cites W2018838463 @default.
- W3113178668 cites W2019298465 @default.
- W3113178668 cites W2021793145 @default.
- W3113178668 cites W2028236275 @default.
- W3113178668 cites W2029216301 @default.
- W3113178668 cites W2029297983 @default.
- W3113178668 cites W2030674922 @default.
- W3113178668 cites W2034869993 @default.
- W3113178668 cites W2035030336 @default.
- W3113178668 cites W2036091825 @default.
- W3113178668 cites W2036275580 @default.
- W3113178668 cites W2037171191 @default.
- W3113178668 cites W2046084748 @default.
- W3113178668 cites W2049622153 @default.
- W3113178668 cites W2062552675 @default.
- W3113178668 cites W2068362128 @default.
- W3113178668 cites W2070086105 @default.
- W3113178668 cites W2072219058 @default.
- W3113178668 cites W2073836327 @default.
- W3113178668 cites W2077063914 @default.
- W3113178668 cites W2081735313 @default.
- W3113178668 cites W2082337556 @default.
- W3113178668 cites W2086293424 @default.
- W3113178668 cites W2088131585 @default.
- W3113178668 cites W2104953797 @default.
- W3113178668 cites W2108430210 @default.
- W3113178668 cites W2109814336 @default.
- W3113178668 cites W2114668572 @default.
- W3113178668 cites W2124879534 @default.
- W3113178668 cites W2125530309 @default.
- W3113178668 cites W2131328125 @default.
- W3113178668 cites W2134526812 @default.
- W3113178668 cites W2135175348 @default.
- W3113178668 cites W2146215535 @default.
- W3113178668 cites W2146754982 @default.
- W3113178668 cites W2148071380 @default.
- W3113178668 cites W2152565567 @default.
- W3113178668 cites W2153833484 @default.
- W3113178668 cites W2165235225 @default.
- W3113178668 cites W2165952534 @default.
- W3113178668 cites W2167484059 @default.
- W3113178668 cites W2167590477 @default.
- W3113178668 cites W2169456326 @default.
- W3113178668 cites W2209106767 @default.
- W3113178668 cites W2262187954 @default.
- W3113178668 cites W2264585211 @default.
- W3113178668 cites W2326403029 @default.
- W3113178668 cites W2337925307 @default.
- W3113178668 cites W2555614465 @default.
- W3113178668 cites W2560202024 @default.
- W3113178668 cites W2562013491 @default.
- W3113178668 cites W2582137641 @default.
- W3113178668 cites W2607921836 @default.
- W3113178668 cites W2610440890 @default.
- W3113178668 cites W2732446246 @default.
- W3113178668 cites W2753748225 @default.
- W3113178668 cites W2761275051 @default.
- W3113178668 cites W2782573192 @default.
- W3113178668 cites W2786344490 @default.
- W3113178668 cites W2810767781 @default.
- W3113178668 cites W2830879651 @default.
- W3113178668 cites W2916086954 @default.
- W3113178668 cites W2947836740 @default.
- W3113178668 cites W2952273935 @default.
- W3113178668 cites W2964351992 @default.
- W3113178668 cites W2978447867 @default.
- W3113178668 cites W2980062155 @default.
- W3113178668 cites W3045948537 @default.
- W3113178668 cites W4210996550 @default.
- W3113178668 doi "https://doi.org/10.15252/msb.20209701" @default.
- W3113178668 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/7754056" @default.
- W3113178668 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/33438800" @default.
- W3113178668 hasPublicationYear "2020" @default.