Matches in SemOpenAlex for { <https://semopenalex.org/work/W2623843598> ?p ?o ?g. }
- W2623843598 endingPage "446" @default.
- W2623843598 startingPage "427" @default.
- W2623843598 abstract "Genetic variation modulating risk of sporadic Parkinson disease (PD) has been primarily explored through genome-wide association studies (GWASs). However, like many other common genetic diseases, the impacted genes remain largely unknown. Here, we used single-cell RNA-seq to characterize dopaminergic (DA) neuron populations in the mouse brain at embryonic and early postnatal time points. These data facilitated unbiased identification of DA neuron subpopulations through their unique transcriptional profiles, including a postnatal neuroblast population and substantia nigra (SN) DA neurons. We use these population-specific data to develop a scoring system to prioritize candidate genes in all 49 GWAS intervals implicated in PD risk, including genes with known PD associations and many with extensive supporting literature. As proof of principle, we confirm that the nigrostriatal pathway is compromised in Cplx1-null mice. Ultimately, this systematic approach establishes biologically pertinent candidates and testable hypotheses for sporadic PD, informing a new era of PD genetic research. 
The most commonly used genetic tool today for studying complex disease is the genome-wide association study (GWAS). As a strategy, GWASs were initially hailed for the insight they might provide into the genetic architecture of common human disease risk. Indeed, the collective data from GWASs since 2005 have revealed a trove of variants and genomic intervals associated with an array of phenotypes (ref. 1: Visscher P.M., Brown M.A., McCarthy M.I., Yang J. Five years of GWAS discovery. Am. J. Hum. Genet. 2012; 90: 7-24). The majority of variants identified in GWASs are located in non-coding DNA (ref. 2: Maurano M.T., Humbert R., Rynes E., Thurman R.E., Haugen E., Wang H., Reynolds A.P., Sandstrom R., Qu H., Brody J., et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012; 337: 1190-1195) and are enriched for characteristics denoting regulatory DNA (ref. 2; ref. 3: Farh K.K., Marson A., Zhu J., Kleinewietfeld M., Housley W.J., Beik S., Shoresh N., Whitton H., Ryan R.J., Shishkin A.A., et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature. 2015; 518: 337-343). This regulatory variation is expected to impact expression of a nearby gene, leading to disease susceptibility. 
Traditionally, the gene closest to the lead SNP has been prioritized as the gene most likely to be affected by the disease variation. However, recent studies show that disease-associated variants can act on more distally located genes, invalidating genes that were previously extensively studied (ref. 4: Smemo S., Tena J.J., Kim K.H., Gamazon E.R., Sakabe N.J., Gómez-Marín C., Aneas I., Credidio F.L., Sobreira D.R., Wasserman N.F., et al. Obesity-associated variants within FTO form long-range functional connections with IRX3. Nature. 2014; 507: 371-375; ref. 5: Gupta R.M., Hadaya J., Trehan A., Zekavat S.M., Roselli C., Klarin D., Emdin C.A., Hilvering C.R.E., Bianchi V., Mueller C., et al. A genetic variant associated with five vascular diseases is a distal regulator of endothelin-1 gene expression. Cell. 2017; 170: 522-533.e15). The inability to systematically connect common variation with the genes impacted limits our capacity to elucidate potential therapeutic targets and can waste valuable research efforts. Although GWASs are inherently agnostic to the context in which disease-risk variation acts, the biological impact of common functional variation has been shown to be cell context dependent (ref. 2: Maurano M.T. et al. Science. 2012; 337: 1190-1195; ref. 6: Lee D., Gorkin D.U., Baker M., Strober B.J., Asoni A.L., McCallion A.S., Beer M.A. A method to predict the impact of regulatory variants from DNA sequence. Nat. Genet. 2015; 47: 955-961). 
Extending these observations, Pritchard and colleagues recently demonstrated that although genes need only to be expressed in disease-relevant cell types to contribute to risk, those expressed preferentially or exclusively therein contribute more per SNP (ref. 7: Boyle E.A., Li Y.I., Pritchard J.K. An expanded view of complex traits: from polygenic to omnigenic. Cell. 2017; 169: 1177-1186). Thus, accounting for the cellular and gene regulatory network (GRN) contexts within which variation acts may better inform the identification of impacted genes. These principles have not yet been applied systematically to many of the traits for which GWAS data exist. We have chosen Parkinson disease (PD) as a model complex disorder for which a significant body of GWAS data remains to be explored biologically in a context-dependent manner. PD is the most common progressive neurodegenerative movement disorder. Incidence of PD increases with age, affecting an estimated 1% worldwide beyond 70 years of age (ref. 8: de Rijk M.C., Tzourio C., Breteler M.M., Dartigues J.F., Amaducci L., Lopez-Pousa S., Manubens-Bertran J.M., Alpérovitch A., Rocca W.A. Prevalence of parkinsonism and Parkinson’s disease in Europe: the EUROPARKINSON Collaborative Study. European Community Concerted Action on the Epidemiology of Parkinson’s disease. J. Neurol. Neurosurg. Psychiatry. 1997; 62: 10-15; ref. 9: Pringsheim T., Jette N., Frolkis A., Steeves T.D. The prevalence of Parkinson’s disease: a systematic review and meta-analysis. Mov. Disord. 2014; 29: 1583-1590). The genetic underpinnings of non-familial or sporadic PD have been studied through the use of GWASs, with recent meta-analyses highlighting 49 loci associated with sporadic PD susceptibility (ref. 10: Nalls M.A., Pankratz N., Lill C.M., Do C.B., Hernandez D.G., Saad M., DeStefano A.L., Kara E., Bras J., Sharma M., et al.; International Parkinson’s Disease Genomics Consortium (IPDGC); Parkinson’s Study Group (PSG) Parkinson’s Research: The Organized GENetics Initiative (PROGENI); 23andMe; GenePD; NeuroGenetics Research Consortium (NGRC); Hussman Institute of Human Genomics (HIHG); Ashkenazi Jewish Dataset Investigator; Cohorts for Health and Aging Research in Genetic Epidemiology (CHARGE); North American Brain Expression Consortium (NABEC); United Kingdom Brain Expression Consortium (UKBEC); Greek Parkinson’s Disease Consortium; Alzheimer Genetic Analysis Group. Large-scale meta-analysis of genome-wide association data identifies six new risk loci for Parkinson’s disease. Nat. Genet. 2014; 46: 989-993; ref. 11: Chang D., Nalls M.A., Hallgrímsdóttir I.B., Hunkapiller J., van der Brug M., Cai F., Kerchner G.A., Ayalon G., Bingol B., Sheng M., et al.; International Parkinson’s Disease Genomics Consortium; 23andMe Research Team. A meta-analysis of genome-wide association studies identifies 17 new Parkinson’s disease risk loci. Nat. Genet. 2017; 49: 1511-1516). 
While a small fraction of PD GWAS loci contain genes known to be mutated in familial PD (SNCA and LRRK2) (ref. 12: Puschmann A. Monogenic Parkinson’s disease and parkinsonism: clinical phenotypes and frequencies of known mutations. Parkinsonism Relat. Disord. 2013; 19: 407-415; ref. 13: Klein C., Westenberger A. Genetics of Parkinson’s disease. Cold Spring Harb. Perspect. Med. 2012; 2: a008888), most indicted intervals do not contain a known mutated gene or genes. Although PD ultimately affects multiple neuronal centers, preferential degeneration of DA neurons in the SN leads to functional collapse of the nigrostriatal pathway and loss of fine motor control. 
The preferential degeneration of SN DA neurons in relation to other mesencephalic DA neurons has driven research interest in the genetic basis of selective SN vulnerability in PD. Consequently, one can reasonably assert that a significant fraction of PD-associated variation likely mediates its influence specifically within the SN. In an effort to illuminate a biological context in which PD GWAS results could be better interpreted, we undertook single-cell RNA-seq (scRNA-seq) analyses of multiple DA neuronal populations in the brain, including ventral midbrain DA neurons. This analysis defined the heterogeneity of DA populations over developmental time in the brain, revealing gene expression profiles specific to discrete DA neuron subtypes. These data further facilitated the definition of GRNs active in DA neuron populations including the SN. With these data, we establish a framework to systematically prioritize candidate genes in all 49 PD GWAS loci and begin exploring their pathological significance. The Th:EGFP BAC transgenic mice (Tg(Th-EGFP)DJ76Gsat/Mmnc) used in this study were generated by the GENSAT Project and were purchased through the Mutant Mouse Resource & Research Centers (MMRRC) Repository. Mice were maintained on a Swiss Webster (SW) background with female SW mice obtained from Charles River Laboratories. The Tg(Th-EGFP)DJ76Gsat/Mmnc line was primarily maintained through matings between Th:EGFP-positive, hemizygous male mice and wild-type SW females (dams). Timed matings for cell isolation were similarly established between hemizygous male mice and wild-type SW females. The observation of a vaginal plug was defined as embryonic day 0.5 (E0.5). All work involving mice (husbandry, colony maintenance, and euthanasia) was reviewed and pre-approved by the institutional care and use committee. 
Cplx1 knockout mice and wild-type littermates used for immunocytochemistry were taken from a colony established in Cambridge using founders from mutant mouse lines that were obtained from the Max-Planck-Institute for Experimental Medicine (Göttingen, Germany). Cplx1 mice in this colony have been backcrossed onto a C57BL/6J inbred background for at least ten generations. All experimental procedures were licensed and undertaken in accordance with the regulations of the UK Animals (Scientific Procedures) Act 1986. Housing, rearing, and genotyping of mice have been described in detail previously (ref. 14: Glynn D., Drew C.J., Reim K., Brose N., Morton A.J. Profound ataxia in complexin I knockout mice masks a complex phenotype that includes exploratory and habituation deficits. Hum. Mol. Genet. 2005; 14: 2369-2385; ref. 15: Glynn D., Sizemore R.J., Morton A.J. Early motor development is abnormal in complexin 1 knockout mice. Neurobiol. Dis. 2007; 25: 483-495). Mice were housed in hard-bottomed polypropylene experimental cages in groups of 5–10 mice in a housing facility maintained at 21°C–23°C with relative humidity of 55% ± 10%. Mice had ad libitum access to water and standard dry chow. Because homozygous knockout Cplx1 mice have ataxia, they have difficulty in reaching the hard pellets in the food hopper and drinking from the water bottles. Lowered waterspouts were provided and access to normal laboratory chow was improved by providing mash (made by soaking 100 g of chow pellets in 230 mL water for 60 min until the pellets were soft and fully expanded) on the floor of the cage twice daily. Cplx1 genotyping to identify mice with a homozygous (Cplx1−/−) or heterozygous (Cplx1+/−) deletion of Cplx1 was conducted as previously described (ref. 14) using DNA prepared from tail biopsies. 
At 15.5 days after the timed mating, pregnant dams were euthanized and the entire litter of E15.5 embryos was dissected out of the mother and immediately placed in chilled Eagle’s Minimum Essential Media (EMEM). Individual embryos were then decapitated and heads were placed in fresh EMEM on ice. Embryonic brains were removed and placed in Hank’s Balanced Salt Solution (HBSS) without Mg2+ and Ca2+ and manipulated while on ice. The brains were immediately observed under a fluorescent stereomicroscope and EGFP+ brains were selected. EGFP+ regions of interest in the forebrain (hypothalamus) and the midbrain were then dissected and placed in HBSS on ice. This process was repeated for each EGFP+ brain. Brain regions from four EGFP+ mouse pups were pooled together for dissociation. After timed matings, pregnant females were sorted into their own cages and checked daily for newly born pups. The morning the pups were born was considered postnatal day 0 (P0). Once the mice were aged to P7, all the mice from the litter were euthanized and the brains were then quickly dissected and placed in HBSS without Mg2+ and Ca2+ on ice. As before, the brains were observed under a fluorescent microscope, EGFP+ status for P7 mice was determined, and EGFP+ brains were retained. For each EGFP+ brain, the entire olfactory bulb was first resected and placed in HBSS on ice. Immediately thereafter, the EGFP+ forebrain and midbrain regions for each brain were resected and also placed in distinct containers of HBSS on ice. Brain regions from five EGFP+ P7 mice were pooled together for dissociation. Resected brain tissues were dissociated using papain (Papain Dissociation System, Worthington Biochemical Corporation; Cat#: LK003150) following the trehalose-enhanced protocol reported by Saxena et al. (ref. 16: Saxena A., Wagatsuma A., Noro Y., Kuji T., Asaka-Oba A., Watahiki A., Gurnot C., Fagiolini M., Hensch T.K., Carninci P. Trehalose-enhanced isolation of neuronal sub-types from adult mouse brain. Biotechniques. 2012; 52: 381-385) with the following modifications. 
The dissociation was carried out at 37°C in a sterile tissue culture cabinet and RNase inhibitor was added to all solutions. During dissociation, all tissues at all time points were triturated every 10 min using a sterile Pasteur pipette. For E15.5 tissues, this was continued for no more than 40 min. For P7, this was continued for up to 1.5 hr or until the tissue appeared to be completely dissociated. Additionally, for P7 tissues, after dissociation but before cell sorting, the cell pellets were passed through a discontinuous density gradient in order to remove cell debris that could impede cell sorting. This gradient was adapted from the Worthington Papain Dissociation System kit. Briefly, after completion of dissociation according to the Saxena protocol (ref. 16), the final cell pellet was resuspended in DNase dilute albumin-inhibitor solution, layered on top of 5 mL of albumin-inhibitor solution, and centrifuged at 70 × g for 6 min. The supernatant was then removed. For each time point-region condition, pellets were resuspended in 200 μL of serum-free medium composed of DMEM/F12 without phenol red, 5% trehalose (w/v), 25 μM AP-V, 100 μM kynurenic acid, and 10 μL of 40 U/μL RNase inhibitor (RNasin Plus RNase Inhibitor, Promega) at room temperature. The resuspended cells were then passed through a 40 μm filter and introduced into a FACS machine (Beckman Coulter MoFlo Cell Sorter or Becton Dickinson FACSJazz). 
Viable cells were identified via propidium iodide staining, and individual neurons were sorted based on their fluorescence directly into lysis buffer in individual wells of 96-well plates for single-cell sequencing (2 μL Smart-Seq2 lysis buffer + RNase inhibitor, 1 μL oligo-dT primer, and 1 μL dNTPs) according to Picelli et al. (ref. 17: Picelli S., Faridani O.R., Björklund Å.K., Winberg G., Sagasser S., Sandberg R. Full-length RNA-seq from single cells using Smart-seq2. Nat. Protoc. 2014; 9: 171-181). Blank wells were used as negative controls for each plate collected. Upon completion of a sort, the plates were briefly spun in a tabletop microcentrifuge and snap-frozen on dry ice. Single-cell lysates were subsequently kept at −80°C until cDNA conversion. Library preparation and amplification of single-cell samples were performed using a modified version of the Smart-Seq2 protocol (ref. 17). Briefly, 96-well plates of single cell lysates were thawed to 4°C, heated to 72°C for 3 min, then immediately placed on ice. Template switching first-strand cDNA synthesis was performed as described above using a 5′-biotinylated TSO oligo. cDNAs were amplified using 20 cycles of KAPA HiFi PCR and 5′-biotinylated ISPCR primer. Amplified cDNA was cleaned with a 1:1 ratio of Ampure XP beads and approximately 200 pg was used for a one-quarter standard sized Nextera XT tagmentation reaction. Tagmented fragments were amplified for 14 cycles and dual indexes were added to each well to uniquely label each library. Concentrations were assessed with Quant-iT PicoGreen dsDNA Reagent (Invitrogen) and samples were diluted to ∼2 nM and pooled. 
Pooled libraries were sequenced on the Illumina HiSeq 2500 platform to a target mean depth of ∼8.0 × 10^5 50-bp paired-end fragments per cell at the Hopkins Genetics Research Core Facility. For all libraries, paired-end reads were aligned to the mouse reference genome (mm10) supplemented with the Th-EGFP+ transgene contig, using HISAT2 (ref. 18: Kim D., Langmead B., Salzberg S.L. HISAT: a fast spliced aligner with low memory requirements. Nat. Methods. 2015; 12: 357-360) with default parameters except: -p 8. Aligned reads from individual samples were quantified against a reference transcriptome (GENCODE vM8) (ref. 19: Mudge J.M., Harrow J. Creating reference gene annotation for the mouse C57BL6/J genome assembly. Mamm. Genome. 2015; 26: 366-378) supplemented with the addition of the EGFP transcript. Quantification was performed using cuffquant (ref. 20: Trapnell C., Roberts A., Goff L., Pertea G., Kim D., Kelley D.R., Pimentel H., Salzberg S.L., Rinn J.L., Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 2012; 7: 562-578) with default parameters and the following additional arguments: --no-update-check -p 8. Normalized expression estimates across all samples were obtained using cuffnorm (ref. 20) with default parameters. 
Gene-level and isoform-level FPKM (fragments per kilobase of transcript per million) values produced by cuffquant (ref. 20) and the normalized FPKM matrix from cuffnorm were used as input for the Monocle 2 single-cell RNA-seq framework (ref. 21: Trapnell C., Cacchiarelli D., Grimsby J., Pokharel P., Li S., Morse M., Lennon N.J., Livak K.J., Mikkelsen T.S., Rinn J.L. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 2014; 32: 381-386) in R/Bioconductor (ref. 22: Huber W., Carey V.J., Gentleman R., Anders S., Carlson M., Carvalho B.S., Bravo H.C., Davis S., Gatto L., Girke T., et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods. 2015; 12: 115-121). Genes were annotated using the Gencode vM8 release (ref. 19). A CellDataSet (cds) was then created using Monocle 2 (v2.2.0) (ref. 21) containing the gene FPKM table, gene annotations, and all available metadata for the sorted cells. All cells labeled as negative controls and empty wells were removed from the data. Relative FPKM values for each cell were converted to estimates of absolute mRNA counts per cell (RPC) using the Monocle 2 Census algorithm (ref. 23: Qiu X., Hill A., Packer J., Lin D., Ma Y.A., Trapnell C. Single-cell mRNA quantification and differential analysis with Census. Nat. Methods. 2017; 14: 309-315) via the Monocle function “relative2abs().” 
After RPCs were inferred, a new cds was created using the estimated RNA copy numbers with the expressionFamily set to “negbinomial.size()” and a lower detection limit of 0.1 RPC. After expression estimates were inferred, the cds containing a total of 473 cells was run through Monocle 2's “detectGenes()” function with the minimum expression level set at 0.1 transcripts. The following filtering criteria were then imposed on the entire dataset: (1) Number of expressed genes: The number of expressed genes detected in each cell in the dataset was plotted and the high and low expressed gene thresholds were set based on observations of each distribution. Only those cells that expressed between 2,000 and 10,000 genes were retained. (2) Cell mass: Cells were then filtered based on the total mass of RNA in the cells calculated by Monocle 2. Again, the total mass of the cell was plotted and mass thresholds were set based on observations from each distribution. Only those cells with a total cell mass between 100,000 and 1,300,000 fragments mapped were retained. (3) Total RNA copies per cell: Cells were then filtered based on the total number of RNA transcripts estimated for each cell. Again, the total RNA copies per cell was plotted and RNA transcript thresholds were set based on observations from each distribution. Only those cells with a total mRNA count between 1,000 and 40,000 RPCs were retained. A total of 410 individual cells passed these initial filters. Outliers found in subsequent, reiterative analyses described below were analyzed and removed, resulting in a final cell number of 396. Analysis using Monocle 2 relies on the assumption that the expression data being analyzed follows a log-normal distribution. Comparison to this distribution was performed after initial filtering prior to continuing with analysis and was observed to be well fit. 
After initial filtering described above, the entire cds as well as subsets of the cds based on “age” and “region” of cells were created for recursive analysis. Regardless of how the data were subdivided, all data followed a similar downstream analysis workflow. The genes to be analyzed for each iteration were filtered based on the number of cells that expressed each gene. Genes were retained if they were expressed in >5% of the cells in the dataset being analyzed. These were designated “expressed_genes.” For example, when analyzing all cells collected together (n = 410), a gene had to be expressed in more than 20.5 cells (410 × 0.05 = 20.5) to be included in the analysis. In contrast, when analyzing P7 MB cells (n = 80), a gene had to be expressed in more than just four cells (80 × 0.05 = 4). This was done to include genes that may define rare populations of cells that could be present in any given population. The data were prepared for Monocle analysis by retaining only the expressed genes that passed the filtering described above. Size factors were estimated using the Monocle 2 “estimateSizeFactors()” function. Dispersions were estimated using the “estimateDispersions()” function. Genes that have a high biological coefficient of variation (BCV) were identified by first calculating the BCV by dividing the standard deviation of expression for each expressed gene by the mean expression of each expressed gene. A dispersion table was then extracted using the “dispersionTable()” function from Monocle 2. Genes with a mean expression > 0.5 transcripts and a “dispersion_empirical” ≥ 1.5 × dispersion_fit or 2.0 × dispersion_fit were identified as “high variance genes.” PCA was run using the R “prcomp()” function on the centered and scaled log2 expression values of the “high variance genes.” PC1 and PC2 were visualized to scan the data for outliers as well as bias in the PCs for age, region, or plates on which the cells were sequenced. 
If any visual outliers in the data were observed, those cells were removed from the original subsetted cds and all filtering steps above were repeated. Once there were no visual outliers in PC1 or PC2, a screeplot was used to determine the number of PCs that contributed most significantly to the variation in the data. This was manually determined by inspecting the screeplot and including only those PCs that occur before the leveling-off of the plot. Once the number of significant PCs was determined, t-SNE (ref. 24: Van Der Maaten L., Hinton G. Visualizing data using t-SNE. J. Mach. Learn. Res. 2008; 9: 2579-2605) was used to embed chosen PC dimensions in a 2D space for visualization. This was done using the “tsne()” function available through the tsne package (v.0.1-3) in R with “whiten = FALSE.” The parameters “perplexity” and “max_iter” were tested with various values and set according to what was deemed to give the cleanest clustering of the data. After dimensionality reduction via t-SNE, the number of clusters was determined in an unbiased manner by fitting multiple Gaussian distributions over the 2D t-SNE projection coordinates using the R package ADPclust (ref. 25: Wang X.-F., Xu Y. Fast clustering using adaptive density peak detection. Stat. Methods Med. Res. 2017; 26: 2800-2811). t-SNE plots were visualized using a custom R script. The number of genes expressed and the total mRNAs for each cluster were then compared. In order to find differentially expressed genes between brain DA populations at each age, the E15.5 and P7 datasets were annotated with regional cluster identity (“subset cluster”). Differential expression analysis was performed using the “differentialGeneTest()” function from Monocle 2 that uses a likelihood ratio test to compare a vector generalized additive model (VGAM) using a negative binomial family function to a reduced model in which one parameter of interest has been removed. 
In practice, the following model was fit: “∼subset.cluster” for the E15.5 or P7 dataset. Genes were called as significantly differentially expressed if they had a q value (Benjamini-Hochberg corrected p value) < 0.05. In order to identify differentially expressed genes that were “specifically” expressed in a particular subset cluster, R code calculating the Jensen-Shannon-based specificity score from the R package cummeRbund (ref. 26: Trapnell C., Hendrickson D.G., Sauvageau M., Goff L., Rinn J.L., Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat. Biotechnol. 2013; 31: 46-53) was used similarly to what was described in Burns et al. (ref. 27: Burns J.C., Kelly M.C., Hoa M., Morell R.J., Kelley M.W. Single-cell RNA-Seq resolves cellular complexity in sensory organs from the neonatal inner ear. Nat. Commun. 2015; 6: 8557). Briefly, the mean RPC within each cluster for each expressed gene as well as the percentage of cells within each cluster that express each gene at a level >1 transcript were calculated. The “.specificity()” function from the cummeRbund package was then used to calculate and identify the cluster with maximum specificity of each gene’s expression. Details of this specificity metric" @default.
- W2623843598 created "2017-06-15" @default.
- W2623843598 creator A5021277023 @default.
- W2623843598 creator A5021610960 @default.
- W2623843598 creator A5048435659 @default.
- W2623843598 creator A5062557198 @default.
- W2623843598 creator A5070083131 @default.
- W2623843598 creator A5071711769 @default.
- W2623843598 creator A5080770975 @default.
- W2623843598 date "2018-03-01" @default.
- W2623843598 modified "2023-10-15" @default.
- W2623843598 title "Single-Cell RNA-Seq of Mouse Dopaminergic Neurons Informs Candidate Gene Selection for Sporadic Parkinson Disease" @default.
- W2623843598 cites W1217705870 @default.
- W2623843598 cites W1479935944 @default.
- W2623843598 cites W156202311 @default.
- W2623843598 cites W1757892199 @default.
- W2623843598 cites W1966327575 @default.
- W2623843598 cites W1969276438 @default.
- W2623843598 cites W1971104810 @default.
- W2623843598 cites W1973085272 @default.
- W2623843598 cites W1973094248 @default.
- W2623843598 cites W1980786543 @default.
- W2623843598 cites W1982825408 @default.
- W2623843598 cites W1984883254 @default.
- W2623843598 cites W1993219522 @default.
- W2623843598 cites W1998416626 @default.
- W2623843598 cites W1999923500 @default.
- W2623843598 cites W2000156275 @default.
- W2623843598 cites W2000289729 @default.
- W2623843598 cites W2002475003 @default.
- W2623843598 cites W2007091385 @default.
- W2623843598 cites W2010334242 @default.
- W2623843598 cites W2023265636 @default.
- W2623843598 cites W2028796406 @default.
- W2623843598 cites W2030110789 @default.
- W2623843598 cites W2035618305 @default.
- W2623843598 cites W2038834667 @default.
- W2623843598 cites W2045949302 @default.
- W2623843598 cites W2046329404 @default.
- W2623843598 cites W2046623850 @default.
- W2623843598 cites W2049431608 @default.
- W2623843598 cites W2053617989 @default.
- W2623843598 cites W2054677853 @default.
- W2623843598 cites W2056148512 @default.
- W2623843598 cites W2056198580 @default.
- W2623843598 cites W2056413798 @default.
- W2623843598 cites W2057850260 @default.
- W2623843598 cites W2067147530 @default.
- W2623843598 cites W2068767807 @default.
- W2623843598 cites W2069382901 @default.
- W2623843598 cites W2070021921 @default.
- W2623843598 cites W2071553034 @default.
- W2623843598 cites W2077900721 @default.
- W2623843598 cites W2078059415 @default.
- W2623843598 cites W2082171231 @default.
- W2623843598 cites W2084772990 @default.
- W2623843598 cites W2085302808 @default.
- W2623843598 cites W2085930600 @default.
- W2623843598 cites W2088154433 @default.
- W2623843598 cites W2088745158 @default.
- W2623843598 cites W2092293921 @default.
- W2623843598 cites W2094507987 @default.
- W2623843598 cites W2094757997 @default.
- W2623843598 cites W2096663185 @default.
- W2623843598 cites W2098416374 @default.
- W2623843598 cites W2100305481 @default.
- W2623843598 cites W2102278945 @default.
- W2623843598 cites W2122477201 @default.
- W2623843598 cites W2123106337 @default.
- W2623843598 cites W2124906886 @default.
- W2623843598 cites W2126145188 @default.
- W2623843598 cites W2127508962 @default.
- W2623843598 cites W2130410032 @default.
- W2623843598 cites W2130497877 @default.
- W2623843598 cites W2137849205 @default.
- W2623843598 cites W2141459724 @default.
- W2623843598 cites W2143906078 @default.
- W2623843598 cites W2144618115 @default.
- W2623843598 cites W2145144631 @default.
- W2623843598 cites W2145825942 @default.
- W2623843598 cites W2150098952 @default.
- W2623843598 cites W2151557671 @default.
- W2623843598 cites W2151704931 @default.
- W2623843598 cites W2155213268 @default.
- W2623843598 cites W2156247618 @default.
- W2623843598 cites W2156714600 @default.
- W2623843598 cites W2157241157 @default.
- W2623843598 cites W2163924952 @default.
- W2623843598 cites W2166820820 @default.
- W2623843598 cites W2169389191 @default.
- W2623843598 cites W2175664647 @default.
- W2623843598 cites W2178312212 @default.
- W2623843598 cites W2188193460 @default.
- W2623843598 cites W2200183975 @default.
- W2623843598 cites W2212528563 @default.
- W2623843598 cites W2256016639 @default.
- W2623843598 cites W2312055187 @default.
- W2623843598 cites W2522953732 @default.