Matches in SemOpenAlex for { <https://semopenalex.org/work/W2147109452> ?p ?o ?g. }
- W2147109452 endingPage "482" @default.
- W2147109452 startingPage "476" @default.
- W2147109452 abstract "Gene expression analysis is a widely used and powerful method for investigating the transcriptional behavior of biological systems, for classifying cell states in disease, and for many other purposes. Recent studies indicate that common assumptions currently embedded in experimental and analytical practices can lead to misinterpretation of global gene expression data. We discuss these assumptions and describe solutions that should minimize erroneous interpretation of gene expression data from multiple analysis platforms. Global gene expression analysis provides quantitative information about the population of RNA species in cells and tissues. It is an exceptionally powerful tool of molecular biology that is used to explore basic biology, diagnose disease, facilitate drug discovery and development, tailor therapeutics to specific pathologies and generate databases with information about living processes. Consequently, expression analysis is among the most commonly used methods in modern biology; there are over 750,000 expression data sets in the National Center for Biotechnology Information Gene Expression Omnibus (GEO) public database (Edgar et al., 2002). 
Global gene expression analysis uses DNA microarrays, RNA-Seq, and other methods to measure the levels of RNA species in biological systems (Geiss et al., 2008; Heller, 2002; Lockhart and Winzeler, 2000; Ozsolak and Milos, 2011; Schena et al., 1998; Wang et al., 2009). DNA microarrays, which have been most frequently used for expression analysis, consist of millions of individual oligonucleotide probes fixed to a solid surface. The oligonucleotide probes typically have sequences representative of known RNA species and are generally used to quantitate the relative levels of RNA species that hybridize to the probes. Massively parallel sequencing technologies, developed more recently, provide a measure of the frequency of RNA species through sequencing of RNA-derived cDNA populations. 
Other approaches, such as digital molecular barcoding, represent a fusion of the hybridization and counting approaches. For instance, the nCounter digital quantification platform relies on hybridization of labeled probes to RNA molecules and single-molecule imaging to provide a measurement of the frequency of particular RNA species. Almost all global expression analysis involves isolation of RNA from two or more cellular sources, introducing similar amounts of RNA from the sources into the experimental platform and analyzing the data by using algorithms that normalize the signal from the samples (Kulkarni, 2011; Mortazavi et al., 2008; Quackenbush, 2002; Schulze and Downward, 2001). If the cellular sources produce equivalent amounts of RNA/cell, and the yields of RNA and its derivatives are equivalent throughout experimental manipulation, then normalized expression data should produce an accurate representation of the relative levels of each gene product. We recently found that cells with high levels of c-Myc can amplify their gene expression program, producing two to three times more total RNA and generating cells that are larger than their low-Myc counterparts (Lin et al., 2012; Nie et al., 2012). This discovery has led us to question the common assumption that cells produce similar levels of RNA/cell and the general practice of introducing similar amounts of total RNA into analysis platforms without including standardized controls that would reveal transcriptional amplification or repression. As described below, it is likely that this assumption and practice have led to erroneous interpretations. We describe here an experimental approach to genome-wide analysis of RNA expression that is more likely to produce accurate assessments of changes in steady-state levels of RNA. 
Consider two different models for changes in gene expression (Figure 1). In the first, RNA levels for a minority of genes are elevated, but the levels of total RNA in the two cells are similar (Figure 1A). The absolute levels of most RNA species are therefore similar in the two cells, and when the total signal for the RNA population is normalized by standard algorithms, the resulting expression data appropriately indicate an increase in the relative RNA levels for a set of genes (Figure 1B). In the second model, the two cells express a similar set of genes, but one cell produces and accumulates two to three times more RNA/gene for many of the same genes expressed in the other cell (Figure 1C), an effect that has been termed transcriptional amplification (Lin et al., 2012; Nie et al., 2012). In the conventional approach to expression analysis, similar amounts of RNA from the two cells are introduced into the assay, thus masking the fact that one of the cells has two to three times more RNA than the other (Figure 1D). This potential source of error is typically overlooked because of the commonly believed, though rarely stated, assumption that the absolute amount of total mRNA in each cell is similar across different cell types or experimental perturbations. Furthermore, the most commonly used analysis methods are primarily intended to account for technical variations in signal to noise and assume that the signals for different samples from different experiments should be scaled to have the same median or average value or that the distributions of signal intensities for each experiment within a set should all be the same (Bolstad et al., 2003; Huber et al., 2002; Irizarry et al., 2003; Kalocsai and Shams, 2001; Li and Wong, 2001; Reimers, 2010; Wu et al., 2004). Normalization of signal from cells that experience transcriptional amplification can thus have the net result of equalizing values that are actually different and producing the erroneous perception that some genes have elevated expression, whereas a similar number of genes have reduced expression. 
To produce a reliable gene expression analysis protocol that addresses this experimental and data normalization issue, we investigated the use of spiked-in standards (Benes and Muckenthaler, 2003; Hartemink et al., 2001; Hill et al., 2001; Jiang et al., 2011; Mortazavi et al., 2008). We implemented an approach that uses spiked-in RNA standards to allow normalization to cell number and permit correction for differences in yields during experimental manipulation (Figure 2A). We performed genome-wide analysis on P493-6 cells expressing low or high levels of c-Myc (Pajic et al., 2000; Schuhmacher et al., 1999) in which cells with high levels of the transcription factor were found to produce 2- to 3-fold higher levels of the same RNA species found in cells with low levels (Lin et al., 2012). Cell number was determined by counting cells with C-Chip disposable hemocytometers (Digital Bio) and equivalent numbers of high- and low-Myc cells were harvested. The DNA content of the two samples was measured and found to be equivalent. 
Following total RNA extraction, spiked-in RNA standards were added in proportion to the number of cells present in the sample. Samples were then split and prepared for microarray, RNA-Seq, and digital analysis by using NanoString. DNA microarrays were first used to compare the high-Myc versus low-Myc cell RNA populations (Figure 2B; Table S1 available online). Similar amounts of RNA from the low- and high-Myc cells were introduced into the Affymetrix DNA microarray assay following the manufacturer’s protocol, which is the most common approach used in expression analysis. The resulting data were processed by using standard normalization methods and by using the spike-in standards for normalization. The results obtained by using standard approaches can be interpreted to mean that the expression levels of some genes are unchanged, whereas others increase or decrease (Figure 2B). The interpretation is quite different when the same data are normalized by using spike-in standards that reflect cell number: 90% of the genes show increases in expression in high-Myc cells relative to low-Myc cells (Figure 2B). RNA-Seq analysis was then used to compare the high-Myc versus low-Myc cell RNA populations (Figure 2C; Table S2). Similar amounts of RNA from the low- and high-Myc cells were subjected to sequencing. The resulting data were processed by using standard normalization methods and by using the spike-in standards for normalization. Again, the results obtained by using standard approaches suggest that the expression levels of some genes are unchanged, whereas others increase or decrease (Figure 2C), yet when the same data are normalized by using spike-in standards that reflect cell number, there is an increase in transcript levels for the vast majority of genes (Figure 2C). We then used whole-sample, digital gene expression quantification (NanoString, Seattle, WA) to compare transcript levels in the high-Myc and low-Myc cells. 
In one experiment, equal amounts of RNA from the high- and low-Myc cells were compared by using this method. The results of this analysis suggest that the expression levels of some genes are unchanged, whereas others increase or decrease. In a second experiment, equal numbers of high- and low-Myc cells were used to prepare RNA, and these total RNA populations were subjected to digital gene expression quantification. Here, the data indicate there is an increase in transcript levels for the vast majority of genes in high- versus low-Myc cells (Figure 2D, Table S3). In summary, three of the major technologies typically used for global gene expression analysis (microarray, RNA-sequencing, and digital quantification) detect a widespread increase in transcripts/cell in cells that experience transcriptional amplification by c-Myc. Significantly, all three of these major technologies used for gene expression fail to detect the widespread increase of transcription when inappropriate normalization methods are used. Instead, they erroneously suggest the interpretation that a similar number of genes show increases and decreases in expression. Our results indicate that spike-in controls of the type described here are a robust, cross-platform method to allow normalization to cell number and thus enable more accurate detection of differential gene expression and changes in gene expression programs. The clear implication is that the use of spike-in controls normalized to cell number should become the default standard for all expression experiments, as opposed to their more limited use in experiments where gross changes in RNA levels are already anticipated, as exemplified by transcription shutdown experiments (Bar-Joseph et al., 2012). 
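The masking effect summarized above can be illustrated with a small toy calculation. This is only a sketch under stated assumptions, not the analysis used in the study: the five per-cell transcript counts, the amplification factors (chosen within the two- to threefold range reported), the spike-in amount per cell, and all function names are hypothetical. Loading a fixed amount of RNA is modeled by expressing every species as a fraction of the total.

```python
import math
from statistics import median


def measure(per_cell, spike_per_cell, depth=1000.0):
    '''Signal obtained when a fixed amount of RNA is loaded: every species
    is observed only as a fraction of the total (genes plus spike-in).'''
    total = sum(per_cell) + spike_per_cell
    genes = [depth * x / total for x in per_cell]
    spike = depth * spike_per_cell / total
    return genes, spike


def median_scale(signals, target=100.0):
    '''Conventional scaling: force every sample to the same median.'''
    factor = target / median(signals)
    return [s * factor for s in signals]


def spike_scale(signals, spike_signal, target=100.0):
    '''Spike-in scaling: force the per-cell standard to the same value.'''
    factor = target / spike_signal
    return [s * factor for s in signals]


def log2_fold_changes(low, high):
    return [math.log2(h / l) for l, h in zip(low, high)]


# Hypothetical transcripts per cell for five genes in the low-Myc state.
low_cell = [10.0, 20.0, 40.0, 80.0, 160.0]
# Hypothetical amplification: every gene rises two- to threefold.
amplification = [3.0, 2.0, 2.5, 3.0, 2.0]
high_cell = [x * a for x, a in zip(low_cell, amplification)]
SPIKE_PER_CELL = 5.0  # hypothetical standard, added in proportion to cell number

low_genes, low_spike = measure(low_cell, SPIKE_PER_CELL)
high_genes, high_spike = measure(high_cell, SPIKE_PER_CELL)

median_lfc = log2_fold_changes(median_scale(low_genes), median_scale(high_genes))
spike_lfc = log2_fold_changes(
    spike_scale(low_genes, low_spike), spike_scale(high_genes, high_spike)
)

print('median-scaled log2 fold changes:', [round(v, 2) for v in median_lfc])
print('spike-in log2 fold changes:     ', [round(v, 2) for v in spike_lfc])
```

In this toy setting, median scaling reports a mixture of apparent increases and decreases, whereas scaling to the per-cell standard recovers an increase for every gene.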
When cell counting may be problematic, as for expression experiments from solid tumors or tissues, DNA content may be used as a surrogate if ploidy and DNA replication profiles are also characterized to prevent the introduction of a DNA content-based artifact. The discovery of transcriptional amplification and the realization that common experimental methods may lead to erroneous interpretation of gene expression experiments have implications for much current biological research. How prevalent is misinterpretation of genome-wide expression data due to the assumption that cells produce similar levels of total RNA? The answer is likely related to the prevalence of regulatory mechanisms that globally amplify or suppress transcription. What are the implications for classifying cell states in disease? Significant effort is being devoted to expression profiling cancer cells and these studies use standard normalization methods (Alizadeh et al., 2000; Beer et al., 2002; Berger et al., 2010; Bhattacharjee et al., 2001; Bittner et al., 2000; Golub et al., 1999; Lapointe et al., 2004; Northcott et al., 2012; Ramaswamy et al., 2001; Ross et al., 2000; Schmitz et al., 2012; Shipp et al., 2002; Su et al., 2001; Cancer Genome Atlas Network, 2012; van ’t Veer et al., 2002; van de Vijver et al., 2002; Yeoh et al., 2002). Because c-Myc expression occurs at widely varying levels in various tumor cells, transcriptional amplification is likely having a profound impact on cancer cell signatures. Where expression data are being used to gain insights into cancer cell behavior and regulation, they should be interpreted with added caution. 
P493-6 cells were kindly provided by Chi Van Dang, University of Pennsylvania. Cells were propagated in RPMI-1640 supplemented with 10% fetal bovine serum and 1% GlutaMAX (Invitrogen, 35050-061). The conditional pmyc-tet construct in P493-6 cells was repressed with 0.1 μg/ml tetracycline (Sigma, T7660) for 72 hr. Cells were then washed three times with RPMI-1640 medium containing 10% tetracycline system approved FBS (Clontech, 631105) and 1% GlutaMAX and recultured in tetracycline-free culture conditions. All experiments were performed in the absence of EBNA2 activation. Cell numbers were determined by manually counting cells with C-Chip disposable hemocytometers (Digital Bio, DHC-N01) prior to lysis and RNA extraction. Ten million P493-6 cells were homogenized in 1 ml of TRIzol Reagent (Life Technologies, 15596-026), purified with the mirVANA miRNA isolation kit (Ambion, AM1560) following the manufacturer’s instructions and resuspended in 100 μl nuclease-free water (Ambion, AM9938). Total RNA was spiked-in with the External RNA Controls Consortium (ERCC) ExFold RNA spike-in controls, treated with DNA-free DNase I (Ambion, AM1906), and analyzed on Agilent 2100 Bioanalyzer for integrity. The external control spike-ins used in the microarray and RNA-Seq analysis were obtained from the ERCC ExFold RNA Spike-In kit (Ambion, 4456739). 
The ERCC RNA Spike-In Control Mixes used here comprise a set of 92 polyadenylated transcripts that mimic natural eukaryotic mRNAs. The RNAs range in size from 250–2,000 nucleotides in length and span an approximately 10^6-fold concentration range. After extracting total RNA from equal numbers of cells, a fixed amount of diluted ERCC Spike-In Mix #1 was added. The amount of spike-in added was calibrated to the RNA yield of the high-Myc cells to ensure the spike-in signal was in the appropriate dynamic range (ERCC User Guide, Table 4). For these experiments, 1 μl of a 1:10 dilution of Mix #1 was added to total RNA extracted from 1 × 10^6 cells. RNA with the RNA integrity number (RIN) above 9.8 was used for library generation for RNA-Seq or hybridized to GeneChip PrimeView Human Gene Expression Arrays (Affymetrix) by using 10 μg or 100 ng of total RNA, respectively. For these experiments, we followed the manufacturer's recommendation and added the spike-in controls to total RNA following RNA extraction. However, we have found that spike-in controls can also be added directly to the sample-Trizol homogenate prior to RNA purification if desired. For microarray analysis, 100 ng of total RNA containing ERCC ExFold Mix #1 RNA spike-in controls (see above) was used to prepare biotinylated aRNA (cRNA) according to the manufacturer’s protocol (3′ IVT Express Kit, Affymetrix 901228). GeneChip arrays (Primeview, Affymetrix 901837) were hybridized and scanned according to standard Affymetrix protocols. All samples were processed in technical duplicate. Images were extracted with Affymetrix GeneChip Command Console (AGCC) and analyzed by using GeneChip Expression Console. A Primeview Chip Definition File that included probe information for the ERCC controls, provided by Affymetrix, was used to generate CEL files. We processed the CEL files by using standard tools available within the affy package in R. 
The CEL files were processed with the expresso command to convert the raw probe intensities to probe set expression values. The parameters of the expresso command were set to generate Affymetrix MAS5-normalized probe set values. We used a loess regression to renormalize these MAS5-normalized probe set values by using only the spike-in probe sets to fit the loess. The affy package provides a function, loess.normalize, which will perform loess regression on a matrix of values (defined by using the parameter mat) and allows for the user to specify which subset of data to use when fitting the loess (defined by using the parameter subset, see the affy package documentation for further details). For this application, the parameters mat and subset were set as the MAS5-normalized values and the row indices of the ERCC control probe sets, respectively. The default settings for all other parameters were used. The result of this was a matrix of expression values normalized to the control ERCC probes. The probe set values from the duplicates were averaged together and the log2 fold change from the low-Myc to the high-Myc samples is shown. Using 10 μg of total RNA containing ERCC ExFold Mix #1 RNA spike-in controls (see above), we prepared sequencing libraries according to the following protocol. Polyadenylated RNA was purified by two rounds of selection with Dynabeads mRNA Purification Kit for mRNA Purification from total RNA (Life Technologies, 610-06) following the manufacturer's instructions. This resulting RNA was then further processed for RNA-Seq assays. Briefly, polyadenylated RNA was fragmented with divalent cations under elevated temperature. First strand cDNA synthesis was performed with random hexamers and Superscript III reverse transcriptase (Life Technologies, 18080-051). Second strand cDNA synthesis was performed by using RNase H and DNA Polymerase I. In the second-strand synthesis reaction, dTTP was replaced with dUTP. 
After cDNA synthesis, the double-stranded products were end repaired, a single “A” base was added, and Illumina PE adaptors were ligated onto the cDNA products. The ligation products with an average size of 300 bp were purified by using agarose gel electrophoresis. Following gel purification, the strand of cDNA containing dUTP was selectively destroyed during incubation of purified double-stranded DNA with HK-UNG (Epicenter, HU59100). The adaptor-ligated single-stranded cDNA was then amplified with 15 cycles of PCR and PCR products were purified by using gel electrophoresis. These RNA-Seq libraries were subsequently sequenced on Illumina HiSeq 2000. Sequences were aligned by using Bowtie (version 0.12.2) to build version NCBI36/HG18 of the human genome where the sequences of the ERCC synthetic spike-in RNAs (http://tools.invitrogen.com/downloads/ERCC92.fa) had been added. The RPKM (reads per kilobase of exon per million) was then computed for each gene and synthetic spike-in RNA. We used a loess regression to renormalize the RPKM values by using only the spike-in values to fit the loess. The affy package in R provides a function, loess.normalize, which will perform loess regression on a matrix of values (defined by using the parameter mat) and allows for the user to specify which subset of data to use when fitting the loess (defined by using the parameter subset, see the affy package documentation for further details). For this application the parameters mat and subset were set as a matrix of all RPKM values and the row indices of the ERCC spike-ins, respectively. The default settings for all other parameters were used. The result of this was a matrix of RPKM values normalized to the control ERCC spike-ins. Eighteen thousand five hundred and thirty-six genes with an RPKM value of 1.0 or greater in the low-Myc sample were selected, and the log2 fold ratio between the low-Myc and high-Myc samples was calculated and shown as a heatmap. 
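The RPKM calculation and the selection of expressed genes described above can be sketched as follows. This is a simplified illustration rather than the procedure used in the study: the loess fit performed by loess.normalize is replaced here by a single constant log2 offset estimated from the spike-ins, all numeric values are hypothetical, the function names are invented for the example, and the fold ratio is taken as high-Myc over low-Myc by assumption.

```python
import math


def rpkm(read_count, exon_length_bp, total_mapped_reads):
    '''Reads per kilobase of exon per million mapped reads.'''
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


def spike_offset(sample_spikes, reference_spikes):
    '''Mean log2 difference of the spike-ins between a sample and a reference.
    Simplification: a constant offset stands in for the loess curve.'''
    diffs = [math.log2(s / r) for s, r in zip(sample_spikes, reference_spikes)]
    return sum(diffs) / len(diffs)


def renormalize(values, offset):
    '''Remove the spike-in offset so spike-ins agree between samples.'''
    return [v / 2 ** offset for v in values]


def log2_fold_for_expressed(low, high, min_rpkm=1.0):
    '''log2(high / low) for genes at or above min_rpkm in the low sample.'''
    return [math.log2(h / l) for l, h in zip(low, high) if l >= min_rpkm]


# Hypothetical RPKM values for three genes and two spike-ins per sample.
low_genes = [4.0, 0.5, 16.0]
high_genes = [4.8, 0.7, 12.8]
low_spikes = [8.0, 64.0]
high_spikes = [3.2, 25.6]  # same amount per cell, diluted by extra cellular RNA

offset = spike_offset(high_spikes, low_spikes)
high_renormalized = renormalize(high_genes, offset)
fold = log2_fold_for_expressed(low_genes, high_renormalized)

print('RPKM example:', rpkm(500, 2000, 10_000_000))
print('log2 fold ratios:', [round(v, 3) for v in fold])
```

A constant offset cannot capture intensity-dependent effects, which is what a loess fit is meant to handle; the sketch only shows how anchoring to the spike-ins changes the apparent fold ratios.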
For digital gene expression using NanoString nCounter Gene Expression CodeSets, 1 × 10^6 cells were collected and lysed directly either in 100 μl RLT buffer (QIAGEN, 74104) to yield a concentration of 10,000 cells per μl or in 500 μl lysis buffer with the mirVANA miRNA isolation kit (Ambion, AM1560). Samples were processed according to the cell lysate protocol (nCounter Gene Expression Protocol, NanoString) or the total RNA extraction protocol (Ambion). Four μl of cell lysate (for cell-count normalization) or 100 ng of total RNA (for total RNA normalization) was subsequently incubated overnight at 65°C in nCounter Reporter CodeSet, Capture ProbeSet, and hybridization buffer. Following hybridization, samples were immediately processed with the nCounter PrepStation and subsequently analyzed on an nCounter Digital Analyzer. All samples were processed in biological duplicate. We used two custom nCounter Reporter CodeSets encompassing 429 genes. These codesets encompassed sets of known cancer-related genes (CodeSets CS-1 and CS-2) (Delmore et al., 2011). For each NanoString data set, we used a piecewise linear interpolation of control RNAs (added after hybridization as part of the nCounter PrepStation protocol) to their known concentrations to normalize each data set. Two hundred and sixty-six genes showing expression with a normalized value of 1.0 or greater in both the low-Myc Total-RNA and low-Myc Cell-Count samples were selected, and the log2 fold ratio between the low-Myc and high-Myc samples was calculated and shown as a heatmap. 
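The piecewise linear interpolation of control RNAs to their known concentrations might look like the following sketch. The control counts, concentrations, and units are hypothetical, the function name is invented for the example, and extending the nearest segment for counts outside the control range is an assumption of this sketch, not something stated in the text.

```python
from bisect import bisect_right


def make_interpolator(control_counts, control_concs):
    '''Build a function mapping a raw count to a concentration by linear
    interpolation between neighbouring control RNAs.
    Requires at least two controls with distinct counts.'''
    pairs = sorted(zip(control_counts, control_concs))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]

    def to_concentration(count):
        # Pick the segment containing the count; outside the control range,
        # the nearest segment is extended (an assumption of this sketch).
        i = bisect_right(xs, count) - 1
        i = max(0, min(i, len(xs) - 2))
        x0, x1 = xs[i], xs[i + 1]
        y0, y1 = ys[i], ys[i + 1]
        return y0 + (count - x0) * (y1 - y0) / (x1 - x0)

    return to_concentration


# Hypothetical control counts and their known concentrations.
to_conc = make_interpolator([10.0, 100.0, 1000.0], [0.5, 8.0, 128.0])

for raw in (55.0, 100.0, 550.0):
    print(raw, to_conc(raw))
```

Each data set would get its own interpolator built from its own control counts, so that gene counts from different samples are expressed on the same concentration scale before fold ratios are taken.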
We thank Tom Volkert, Jeong-Ah Kwen, Jennifer Love, and Sumeet Gupta at the Whitehead Genome Technologies Core for Solexa sequencing and microarray processing and Ziv Bar-Joseph for critical comments. This work was supported by National Institutes of Health grants HG002668 (R.A.Y.) and CA146445 (R.A.Y., T.I.L.), an American Cancer Society Postdoctoral Fellowship PF-11-042-01-DMC (P.B.R.) and a Swedish Research Council Postdoctoral Fellowship VR-B0086301 (J.L.). Raw and normalized microarray and RNA-Seq data can be found online associated with the GEO Accession ID GSE40784. Table S1. Gene Expression Levels by Microarray. Table S2. Gene Expression Levels by RNA-Seq. Table S3. Gene Expression Levels by Nanostring" @default.
- W2147109452 created "2016-06-24" @default.
- W2147109452 creator A5013359694 @default.
- W2147109452 creator A5015415643 @default.
- W2147109452 creator A5032502969 @default.
- W2147109452 creator A5034054137 @default.
- W2147109452 creator A5042183595 @default.
- W2147109452 creator A5048799588 @default.
- W2147109452 creator A5057814473 @default.
- W2147109452 creator A5061488860 @default.
- W2147109452 creator A5089343289 @default.
- W2147109452 date "2012-10-01" @default.
- W2147109452 modified "2023-10-14" @default.
- W2147109452 title "Revisiting Global Gene Expression Analysis" @default.
- W2147109452 cites W1582020170 @default.
- W2147109452 cites W1651215666 @default.
- W2147109452 cites W1964461072 @default.
- W2147109452 cites W1973614663 @default.
- W2147109452 cites W1981509058 @default.
- W2147109452 cites W1988248632 @default.
- W2147109452 cites W1988339935 @default.
- W2147109452 cites W1989076816 @default.
- W2147109452 cites W1995830102 @default.
- W2147109452 cites W2000771269 @default.
- W2147109452 cites W2005555854 @default.
- W2147109452 cites W2009458183 @default.
- W2147109452 cites W2020541351 @default.
- W2147109452 cites W2027600997 @default.
- W2147109452 cites W2038437697 @default.
- W2147109452 cites W2060842569 @default.
- W2147109452 cites W2064208261 @default.
- W2147109452 cites W2082080058 @default.
- W2147109452 cites W2097413644 @default.
- W2147109452 cites W2109363337 @default.
- W2147109452 cites W2113962581 @default.
- W2147109452 cites W2118028844 @default.
- W2147109452 cites W2120572492 @default.
- W2147109452 cites W2120865735 @default.
- W2147109452 cites W2122598723 @default.
- W2147109452 cites W2128985829 @default.
- W2147109452 cites W2133111499 @default.
- W2147109452 cites W2137476312 @default.
- W2147109452 cites W2138218344 @default.
- W2147109452 cites W2144738674 @default.
- W2147109452 cites W2147246240 @default.
- W2147109452 cites W2148101811 @default.
- W2147109452 cites W2154431984 @default.
- W2147109452 cites W2156246479 @default.
- W2147109452 cites W2159941442 @default.
- W2147109452 cites W2160450758 @default.
- W2147109452 cites W2161577298 @default.
- W2147109452 cites W2165366028 @default.
- W2147109452 cites W2166422007 @default.
- W2147109452 cites W2167869412 @default.
- W2147109452 cites W2170989872 @default.
- W2147109452 cites W2262414037 @default.
- W2147109452 doi "https://doi.org/10.1016/j.cell.2012.10.012" @default.
- W2147109452 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3505597" @default.
- W2147109452 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/23101621" @default.
- W2147109452 hasPublicationYear "2012" @default.
- W2147109452 type Work @default.
- W2147109452 sameAs 2147109452 @default.
- W2147109452 citedByCount "507" @default.
- W2147109452 countsByYear W21471094522012 @default.
- W2147109452 countsByYear W21471094522013 @default.
- W2147109452 countsByYear W21471094522014 @default.
- W2147109452 countsByYear W21471094522015 @default.
- W2147109452 countsByYear W21471094522016 @default.
- W2147109452 countsByYear W21471094522017 @default.
- W2147109452 countsByYear W21471094522018 @default.
- W2147109452 countsByYear W21471094522019 @default.
- W2147109452 countsByYear W21471094522020 @default.
- W2147109452 countsByYear W21471094522021 @default.
- W2147109452 countsByYear W21471094522022 @default.
- W2147109452 countsByYear W21471094522023 @default.
- W2147109452 crossrefType "journal-article" @default.
- W2147109452 hasAuthorship W2147109452A5013359694 @default.
- W2147109452 hasAuthorship W2147109452A5015415643 @default.
- W2147109452 hasAuthorship W2147109452A5032502969 @default.
- W2147109452 hasAuthorship W2147109452A5034054137 @default.
- W2147109452 hasAuthorship W2147109452A5042183595 @default.
- W2147109452 hasAuthorship W2147109452A5048799588 @default.
- W2147109452 hasAuthorship W2147109452A5057814473 @default.
- W2147109452 hasAuthorship W2147109452A5061488860 @default.
- W2147109452 hasAuthorship W2147109452A5089343289 @default.
- W2147109452 hasBestOaLocation W21471094521 @default.
- W2147109452 hasConcept C104317684 @default.
- W2147109452 hasConcept C150194340 @default.
- W2147109452 hasConcept C18431079 @default.
- W2147109452 hasConcept C199360897 @default.
- W2147109452 hasConcept C41008148 @default.
- W2147109452 hasConcept C54355233 @default.
- W2147109452 hasConcept C70721500 @default.
- W2147109452 hasConcept C78458016 @default.
- W2147109452 hasConcept C86803240 @default.
- W2147109452 hasConcept C90559484 @default.
- W2147109452 hasConceptScore W2147109452C104317684 @default.
- W2147109452 hasConceptScore W2147109452C150194340 @default.