Matches in SemOpenAlex for { <https://semopenalex.org/work/W2023172335> ?p ?o ?g. }
- W2023172335 endingPage "274" @default.
- W2023172335 startingPage "265" @default.
- W2023172335 abstract "Signals of archaic admixture have been identified through comparisons of the draft Neanderthal and Denisova genomes with those of living humans. Studies of individual loci contributing to these genome-wide average signals are required for characterization of the introgression process and investigation of whether archaic variants conferred an adaptive advantage to the ancestors of contemporary human populations. However, no definitive case of adaptive introgression has yet been described. Here we provide a DNA sequence analysis of the innate immune gene STAT2 and show that a haplotype carried by many Eurasians (but not sub-Saharan Africans) has a sequence that closely matches that of the Neanderthal STAT2. This haplotype, referred to as N, was discovered through a resequencing survey of the entire coding region of STAT2 in a global sample of 90 individuals. Analyses of publicly available complete genome sequence data show that haplotype N shares a recent common ancestor with the Neanderthal sequence (∼80 thousand years ago) and is found throughout Eurasia at an average frequency of ∼5%. Interestingly, N is found in Melanesian populations at ∼10-fold higher frequency (∼54%) than in Eurasian populations. A neutrality test that controls for demography rejects the hypothesis that a variant of N rose to high frequency in Melanesia by genetic drift alone. Although we are not able to pinpoint the precise target of positive selection, we identify nonsynonymous mutations in ERBB3, ESYT1, and STAT2, all of which are part of the same 250 kb introgressive haplotype, as good candidates.
Comparisons of the Neanderthal and Denisova genomes with those of present-day humans support the hypothesis of hybridization between these ancient Pleistocene populations and the ancestors of anatomically modern humans (AMH) in Eurasia [1: Green R.E. Krause J. Briggs A.W. Maricic T. Stenzel U. Kircher M. Patterson N. Li H. Zhai W. Fritz M.H. et al. A draft sequence of the Neandertal genome. Science. 2010; 328: 710-722] [2: Reich D. Green R.E. Kircher M. Krause J. Patterson N. Durand E.Y. Viola B. Briggs A.W. Stenzel U. Johnson P.L. et al. Genetic history of an archaic hominin group from Denisova Cave in Siberia. Nature. 2010; 468: 1053-1060]. With the growing acceptance of gene flow between archaic humans and AMH, we can now begin to investigate the role that natural selection might have played in influencing the introgression process after hybridization. To do this, we must move beyond estimates of the average extent of archaic ancestry across the genome to studies that (1) identify specific genomic regions that have introgressed, (2) determine the extent of the chromosomal region affected by introgression, and (3) measure the frequency of introgressive alleles in human populations. Neutrally evolving introgressive alleles are only expected to be found sporadically among human populations given the likely loss of many of these variants through genetic drift. On the other hand, archaic alleles that confer a selective advantage after introgressing may consistently reach higher frequencies even in the case of low levels of archaic admixture [3: Evans P.D. Mekel-Bobrov N. Vallender E.J. Hudson R.R. Lahn B.T. Evidence that the adaptive allele of the brain size gene microcephalin introgressed into Homo sapiens from an archaic Homo lineage. Proc. Natl. Acad. Sci. USA. 2006; 103: 18178-18183] [4: Hawks J. Cochran G. Harpending H.C. Lahn B.T. A genetic legacy from archaic Homo. Trends Genet. 2008; 24: 19-23].
Thus far, only a handful of loci have been hypothesized to have entered the human gene pool through archaic admixture and positive selection, including MAPT (MIM 157140) [5: Hardy J. Pittman A. Myers A. Gwinn-Hardy K. Fung H.C. de Silva R. Hutton M. Duckworth J. Evidence suggesting that Homo neanderthalensis contributed the H2 MAPT haplotype to Homo sapiens. Biochem. Soc. Trans. 2005; 33: 582-585], MCPH1 (MIM 607117) [3], and particular alleles at the HLA locus (MIM 142800, 142830, 142840) [6: Abi-Rached L. Jobin M.J. Kulkarni S. McWhinnie A. Dalva K. Gragert L. Babrzadeh F. Gharizadeh B. Luo M. Plummer F.A. et al. The shaping of modern human immune systems by multiregional admixture with archaic humans. Science. 2011; 334: 89-94]. However, analysis of the Neanderthal genome failed to provide evidence of introgressive alleles at the former two loci [1]. Because of its role in fighting pathogens, HLA presents an instance where it is relatively easy to conceive of an a priori reason that acquisition of an archaic Eurasian HLA allele would benefit human ancestors, especially as they expanded into new habitats [7: Reed D.L. Smith V.S. Hammond S.L. Rogers A.R. Clayton D.H. Genetic analysis of lice supports direct contact between modern and archaic humans. PLoS Biol. 2004; 2: e340]. However, the fact that HLA haplotypes are known to exhibit transspecific polymorphism and show evidence of strong balancing selection [8: Takahata N. Nei M. Allelic genealogy under overdominant and frequency-dependent selection and polymorphism of major histocompatibility complex loci. Genetics. 1990; 124: 967-978] [9: Thomson G. HLA population genetics. Baillieres Clin. Endocrinol. Metab. 1991; 5: 247-260] increases the probability that similarities between modern and archaic haplotypes are due to ancestral shared polymorphism (i.e., as opposed to archaic admixture). In addition, the SNPs tagging the main HLA haplotype that was said to have introgressed were not observed in the Denisova or Neanderthal draft genomes.
Here we present evidence that STAT2 (MIM 600556), a gene also having an important role in immunity, introgressed from Neanderthals. Located on chromosome 12, STAT2 encodes STAT2 (accession number AAA98760.1), which plays an important role in interferon signaling pathways. Because of its key role in interferon-mediated responses [10: Reich N.C. STAT dynamics. Cytokine Growth Factor Rev. 2007; 18: 511-518] and potential associations with autoimmune disorders [11: Li Y. Begovich A.B. Unraveling the genetics of complex diseases: Susceptibility genes for rheumatoid arthritis and psoriasis. Semin. Immunol. 2009; 21: 318-327], we considered STAT2 a candidate for local adaptation in humans. Initially, we resequenced ∼8.6 kb of STAT2, including all coding exons, in six Old World populations (Biaka, Mandenka, San, Han Chinese, French Basque, and Papua New Guineans) and observed the presence of a haplotype (N) that is restricted to non-African populations and has a relatively deep branching. This haplotype shares derived SNPs with Neanderthals, produces extended linkage disequilibrium (LD) in non-Africans, and shows recent common ancestry with the Neanderthal sequence. Surprisingly, haplotype N is found at 10-fold higher frequency in Papua New Guinea, making it a candidate for positive selection in Melanesians.
Four panels of samples were used in this study. 
The first panel (resequencing panel) consisted of 90 humans from three sub-Saharan African and three non-African populations (16 Mandenka from Senegal, 16 Biaka Pygmy from the Central African Republic, 10 San from Namibia, 16 French Basque, 16 Chinese Han, and 16 Papua New Guineans), as well as a common chimpanzee and a bonobo. These samples had been used previously in a study of neutral genetic variation in humans, which included 61 noncoding loci [12: Hammer M.F. Woerner A.E. Mendez F.L. Watkins J.C. Cox M.P. Wall J.D. The ratio of human X chromosome to autosome diversity is positively correlated with genetic distance from genes. Nat. Genet. 2010; 42: 830-831] [13: Wall J.D. Cox M.P. Mendez F.L. Woerner A. Severson T. Hammer M.F. A novel DNA sequence database for analyzing human demographic history. Genome Res. 2008; 18: 1354-1361]. The second panel (genotyping panel) consisted of 75 Melanesians and was genotyped for SNPs diagnostic for haplotypes in the N and D clades (see below). The third panel (public SNP panel) consisted of samples genotyped in published studies; these included the Human Genome Diversity Project (HGDP) subset that was genotyped on the Illumina 650Y array [14: López Herráez D. Bauchet M. Tang K. Theunert C. Pugach I. Li J. Nandineni M.R. Gross A. Scholz M. Stoneking M. Genetic variation and recent positive selection in worldwide human populations: evidence from nearly 1 million SNPs. PLoS ONE. 2009; 4: e7888] [15: Pickrell J.K. Coop G. Novembre J. Kudaravalli S. Li J.Z. Absher D. Srinivasan B.S. Barsh G.S. Myers R.M. Feldman M.W. Pritchard J.K. Signals of recent positive selection in a worldwide sample of human populations. Genome Res. 2009; 19: 826-837], ten European populations used in a study of the geographic structure of genetic variation in Europe [16: Lao O. Lu T.T. Nothnagel M. Junge O. Freitag-Wolf S. Caliebe A. Balascakova M. Bertranpetit J. Bindoff L.A. Comas D. et al. Correlation between genetic and geographic structure in Europe. Curr. Biol. 2008; 18: 1241-1248], six HapMap populations, and 24 other populations [17: Xing J. Watkins W.S. Shlien A. Walker E. Huff C.D. Witherspoon D.J. Zhang Y. Simonson T.S. Weiss R.B. Schiffman J.D. et al. Toward a more uniform sampling of human genetic diversity: A survey of worldwide populations by high-density genotyping. Genomics. 2010; 96: 199-210]. The fourth panel consists of publicly available whole-genome sequences (public WGS panel), including 1 Japanese (NA18956) and 1 Luhya (NA19026) sequenced by Complete Genomics, 1 San (KB1) [18: Schuster S.C. Miller W. Ratan A. Tomsho L.P. Giardine B. Kasson L.R. Harris R.S. Petersen D.C. Zhao F. Qi J. et al. Complete Khoisan and Bantu genomes from southern Africa. Nature. 2010; 463: 943-947], and 1 Papuan (HGDP00542) [1], as well as the Neanderthal and Denisova draft genomes. All sampling procedures were approved by the University of Arizona Human Subjects Committee.
In what follows all positions refer to chromosome 12 and the 2006 build of the human genome (hg18). The resequencing panel was amplified by PCR and sequenced for ∼8.6 kb of STAT2 in six segments spanning bases 55,021,597–55,040,412 (Figure 1; see also Table S1A and, for primer sequences, Table S1B in the Supplemental Data available with this article online). Chromatograms were analyzed with Phred/Phrap/Consed/Polyphred and finished manually [13]. The ancestral state was inferred from chimpanzee and bonobo sequences. Samples in the genotyping panel were sequenced at positions 55,030,502, 55,030,689, and 55,030,712 (Table S2).
Watterson’s θW, nucleotide diversity π, and Tajima’s D [19: Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989; 123: 585-595] were computed from the resequenced data. DnaSP [20: Librado P. Rozas J. DnaSP v5: A software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009; 25: 1451-1452] was used for estimating the parameters and their sample standard deviations. Haplotypes were phased manually after alleles that occurred in fewer than three chromosomes were removed. Cladograms were constructed from the haplotypes in Table 1, both manually and through the use of PAUP [21: Wilgenbusch, J.C., and Swofford, D. (2003). Inferring evolutionary trees with PAUP∗. In Current Protocols in Bioinformatics, Chapter 6, Unit 6.4] after the removal of two haplotypes showing evidence of recombination. 
A fully resolved bifurcating tree was obtained after our data set was augmented with the publicly available genome sequences of individuals NA19026 and KB1 and after two additional nucleotide sites outside of the resequenced region were taken into consideration (Table S3).
Table 1. Polymorphism Table for Six Segments Covering All Exons of STAT2 (sites with <3 chromosomes are excluded; 55,000,000 has been subtracted from the original genomic positions)
Genomic positions: 22114 23522 23733 24240 26128 26284 26647 26949 26967 29264 29634 29965 30358 30502 30543 30689 30712 35761 35762 35859 36471 40089 40404
Rows: haplotype, alleles at the positions above, chromosome counts (Africans: BIA MAN SAN; Non-Africans: PNG HAN BAS)
GCGGCCAGACGCTCTAAGGTATACCT
Neanderthal ∗...–∗.....∗∗..AC...T∗.
Denisova ..∗...∗......T.....∗...
N ....–—.........AC...TA. 193
D .............T......... 3
S .T....T...........CG... 6
Mb-1 ...A...C..T...C........ 98103122
Mb-2 ..AA...C..T...C........ 4
Mb-3 ...A...C..T.C.C........ 3
Ma-1 .......C............... 22
Ma-2 .......C...C........... 2
Ma-3 .......C...C..........G 16105
Ma-4 A......C...C..........G 22
Ma-5 .......CT..C..........G 3
Ma-6 .......C.C.C..........G 172
Ma-7 .......C...C.....G....G 15
Rec N-Mb ...A...C.......AC...TA. 1
Rec S-M .T....T................ 1
An asterisk indicates low coverage. The following abbreviations are used: BIA, Biaka; MAN, Mandinka; SAN, San; PNG, Papua New Guinea; HAN, Han Chinese; and BAS, French Basque.
Phased haplotypes from HapMap phase III were downloaded and analyzed with the program HaploBlockFinder v. 0.7 [22: Zhang K. Jin L. HaploBlockFinder: Haplotype block analyses. Bioinformatics. 2003; 19: 1300-1301]. For LD analysis, SNPs with minor allele frequency greater than 0.02 were used (i.e., singletons were removed). 
Pairwise LD in each population was plotted with the scripts accompanying the program. The probability that the haplotype N of length r (in Morgans) persisted in a panmictic population for t generations was estimated under the assumptions that generation time was 25 years and that the decay of a haplotype by recombination follows an exponential distribution with parameter r. Because of its high precision, we chose the genetic map of Hinch et al. [23: Hinch A.G. Tandon A. Patterson N. Song Y. Rohland N. Palmer C.D. Chen G.K. Wang K. Buxbaum S.G. Akylbekova E.L. et al. The landscape of recombination in African Americans. Nature. 2011; 476: 170-175] to determine r. Given the absence of recombinational hotspots in the analyzed region, the variance in r between populations is expected to be small.
We used one of two different methods to calculate the divergence time between a pair of hominin lineages, depending on sequence coverage. For sequences with complete coverage, the number of mutations separating the sequences was assumed to be a sample from a Poisson distribution. The corresponding mutation rate was calculated with 6 million years (My) as a divergence time for the human and chimpanzee reference sequences (Figure S1). For an individual, the number of mutations separating the sequences of the two chromosomes is the number of heterozygous sites. For comparisons between NA18956 (from the WGS panel) and the Neanderthal sequence, which has incomplete coverage, the mutations derived in NA18956 since the common ancestor with chimpanzee were checked against the Neanderthal sequence (Figure S1). The mutations with sequence coverage were classified as predating or postdating the split between NA18956 and Neanderthal lineages. We used methods based on the distribution of presplit and postsplit mutations [24: Karafet T.M. Mendez F.L. Meilerman M.B. Underhill P.A. Zegura S.L. Hammer M.F. New binary polymorphisms reshape and increase resolution of the human Y chromosomal haplogroup tree. Genome Res. 2008; 18: 830-838] [25: Mendez F.L. Karafet T.M. Krahn T. Ostrer H. Soodyall H. Hammer M.F. Increased resolution of Y chromosome haplogroup T defines relationships among populations of the Near East, Europe, and Africa. Hum. Biol. 2011; 83: 39-53] to estimate the fraction of the interval that postdates the split. A joint likelihood was then obtained for the divergence times of the sequence of NA18956 from those of Neanderthals and the human reference (Appendix A), which was then used to obtain point estimates and confidence intervals.
The allele frequency of the N lineage was estimated via diagnostic SNPs (Table S2), and its geographic distribution was plotted with Generic Mapping Tools [26: Wessel P. Smith W.H.F. New, improved version of generic mapping tools released. Eos Transactions, American Geophysical Union. 1998; 79: 579]. To test for an unusually high frequency of haplotype N we (1) generated an empirical distribution of derived-allele frequencies for each of two Melanesian HGDP samples (Papuans and Nasioi), (2) compared the frequency of the SNP rs7962107 (diagnostic of the long variant of the N lineage) to this distribution, and (3) applied a one-tailed test for elevated frequency of the derived allele. We built the empirical distribution by using SNPs in the Illumina 650Y array that were genotyped in the HGDP panel [15]. We filtered SNPs by requiring that the derived-allele frequency and variance among East Asian populations be within 30% of the values corresponding to rs7962107. The frequencies in Melanesian populations of the 6,213 SNPs that passed the filter were used for generating the empirical distributions of derived-allele frequencies in Papuans and in Nasioi samples. We note that the test yielded similar results when these values were between 20% and 30% (see Results); however, the value of 30% is reported below as it both increases the robustness of the test (i.e., by including a larger number of SNPs) and at the same time makes the test more conservative (i.e., by including more SNPs that are at higher frequency in East Asia than in the N lineage).
Figure 2 shows a cladogram of 13 haplotypes observed in the 90 humans included in the resequencing panel (Table 1 and Table S4). Publicly available sequence data (public WGS panel) from regions surrounding the 8.6 kb STAT2 helped to resolve the phylogeny into a fully binary tree (boxes in Figure 2). The cladogram contains four clades, labeled S, D, N, and M (standing for San, Denisova, Neanderthal, and modern, respectively). S is observed only in the San, with a frequency of 35% (including a recombinant haplotype) (Table 1). Clade D, containing a single rare haplotype, is restricted to our Papuan sample, where it is found at a frequency of 9%. Clade N is present at high frequency in Papuans (59%) and at lower frequency in the Basque (9%). The remaining chromosomes fall into two major subclades, labeled Ma and Mb (Figure 2). The Ma subclade is restricted to sub-Saharan Africans, where it ranges in frequency from 65% to 75%, whereas the Mb subclade is most common in our worldwide sample (48%). Haplotypes in clade Mb predominate in non-Africans, especially the Han and French Basque, where they are found at frequencies of 97% and 91%, respectively. 
In the resequencing panel, levels of polymorphism within the 8,606 bp of sequence generated within and around STAT2 (∼0.03%–0.04% per base, Table 2) are lower than the genome average (∼0.1% per base) [13]. This result holds when the analysis is restricted to the noncoding sequences of STAT2 (Table 2 and Figure 1C). Notably, although they are still lower than the genome average, values of STAT2 nucleotide diversity are highest in the San (θ = 0.043 ± 0.018 and π = 0.049) and in Papuans (θ = 0.040 ± 0.016 and π = 0.050). Additionally, these two population samples exhibit the highest Tajima’s D values (0.55 and 0.79, respectively) (Table 2). In a comparison with 61 noncoding loci sequenced in the same populations [12], the STAT2 locus shows reduced polymorphism in all three sub-Saharan African samples, as well as in the Han Chinese, but not in our samples of French Basque or Papua New Guineans (Figure S2).
Table 2. Nucleotide Diversity at STAT2
Population | n | All amplicons (8,606 bp): S | θ (%) | π (%) | TD | Noncoding (6,027 bp): S | θ (%) | π (%) | TD | θ/D | π/D
Biaka | 32 | 10 | 0.029 | 0.030 | 0.16 | 7 | 0.029 | 0.039 | 0.95 | 0.029 | 0.038
Mandenka | 32 | 13 | 0.038 | 0.034 | −0.30 | 11 | 0.046 | 0.039 | −0.49 | 0.045 | 0.038
San | 20 | 13 | 0.043 | 0.049 | 0.55 | 8 | 0.038 | 0.036 | −0.15 | 0.038 | 0.035
Papuans | 32 | 14 | 0.040 | 0.050 | 0.79 | 12 | 0.050 | 0.064 | 0.89 | 0.049 | 0.063
Han | 32 | 10 | 0.029 | 0.007 | −2.34 | 10 | 0.042 | 0.011 | −2.34 | 0.041 | 0.010
Basque | 32 | 14 | 0.040 | 0.024 | −1.35 | 13 | 0.054 | 0.032 | −1.36 | 0.053 | 0.031
n, number of chromosomes in the sample; TD, Tajima’s D; D, divergence calculated between human and chimpanzee reference sequences.
The tree-like structure within the 8.6 kb of sequence data analyzed in Figure 2 is consistent with strong LD in the vicinity of STAT2. To explore how far LD extends along the chromosome and to assess whether observed LD in the region of STAT2 could be the consequence of a recent bottleneck in non-Africans, we performed a haploblock analysis of SNP-based haplotypes present in ten populations of HapMap phase III. Some non-African chromosomes are characterized by an approximately 260 kb haploblock, whereas others contain a shorter 130 kb haploblock that is nested within the longer block (Figure 1). Neither version of this haploblock is present in the African HapMap data, where the average length of LD is much shorter (Figure S3). The short and long haploblocks match haplotypes that are members of the N clade. Thus, we refer to these haplotypes as short and long variants of N. 
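The haplotype-persistence bound described in Subjects and Methods (exponential decay by recombination with parameter r, 25-year generations) can be sketched as follows. This is only an illustrative reading: it assumes the 95% upper bound is the time t at which the survival probability exp(-r t) falls to 0.05, and the function name and parameters are hypothetical. The inputs are the genetic lengths the article reports for the short and long variants of N (∼0.032 cM and ∼0.081 cM).

```python
import math

def persistence_upper_bound_years(genetic_length_cm, tail_prob=0.05,
                                  generation_years=25.0):
    """Time (in years) after which an unbroken haplotype of the given
    genetic length survives with probability tail_prob, assuming the
    survival probability after t generations is exp(-r * t).
    This interpretation of the bound is an assumption, not the
    authors' stated code."""
    r = genetic_length_cm / 100.0           # centimorgans -> Morgans
    generations = -math.log(tail_prob) / r  # solve exp(-r * t) = tail_prob
    return generations * generation_years

short_kya = persistence_upper_bound_years(0.032) / 1000.0
long_kya = persistence_upper_bound_years(0.081) / 1000.0
print(f"short variant: ~{short_kya:.0f} kya, long variant: ~{long_kya:.0f} kya")
```

Under these assumptions the sketch gives values close to the ∼235 kya and 92 kya bounds quoted in the article; small differences would be expected from rounding of the genetic lengths.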
The short variant is present in all non-African populations, whereas the long variant is found only in East Asians, especially Japanese (Figure S3). The individual NA18956, whose sequence is used in a more detailed analysis below, is heterozygous for the short and long variants of the N lineage. This sample was chosen because (1) it was sequenced to high coverage (e.g., > 40×), and (2) it is homozygous for the N lineage at STAT2. The maximal genetic distance between markers at the ends of the short variant is ∼0.032 cM. The 95% upper bound for the time of maintenance of this haplotype is estimated at ∼235 kya. Analogously, for the long variant, the genetic length is ∼0.081 cM, resulting in a 95% upper bound of 92 kya. For all positions within the 130 kb short block with coverage in the Neanderthal draft genome sequence (Figure 1), we compared the sequences of NA18956, the human reference, the chimpanzee reference, and Neanderthal. Wherever the human reference has the derived state at a given site, the Neanderthal sequence shares the ancestral state with NA18956. The Neanderthal sequence matches NA18956 at 32 out of the 36 positions at which NA18956 is homozygous derived. To assess whether the 130 kb that are unique to the long block also match Neanderthal sequence (i.e., from 54,770,000 to 54,913,000), we compared the sequences of NA18956, the human reference, a Papuan individual homozygous for the N lineage (HGDP00542), and chimpanzee references as an outgroup (i.e., to infer ancestral state). We chose to analyze variants in NA18956 because it has the highest sequence coverage among individuals carrying the N/Neanderthal lineage. We considered only sites at which the best alignment quality of a Neanderthal read was 60 or more. In 86% of the cases (18/21) where a variant in NA18956 was ancestral (i.e., where it differed from the human reference and was shared with the chimpanzee reference), we found the Neanderthal variant to be ancestral. 
Analogously, for 86% (19/22) of the sites at which NA18956 and HGDP00542 shared the derived allele, we found the Neanderthal variant to be derived (Table S5). The overall pattern of similarity between the N lineage and the Neanderthal lineage over the long block suggests that the entire 260 kb introgressed from the Neanderthal lineage. Finally, Denisova and Neanderthal sequences agree at eight of the 20 sites at which both Neanderthal and Denisova have sequence coverage and Neanderthal sequence is derived.
We used variant sites between positions 54,913,000 and 55,040,500 in the public WGS panel to estimate divergence time between the Neanderthal and N clade lineages. We used a maximum-likelihood approach to estimate the times of divergence both between the Neanderthal and NA18956 sequences and between each of these sequences and the human reference (Figure 3). The Neanderthal-N lineage divergence time is necessarily more recent because the sequences of NA18956 and Neanderthal share several derived mutations. If we assume a divergence time for human and chimpanzee sequences of 6 Mya, the estimated times of sequence divergence for the reference-Neanderthal comparison and the NA18956-Neanderthal comparison are 609 kya (501–731 kya, 95% CI) and 78 kya (25–159 kya, 95% CI), respectively. The sequences of the short and long variants observed in NA18956 diverged ∼22 kya (6–56 kya, 95% CI) (Table 3).
Table 3. Times of Divergence between Pairs of Haplotype Lineages
Haplotype Lineage 1 (Individual) | Haplotype Lineage 2 (Individual) | Divergence Time in kya (95% CI)
N (NA18956) | Mb (reference) | 609 (501–731)
N (NA18956) | Neanderthal | 78 (25–159)
N-short (NA18956) | N-long (NA18956) | 22 (6–56)
We used data from the genotyping and public SNP panels (see Subjects and Methods) to investigate the global distribution of N haplotypes. 
Although N lineages are broadly distributed across non-African populations and distributed in North African Mozabites at an average frequency of 5% (Figure 4 and Table S6), they are 10 times more frequent in Melanesian populations (∼54%). To determine the relative prevalence of the short and long variants of the N lineage, we examined the subset of populations (i.e., 30 populations from HGDP) with sufficient genotyping information to distinguish them. Table S6 shows that the long variant is present in East Asian (14/748 chromosomes) and Oceanian (22/56 chromosomes) populations (but it ne" @default.
- W2023172335 created "2016-06-24" @default.
- W2023172335 creator A5006007183 @default.
- W2023172335 creator A5076190364 @default.
- W2023172335 creator A5091321729 @default.
- W2023172335 date "2012-08-01" @default.
- W2023172335 modified "2023-10-16" @default.
- W2023172335 title "A Haplotype at STAT2 Introgressed from Neanderthals and Serves as a Candidate of Positive Selection in Papua New Guinea" @default.
- W2023172335 cites W1929921649 @default.
- W2023172335 cites W1977286650 @default.
- W2023172335 cites W1984358340 @default.
- W2023172335 cites W1988060825 @default.
- W2023172335 cites W1993481578 @default.
- W2023172335 cites W1993804787 @default.
- W2023172335 cites W1994263888 @default.
- W2023172335 cites W1998348937 @default.
- W2023172335 cites W2008712670 @default.
- W2023172335 cites W2018314772 @default.
- W2023172335 cites W2035027593 @default.
- W2023172335 cites W2039406200 @default.
- W2023172335 cites W2040897787 @default.
- W2023172335 cites W2066781376 @default.
- W2023172335 cites W2091151261 @default.
- W2023172335 cites W2098460741 @default.
- W2023172335 cites W2103225712 @default.
- W2023172335 cites W2113104073 @default.
- W2023172335 cites W2113439351 @default.
- W2023172335 cites W2113482718 @default.
- W2023172335 cites W2113708808 @default.
- W2023172335 cites W2120095382 @default.
- W2023172335 cites W2124362779 @default.
- W2023172335 cites W2129274466 @default.
- W2023172335 cites W2139265150 @default.
- W2023172335 cites W2140784970 @default.
- W2023172335 cites W2142215232 @default.
- W2023172335 cites W2142851407 @default.
- W2023172335 cites W2143183225 @default.
- W2023172335 cites W2144162338 @default.
- W2023172335 cites W2150906610 @default.
- W2023172335 cites W2152405312 @default.
- W2023172335 cites W2154845780 @default.
- W2023172335 cites W2165571729 @default.
- W2023172335 cites W2168766899 @default.
- W2023172335 cites W4235530169 @default.
- W2023172335 doi "https://doi.org/10.1016/j.ajhg.2012.06.015" @default.
- W2023172335 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3415544" @default.
- W2023172335 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/22883142" @default.
- W2023172335 hasPublicationYear "2012" @default.
- W2023172335 type Work @default.
- W2023172335 sameAs 2023172335 @default.
- W2023172335 citedByCount "148" @default.
- W2023172335 countsByYear W20231723352013 @default.
- W2023172335 countsByYear W20231723352014 @default.
- W2023172335 countsByYear W20231723352015 @default.
- W2023172335 countsByYear W20231723352016 @default.
- W2023172335 countsByYear W20231723352017 @default.
- W2023172335 countsByYear W20231723352018 @default.
- W2023172335 countsByYear W20231723352019 @default.
- W2023172335 countsByYear W20231723352020 @default.
- W2023172335 countsByYear W20231723352021 @default.
- W2023172335 countsByYear W20231723352022 @default.
- W2023172335 countsByYear W20231723352023 @default.
- W2023172335 crossrefType "journal-article" @default.
- W2023172335 hasAuthorship W2023172335A5006007183 @default.
- W2023172335 hasAuthorship W2023172335A5076190364 @default.
- W2023172335 hasAuthorship W2023172335A5091321729 @default.
- W2023172335 hasBestOaLocation W20231723351 @default.
- W2023172335 hasConcept C104317684 @default.
- W2023172335 hasConcept C135763542 @default.
- W2023172335 hasConcept C154945302 @default.
- W2023172335 hasConcept C197754878 @default.
- W2023172335 hasConcept C2549261 @default.
- W2023172335 hasConcept C3017739461 @default.
- W2023172335 hasConcept C41008148 @default.
- W2023172335 hasConcept C54355233 @default.
- W2023172335 hasConcept C78458016 @default.
- W2023172335 hasConcept C81917197 @default.
- W2023172335 hasConcept C86803240 @default.
- W2023172335 hasConcept C95457728 @default.
- W2023172335 hasConceptScore W2023172335C104317684 @default.
- W2023172335 hasConceptScore W2023172335C135763542 @default.
- W2023172335 hasConceptScore W2023172335C154945302 @default.
- W2023172335 hasConceptScore W2023172335C197754878 @default.
- W2023172335 hasConceptScore W2023172335C2549261 @default.
- W2023172335 hasConceptScore W2023172335C3017739461 @default.
- W2023172335 hasConceptScore W2023172335C41008148 @default.
- W2023172335 hasConceptScore W2023172335C54355233 @default.
- W2023172335 hasConceptScore W2023172335C78458016 @default.
- W2023172335 hasConceptScore W2023172335C81917197 @default.
- W2023172335 hasConceptScore W2023172335C86803240 @default.
- W2023172335 hasConceptScore W2023172335C95457728 @default.
- W2023172335 hasIssue "2" @default.
- W2023172335 hasLocation W20231723351 @default.
- W2023172335 hasLocation W20231723352 @default.
- W2023172335 hasLocation W20231723353 @default.
- W2023172335 hasLocation W20231723354 @default.
- W2023172335 hasOpenAccess W2023172335 @default.
- W2023172335 hasPrimaryLocation W20231723351 @default.