Matches in SemOpenAlex for { <https://semopenalex.org/work/W3120727991> ?p ?o ?g. }
- W3120727991 endingPage "266.e8" @default.
- W3120727991 startingPage "250" @default.
- W3120727991 abstract "• 107 microalgae from diverse phyla and environments were cultured and sequenced • Membrane-related protein families were significantly enriched in saltwater species • Nuclear-related protein families were significantly enriched in freshwater species • More than 90,000 viral-origin sequences in 184 algal genomes were discovered. Microalgae are integral primary producers in diverse ecosystems, and their genomes could be mined for ecological insights, but representative genome sequences are lacking for many phyla. We cultured and sequenced 107 microalgae species from 11 different phyla indigenous to varied geographies and climates. This collection was used to resolve genomic differences between saltwater and freshwater microalgae. Freshwater species showed domain-centric ontology enrichment for nuclear and nuclear membrane functions, while saltwater species were enriched in organellar and cellular membrane functions. Further, marine species contained significantly more viral families in their genomes (p = 8e–4). Sequences from Chlorovirus, Coccolithovirus, Pandoravirus, Marseillevirus, Tupanvirus, and other viruses were found integrated into the genomes of algae from marine environments. These viral-origin sequences were found to be expressed and to code for a wide variety of functions. Together, this study comprehensively defines the expanse of protein-coding and viral elements in microalgal genomes and posits a unified adaptive strategy for algal halotolerance. 
Viruses of microbial eukaryotes have extreme diversity and considerable influence in local (Correa et al., 2016) and global (Gregory et al., 2019) ecosystems. Nearly 200,000 marine viral populations have been found to be active and to segregate through five distinct zones throughout global oceans (Gregory et al., 2019). 
Viral infection of microalgae can influence oceanic aerosol release, including dimethyl sulfide (DMS) (Trainic et al., 2018) and atmospheric gas composition (Bonetti et al., 2019), via rapid destruction of microalgal populations (Sorensen et al., 2009). Viral-mediated cell death in coccolithophore microalgae leads to massive carbonate depositions from skeleton sedimentation, which significantly contributes to CO2 sequestration (Jover et al., 2014; Nilsson, 2019; Ruiz et al., 2017; Villain et al., 2016). 
Finally, life support roles of microalgae, including sustaining coral reef species (Correa et al., 2016) and producing atmospheric oxygen, are modulated and occasionally threatened by their viral predators. Despite these roles, pan-genomic-level effects of viruses on microalgae from different lineages and environments are unknown. The information on the presence of viral elements in algae is currently tenuous, particularly regarding their distribution in various habitats, including marine and terrestrial ecosystems. Algae are a polyphyletic group of photosynthetic microorganisms that provide foundational nutrients for every biome on Earth and play essential roles in the exchange of molecules, including O2, CO2, and DMS, between the atmosphere and biosphere. Although microalgae are fundamental to global ecosystems and have the potential for sustainable biotechnological development, they have received far less research attention than other microbes. For example, more than 30,000 bacterial genomes, but only 62 algal genomes, have been sequenced (ncbi.nlm.nih.gov). Genome sequences of most algal species are unknown, with sparse representatives from some clades, such as the Chromerida and Myzozoa, and no representatives from other clades, including Euglenozoa. Algal culture collection centers, including the National Center for Marine Algae and Microbiota (NCMA; ncma.bigelow.org) and the Culture Collection of Algae at the University of Texas at Austin (UTEX; utex.org), among other centers, host thousands of microalgal species from around the world. 
The disparity between the availability of algal species in collections and the understanding of their genomic contents can, presently, be resolved through high-throughput sequencing. Expanded genomic sequence databases would allow for statistical resolution of essential questions of microalgal evolution and niche habitation. The viral contribution to algal genomes has not been studied on a large scale, but evidence suggests that viruses have contributed to their hosts’ adaptation to different environments. Gene shuffling between algal hosts and viruses has led to the emergence of giant viruses that incorporate entire biosynthetic pathways, sourced from their algal hosts, into their enormous genomes (Abrahão et al., 2018; Filée, 2015; Filée and Chandler, 2010; Moniruzzaman et al., 2020; Schulz et al., 2020; Van Etten, 2003; Van Etten and Meints, 1999; Xiao and Rossmann, 2011; Yamada, 2011; Yoosuf et al., 2012). 
As examples, the Klosneuvirus (1.57 Mb), Bodo saltans virus (1.39 Mb), and Megavirus chilensis (1.23 Mb) genomes each encode 1,000–1,300 proteins, many of which are host-derived. When host specificity expands, viral genes can be transferred to distantly related organisms and confer specific evolutionary adaptations, such as the introduction of new metabolic pathways that facilitate the assimilation of fresh nutrients (Piacente et al., 2014) or abiotic stress-resistance genes that promote survival in niche habitats. The host specificity of viruses of microbial eukaryotes displays erratic patterns; algal viruses can cross kingdom boundaries and infect mammals. 
For example, the Chlorella virus ATCV-1 infects human oropharynx tissues and causes reduced mental performance in mice (Yolken et al., 2014). A comprehensive survey of viral elements in algal genomes would provide foundational data that are necessary to detect other transmissions of algal viruses to humans. The recent COVID-19 pandemic is believed to have zoonotic origins, but in many cases, the mechanisms underlying cross-species host specificity are not well understood (Cohen, 2020; Cohen and Kupferschmidt, 2020a, 2020b; Service, 2020). Genomic analysis is essential to discover host specificity (Kupferschmidt, 2020). Additional genomic sequences from new algal species and characterization of extant viral elements in their genomes, as records of past infections, would resolve urgent questions about other possible environmental sources of viruses. 
Here, we sequenced more than 100 new microalgal genomes from 11 phyla to comprehensively define the expanse of protein-coding and viral elements in microalgal genomes. This study sheds light on microalgal evolution by defining clade- and environment-specific protein-coding and viral genomic elements present in microalgae. To facilitate fundamental and applied research projects using broad subsets of representative microalgal species, we selected and performed whole-genome sequencing on 107 different species of microalgae from the UTEX and NCMA culture collection centers and our New York University Abu Dhabi (NYUAD) isolate collection (Figure 1; Table S1). All genomes assembled de novo in this study from our monocultures are hosted at the National Center for Biotechnology Information (NCBI) under the Bioproject accession PRJNA517804 (see Data S1 for assemblies, and Figures 2 and S1, Table S1, and Data S2 for quality controls and reference species validations). A master dataset, “Data S1,” contains sub-Data S1–S13, including all of the data and results produced in this manuscript. All of the data produced in this project are available to the reader as bulk archive downloads, including genome assemblies, predicted CDSs and proteins, protein family (PFAM) and viral family (VFAM) predictions, as well as VFAM-coding sequences (VFAM-CDSs), endogenous viral-origin Pfams (EVOPs), and other data described in the manuscript. 
Figure 2. Assessment of de novo assemblies and contamination screening by alignment with bacterial proteins. (A–D) BUSCO metrics, which are based on evolutionarily informed expectations of the gene content of near-universal single-copy orthologs, for the genomes assembled de novo from short reads (n = 101). The BUSCO metrics are complementary to technical metrics included in the genome QUality ASsessment Tool (QUAST) results (Table S1). Percent complete BUSCO orthologs (x axis) versus percent missing or incomplete genes (y axis) in (A) Platanus and (B) ABySS assemblies. For Platanus, seven assemblies had evidence of poor quality based on BUSCO results and assembly size compared to expected sizes from close relatives. Percent single copy versus BUSCO genes with detected duplicates in (C) Platanus and (D) ABySS de novo assemblies. (E) BLASTP hits of predicted microalgal proteins with bacterial species were used as markers for determining contamination levels in assemblies. We observed a wide range of bacterial BLASTP hit counts, with one assembly, Entomoneis spp., yielding more than 20,000 hits at an E value of <1e-9. Eleven other species had more than 3,000 bacterial hits at this E value cutoff, and this hit count was used as the threshold to remove assemblies that were highly contaminated from downstream comparative analyses. (F) Examples of resequencing validations. Species with reference genomes (Table S1) were independently cultured and sequenced; sequencing reads were assembled de novo, and assembly statistics comparisons are shown. All whole-genome sequencing reads in this study are hosted at NCBI, Bioproject: PRJNA517804. 
Our selection of species was aimed at broad representation across microalgal phyla, with representatives from the Rhodophytes, Chlorophytes, Haptophytes, Cercozoans, Ochrophytes, Dinophyta, Euglenophyta, Heterokonta, Streptophyta, and Chromerida. Microalgae from these phyla vary considerably in genome size and content (see also Table S1). For example, the symbiont dinoflagellate species that support coral reefs have large genomes (>1 Gbp; Lin et al., 2015; Shoguchi et al., 2013); the free-living picoeukaryotes, which live in the same aquatic environment, have highly condensed genomes (10–13 Mbp; Blanc-Mathieu et al., 2017; Nelson et al., 2019). 
Thus, the genomic diversity of microalgae in this work, combined with a shared phenotype (i.e., photosynthetic, eukaryotic, microbial), will provide the needed information to address questions about convergent evolution in photosynthetic microbes. In addition to these newly sequenced algal genomes (n = 107), we used available microalgal genomes from the NCBI and Phytozome to investigate genomic differences between microalgae from different habitats and clades (Table S1). The increased genomic assembly sample size (n = 174) allowed for a statistically significant resolution of comparative genomics questions, such as the impact of viral sequence contribution on microalgal evolution. We find distinct differences in the occurrences of multiple viral elements between saltwater and freshwater microalgae. Viral sequences correlated with functional domains and differed with phylogeny and environmental conditions, suggesting that they are core, defining features of microalgal lineages. 
Here, we outline how viral sequence acquisition in diverse microalgal groups underpins niche-specific biological processes, including core saltwater-specific proteins involved in membrane reinforcement. A Pearson's correlation of protein family (PFAM) domain counts by species demonstrated that microalgae from different lineages cluster by their environment irrespective of their phylogenetic affiliation (Figure 1; Data S4 and S5). This result indicated that the convergent evolution of hidden Markov model (HMM)-inferred functions occurred across widely divergent lineages of microalgae. Bi-clustering of PFAM count arrays revealed a subset of genes with high domain copy numbers (dCNs) in saltwater species (Figure 1C). Freshwater species were more diverse with regard to PFAM count distributions and types (Figure 1D). These characteristics were also indicated by the ranges in Pearson's coefficient values and more dispersed clustering in t-distributed stochastic neighbor embedding (tSNE, Figure S2). Freshwater Chlorophytes (FC), including Chlamydomonas, have higher basal nucleotide diversity (Flowers et al., 2015). We found that all freshwater lineages sampled had higher functional diversity. This result is consistent with relaxed selection in freshwater habitats. A similar divergence has been seen in the terrestrial expansion of diatoms from marine habitats into freshwater lakes, rivers, and estuaries (Alverson et al., 2011; Alverson and Theriot, 2005; Nakov et al., 2018b; Onyshchenko et al., 2019). 
For the saltwater species, our results suggested that constriction by ionic conditions forced convergence toward the PFAMs with increased dCNs shown in Figure 1C. Every member of this subset of PFAMs examined in a Uniform Manifold Approximation and Projection (UMAP; McInnes et al., 2018) had at least one corresponding viral-related neighbor (Figure S3). We used HMMs to survey CDSs for VFAM domains that may be independent proteins or partial internal domains of proteins that have taken on cellular roles while retaining an evolutionary relationship to their viral origin (Skewes-Cox et al., 2014) (Figures 3 and 4; Table S2). This approach was used to describe new viral ecology dynamics for humans (Bzhalava et al., 2018) and may potentially be used to address new viral threats without established solutions (Cohen and Kupferschmidt, 2020a). 
In this study, we used it to find differences in viral element content among microalgae from different lineages and habitats. Multiple lineages of microalgae were found to contain remnants of infections from virus families (Figure 3; Data S6 and S7). These sequence records show that the host range for these virus families may be broader than previously appreciated. Each algal phylum had distinct collections of VFAMs that clustered according to Pearson’s correlation scores from domain counts (Figure 3). Clades whose members share environmental niches, such as the open ocean (Chlorophytes and Ochrophytes [e.g., picoeukaryotes and diatoms]) or within corals (Myzozoa and Chromerida [e.g., dinoflagellates and chromerids]), clustered together according to their VFAM domain counts. Genes containing shared VFAMs (VFAM-CDSs) often had high-confidence BLAST matches to their counterparts in unrelated clades. For example, the B-type asparagine synthetase chloroplast precursor (Micromonas pusilla CCMP1545) had homologs in Porphyridium purpureum (Rhodophyte, E value = 3.00E-120), Porphyra umbilicalis (Rhodophyte, E value = 5.00E-113), Emiliania huxleyi CCMP1516 (Coccolithophore, E value = 9.00E-102), and Fragilariopsis cylindrus CCMP1102 (Diatom, E value = 5.00E-101). This CDS had vFam_1034 (E value = 6.7E-77), which includes highly similar sequences from Micromonas sp. RCC1109 virus MpV1, Cafeteria roenbergensis virus BV-PW1, and Phaeocystis globosa virus. Our data suggest that unrelated species have acquired virus-sourced genes as a result of competition in shared niches. 
Our set of 91,757 VFAM-CDSs revealed only 720 proteins with putative roles in DNA transposition (0.78%). Many transposons in extant multicellular organisms, especially in plants, are thought to have viral origins (Galindo-González et al., 2017; Gao et al., 2018; Krupovic and Koonin, 2015; Ochoa Cruz et al., 2016; Ustyantsev et al., 2017; Woodrow et al., 2012). Phylogenies created from these proteins and high-confidence homologs in NCBI (NR and Viral databases) revealed genetic exchange among species from unrelated clades. 
For example, a Chlorella autotrophica protein (scaffold1110_cov163-Chlorella_autotrophica.AAC.7) coding for a putative multifunctional polymerase/reverse transcriptase aligned with a homolog from Chloroflexi (a green non-sulfur bacterium prone to horizontal gene transfers [HGT]). A homolog of this gene was also found in the genome of Chlamydomonas nivalis (scaffold12649_cov40-Chlamydomonas_nivalis.AAC.1), an alpine snow alga, among other green algae. Both of these predicted proteins had VFAMs 3011, 37, and 3987, indicating that this putative multifunctional protein is conserved in unrelated viral lineages. We used transcriptomic data from the marine microbial eukaryote transcriptome sequencing project (MMETSP; Keeling et al., 2014) to validate the presence of VFAMs in expressed CDSs (Data S7). The analysis of the expression data from the MMETSP (Keeling et al., 2014Keeling P.J. Burki F. Wilcox" @default.
- W3120727991 created "2021-01-18" @default.
- W3120727991 creator A5022616979 @default.
- W3120727991 creator A5024112921 @default.
- W3120727991 creator A5028681888 @default.
- W3120727991 creator A5036694835 @default.
- W3120727991 creator A5040863408 @default.
- W3120727991 creator A5054138207 @default.
- W3120727991 creator A5055227566 @default.
- W3120727991 creator A5056245407 @default.
- W3120727991 creator A5066015469 @default.
- W3120727991 creator A5072537686 @default.
- W3120727991 creator A5073156813 @default.
- W3120727991 creator A5080080275 @default.
- W3120727991 creator A5081898882 @default.
- W3120727991 creator A5082802418 @default.
- W3120727991 creator A5082847503 @default.
- W3120727991 creator A5083969402 @default.
- W3120727991 creator A5084184184 @default.
- W3120727991 creator A5087722654 @default.
- W3120727991 date "2021-02-01" @default.
- W3120727991 modified "2023-10-13" @default.
- W3120727991 title "Large-scale genome sequencing reveals the driving forces of viruses in microalgal evolution" @default.
- W3120727991 cites W1533458170 @default.
- W3120727991 cites W1560020441 @default.
- W3120727991 cites W1743864981 @default.
- W3120727991 cites W1869210406 @default.
- W3120727991 cites W1932608348 @default.
- W3120727991 cites W1965076154 @default.
- W3120727991 cites W1965136440 @default.
- W3120727991 cites W1974880781 @default.
- W3120727991 cites W1975231099 @default.
- W3120727991 cites W1981422872 @default.
- W3120727991 cites W1985485809 @default.
- W3120727991 cites W1987468264 @default.
- W3120727991 cites W2010511029 @default.
- W3120727991 cites W2011657487 @default.
- W3120727991 cites W2015212944 @default.
- W3120727991 cites W2015324360 @default.
- W3120727991 cites W2019619926 @default.
- W3120727991 cites W2020301105 @default.
- W3120727991 cites W2021096035 @default.
- W3120727991 cites W2021795499 @default.
- W3120727991 cites W2024890561 @default.
- W3120727991 cites W2031611770 @default.
- W3120727991 cites W2031784780 @default.
- W3120727991 cites W2033874996 @default.
- W3120727991 cites W2037598647 @default.
- W3120727991 cites W2045204781 @default.
- W3120727991 cites W2059098646 @default.
- W3120727991 cites W2061387503 @default.
- W3120727991 cites W2062018285 @default.
- W3120727991 cites W2071352397 @default.
- W3120727991 cites W2073702132 @default.
- W3120727991 cites W2082819719 @default.
- W3120727991 cites W2084926119 @default.
- W3120727991 cites W2087170833 @default.
- W3120727991 cites W2090146405 @default.
- W3120727991 cites W2094109477 @default.
- W3120727991 cites W2096186766 @default.
- W3120727991 cites W2097435284 @default.
- W3120727991 cites W2103520854 @default.
- W3120727991 cites W2103949232 @default.
- W3120727991 cites W2107772251 @default.
- W3120727991 cites W2113188354 @default.
- W3120727991 cites W2115888213 @default.
- W3120727991 cites W2117354884 @default.
- W3120727991 cites W2120807642 @default.
- W3120727991 cites W2128011546 @default.
- W3120727991 cites W2129353705 @default.
- W3120727991 cites W2133537175 @default.
- W3120727991 cites W2134112959 @default.
- W3120727991 cites W2138032440 @default.
- W3120727991 cites W2140597284 @default.
- W3120727991 cites W2140872496 @default.
- W3120727991 cites W2145573696 @default.
- W3120727991 cites W2147648867 @default.
- W3120727991 cites W2151602856 @default.
- W3120727991 cites W2165439505 @default.
- W3120727991 cites W2167591209 @default.
- W3120727991 cites W2170856052 @default.
- W3120727991 cites W2171173972 @default.
- W3120727991 cites W2172195912 @default.
- W3120727991 cites W2256446069 @default.
- W3120727991 cites W2277953252 @default.
- W3120727991 cites W2295826673 @default.
- W3120727991 cites W2306715975 @default.
- W3120727991 cites W2337925323 @default.
- W3120727991 cites W2401275684 @default.
- W3120727991 cites W2413898437 @default.
- W3120727991 cites W2460813857 @default.
- W3120727991 cites W2462305122 @default.
- W3120727991 cites W2463593716 @default.
- W3120727991 cites W2500306741 @default.
- W3120727991 cites W2515395413 @default.
- W3120727991 cites W2555625912 @default.
- W3120727991 cites W2578857007 @default.
- W3120727991 cites W2579374544 @default.