Matches in SemOpenAlex for { <https://semopenalex.org/work/W3208277265> ?p ?o ?g. }
Showing items 1 to 59 of 59 with 100 items per page.
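The pattern in the header can be approximated with a standard SPARQL query. The sketch below only builds the query text. The function name `build_work_query` is hypothetical, no endpoint is contacted, and the `?g` column of the original quad pattern is dropped on the assumption that every row in this listing sits in the default graph (each line ends in `@default`).

```python
def build_work_query(work_iri: str) -> str:
    """Return a SPARQL SELECT query listing every predicate/object of one subject.

    Hypothetical helper: it only formats a string and does not contact any endpoint.
    """
    # Reject anything that could break out of the <...> IRI delimiters.
    if not work_iri.startswith("https://") or any(c in work_iri for c in '<>" \n'):
        raise ValueError(f"not a plain IRI: {work_iri!r}")
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{work_iri}> ?p ?o .\n"
        "}"
    )


query = build_work_query("https://semopenalex.org/work/W3208277265")
print(query)
```

The resulting string could be sent to any SPARQL 1.1 endpoint that hosts the data. Which endpoint to use is left open here.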
- W3208277265 abstract "Abstract Virophages can parasitize giant DNA viruses and may provide adaptive anti-giant virus defense in unicellular eukaryotes. Under laboratory conditions, the virophage mavirus integrates into the nuclear genome of the marine flagellate Cafeteria burkhardae and reactivates upon superinfection with the giant virus CroV. In natural systems, however, the prevalence and diversity of host-virophage associations have not been systematically explored. Here, we report dozens of integrated virophages in four globally sampled C. burkhardae strains that constitute up to 2% of their host genomes. These endogenous mavirus-like elements (EMALEs) separated into eight types based on GC-content, nucleotide similarity, and coding potential and carried diverse promoter motifs implicating interactions with different giant viruses. Between host strains, some EMALE insertion loci were conserved indicating ancient integration events, whereas the majority of insertion sites were unique to a given host strain suggesting that EMALEs are active and mobile. Furthermore, we uncovered a unique association between EMALEs and a group of tyrosine recombinase retrotransposons, revealing yet another layer of parasitism in this nested microbial system. Our findings show that virophages are widespread and dynamic in wild Cafeteria populations, supporting their potential role in antiviral defense in protists. eLife digest Viruses exist in all ecosystems in vast numbers and infect many organisms. Some of them are harmful but others can protect the organisms they infect. For example, a group of viruses called virophages protect microscopic sea creatures called plankton from deadly infections by so-called giant viruses. 
In fact, virophages need plankton infected with giant viruses to survive because they use enzymes from the giant viruses to turn on their own genes. A virophage called mavirus integrates its genes into the DNA of a type of plankton called Cafeteria. It lies dormant in the DNA until a giant virus called CroV infects the plankton. This suggests that the mavirus may be a built-in defense against CroV infections and laboratory studies seem to confirm this. But whether wild Cafeteria also use these defenses is unknown. Hackl et al. show that virophages are common in the DNA of wild Cafeteria and that the two appear to have a mutually beneficial relationship. In the experiments, the researchers sequenced the genomes of four Cafeteria populations from the Atlantic and Pacific Oceans and looked for virophages in their DNA. Each of the four Cafeteria genomes contained dozens of virophages, which suggests that virophages are important to these plankton. This included several relatives of the mavirus and seven new virophages. Virophage genes were often interrupted by so-called jumping genes, which may take advantage of the virophages the way the virophages use giant viruses to meet their own needs. The experiments show that virophages often co-exist with marine plankton from around the world and these relationships are likely beneficial. In fact, the experiments suggest that the virophages may have played an important role in the evolution of these plankton. Further studies may help scientists learn more about virus ecology and how viruses have shaped the evolution of other creatures. Introduction Many eukaryotic genomes harbor endogenous viral elements (EVEs) (Feschotte and Gilbert, 2012). For retroviruses, integration as a provirus is an essential part of their replication cycles, but other viruses also occasionally endogenize, for instance with the help of cellular retroelements (Holmes, 2011). 
Some green algal genomes even contain giant EVEs of several hundred kilobase pairs (kbp) in length (Moniruzzaman et al., 2020), but unlike prophages in bacteria and archaea, most eukaryotic EVEs are thought to be ‘genomic fossils’ and incapable of virion formation and horizontal transmission. However, some viral genes may be co-opted for various host functions (Frank and Feschotte, 2017; Aswad and Katzourakis, 2012). In recent years, the exploration of protist-infecting giant viruses has uncovered a novel class of associated smaller DNA viruses with diverse and unprecedented genome integration capabilities. Viruses of the family Lavidaviridae, commonly known as virophages, depend for their replication on giant DNA viruses of the family Mimiviridae and can parasitize them during coinfection of a suitable protist host (La Scola et al., 2008; Krupovic et al., 2016a; Duponchel and Fischer, 2019). A striking example is the virophage mavirus, which strongly inhibits virion synthesis of the lytic giant virus CroV during coinfection of the marine heterotrophic nanoflagellate Cafeteria sp. (Stramenopiles; Bicosoecida) (Fischer and Suttle, 2011; Fischer and Hackl, 2016). Virophages possess 15–30 kbp long double-stranded (ds) DNA genomes of circular or linear topology that tend to have low GC-contents (27–51%) (Fischer, 2020). A typical virophage genome encodes 20–30 proteins, including a major capsid protein (MCP), a minor capsid or penton protein (PEN), a DNA packaging ATPase, and a maturation cysteine protease (PRO) (Krupovic et al., 2016a). In addition to this conserved morphogenesis module, virophages encode DNA replication and integration proteins that were likely acquired independently in different virophage lineages (Yutin et al., 2013). Viruses in the genus Mavirus contain a rve-family integrase (rve-INT) that is also found in retrotransposons and retroviruses, with close homologs among the eukaryotic Maverick/Polinton elements (MPEs) (Fischer and Suttle, 2011). 
MPEs were initially described as DNA transposons (Pritham et al., 2007; Kapitonov and Jurka, 2006), but many of them carry the morphogenesis gene module and thus qualify as endogenous viruses (Krupovic et al., 2014). Phylogenetic analysis suggests that mavirus-type virophages share a common ancestry with MPEs and the related Polinton-like viruses (PLVs) (Fischer and Suttle, 2011; Yutin et al., 2013). We therefore tested the integration capacity of mavirus using the cultured protist Cafeteria burkhardae (formerly Cafeteria roenbergensis; Fenchel and Patterson, 1988; Schoenle et al., 2020) and found that mavirus integrates efficiently into the nuclear host genome (Fischer and Hackl, 2016). The resulting mavirus provirophages are transcriptionally silent unless the host cell is infected with CroV, which leads to reactivation and virion formation of mavirus. Newly produced virophage particles then inhibit CroV replication and increase host population survival during subsequent rounds of coinfection (Fischer and Hackl, 2016). The mutualistic Cafeteria-mavirus symbiosis may thus act as an adaptive defense system against lytic giant viruses (Fischer and Hackl, 2016; Koonin and Krupovic, 2016). The integrated state of mavirus is pivotal to the proposed defense scheme as it represents the host’s indirect antigenic memory of CroV (Koonin and Krupovic, 2017). We therefore investigated endogenous virophages to assess the prevalence and potential significance of virophage-mediated defense systems in natural protist populations. Here, we report that mavirus-like EVEs are common, diverse, and most likely active mobile genetic elements (MGEs) of C. burkhardae. Our results suggest an influential role of these viruses on the ecology and evolution of their bicosoecid hosts. Results Endogenous virophages are abundant in Cafeteria genomes In preparation of screening for endogenous virophages, we generated high-quality de novo genome assemblies of four cultured C. 
burkhardae strains (Hackl et al., 2020). These strains, designated BVI, Cflag, E4-10P (E4-10), and RCC970-E3 (RCC970), were isolated from the Caribbean Sea in 2012, the Northwest Atlantic in 1986, the Northeast Pacific in 1989, and the Southeast Pacific in 2004, respectively. We sequenced their genomes using both short-read (Illumina MiSeq) and long-read (Pacific Biosciences RSII) technologies in order to produce assemblies that would resolve 20–30 kb long repetitive elements within the host genomic context. Each C. burkhardae genome assembly comprised 34–36 megabase pairs with an average GC-content of 70% (Hackl et al., 2020). To identify endogenous virophages, we combined sequence similarity searches against known virophage genomes with genomic screening for GC-content anomalies. The two approaches yielded redundant results and virophage elements were clearly discernible from eukaryotic genome regions based on their low (30–50%) GC-content (Figure 1A). Each element had at least one open reading frame (ORF) with a top blastp hit to a mavirus protein, with no elements bearing close resemblance to Sputnik or other virophages outside the genus Mavirus. In the four Cafeteria genomes combined, we found 138 endogenous mavirus-like elements (EMALEs, Figure 1B and C; Figure 1—figure supplement 1, Supplementary file 1). Thirty-three of these elements were flanked by terminal inverted repeats (TIRs) and host DNA and can thus be considered full-length viral genomes. Figure 1. Endogenous virophages in Cafeteria burkhardae. (A) GC-content graph signature of a virophage element embedded in a high-GC host genome. Shown is a region of contig BVI_c002 featuring an integrated virophage (pink box) flanked by host sequences. (B) Location of partial or complete virophage genomes and Ngaro retrotransposons in the genome assemblies of C. burkhardae strain BVI (see Figure 1—figure supplement 1 for all four strains). 
Horizontal lines represent contigs of decreasing length ordered from left to right and from top to bottom, with numbers shown for the first contig of each line; colored boxes indicate endogenous mavirus-like elements (EMALEs). Fully assembled elements are framed in black. Ngaro retrotransposon positions are marked by black symbols; open symbols indicate Ngaros integrated inside a virophage element. (C) Graphic summary of the number and types of all EMALEs identified in each of the four C. burkhardae strains. (D) Nucleotide contributions of EMALEs and Ngaros to Cafeteria genomes. Fractions for each strain are computed based on nucleotides in the assembly (left) and nucleotides in the reads (right) mapping to the different parts of the assembly. The remainder were partial virophage genomes that were located at contig ends or on short contigs. These cases arise from incomplete assembly rather than from biological truncations, since the assembly algorithm probably terminated due to the presence of multiple identical or highly similar EMALEs within the same host genome – a well-known issue for repetitive sequences (Kolmogorov et al., 2019). With 55 elements, C. burkhardae strain BVI contained nearly twice as many EMALEs as any of the other strains, where we found 27–29 elements per genome (Figure 1C, Supplementary file 1). Compared to the total assembly length, EMALEs composed an estimated 0.7–1.8% of each host assembly (Figure 1D). Contributions calculated from assemblies deviated only slightly (0.01–0.3%) from read-based calculations. Therefore, the assemblies seem to provide a good representation of the actual contribution of EMALEs to the overall host genomes. EMALEs are genetically diverse From here on, we focus our analysis on the 33 complete EMALE genomes, which were 5.5–21.5 kb long with a median length of 19.8 kb, and TIRs that varied in length from 0.2 to 2.3 kb with a median of 0.9 kb (Supplementary file 1, Figure 3—figure supplement 1). 
Their GC-contents ranged from 29.7% to 52.7%, excluding retrotransposon insertions where present. To classify EMALEs we used an all-versus-all DNA dot plot approach (Figure 2). It revealed two main blocks: The first block contained EMALEs with GC-contents of 29.7–38.5% (median 35.3%), whereas EMALEs in the second block had GC-contents ranging from 47.2% to 52.7% (median 49.3%). The C. burkhardae EMALEs can thus be roughly separated into low-GC and mid-GC groups. Figure 2. Classification of endogenous virophages based on DNA dot plot analysis. The self-versus-self DNA dot plot of concatenated sequences of 33 complete EMALE genomes and mavirus reveals two main block patterns, corresponding to EMALEs with low (29–38%) GC-content and medium (47–53%) GC-content. Smaller block patterns define EMALE types 1–8. EMALE identifiers indicate the host strain and contig number where the respective element is found. Multiple EMALEs on a single contig are distinguished by terminal letters. Elements printed in bold represent the type species shown in Figure 3. Inset: GC-content distribution of complete and partial EMALEs labeled ‘complete: TRUE/FALSE’. Some partial EMALEs were too short for type assignment and are thus inconclusive. Retrotransposon insertions, where present, were removed prior to analysis. Based on the similarity patterns within each block, we further distinguish eight EMALE types, with low-GC EMALEs comprising types 1–4 and mid-GC EMALEs comprising types 5–8 (Figure 2). Representative genome diagrams for each EMALE type are shown in Figure 3; for a schematic of all 33 complete EMALEs, see Figure 3—figure supplement 1. According to this classification scheme, the reference mavirus strain Spezl falls within type 4 of the low-GC EMALEs (Figures 2 and 3, Figure 3—figure supplement 1). Partial EMALEs were classified based on their sequence similarity to full-length type species (Figure 2—figure supplement 1). 
Figure 3. Genome organization of eight EMALE types found in Cafeteria burkhardae. Shown are schematic genome diagrams of the EMALE type species 1–8; for all 33 complete EMALEs, see Figure 3—figure supplement 1. The reference mavirus genome with genes MV01-MV20 is included for comparison. Homologous genes are colored identically; genes sharing functional predictions but lacking sequence similarity to the mavirus homolog are hatched. Open reading frames are numbered individually for each element. Ngaro retrotransposon insertion sites are indicated where present. The dotted line between EMALE01 and EMALE02 separates a homologous region (left) from unrelated DNA sequences (right) and thus indicates the location of a probable recombination event. The codon and amino acid composition of EMALE genes clearly correlated with the overall GC-content of the EMALE genomes (Figure 2—figure supplement 2). For each encoded amino acid, we observed a strong shift toward synonymous codons reflecting the overall GC trend, and across amino acids, we observed a shift from those encoded by high-GC codons to those encoded by low-GC codons in low-GC EMALEs and vice versa. This uniform trend across all amino acids likely indicates that selection and evolutionary processes driving GC-content variation in these viruses act on the nucleotide level, rather than on the encoded proteins. With few exceptions, EMALEs are predicted to encode 17–21 proteins each. None of the encoding genes was found to contain introns. The virion morphogenesis module in EMALE types 1 and 3–7 consists of the canonical virophage core genes corresponding to MCP, PEN, ATPase, and PRO proteins. Type 2 EMALEs likely encode a different set of capsid genes as discussed below, and the truncated EMALE type 8 lacks recognizable morphogenesis genes. 
Another highly conserved gene in EMALE types 1 and 3–7 is MV14, which is always found immediately upstream of the ATPase (Figure 3, Figure 3—figure supplement 1) and codes for a protein of unknown function that is part of the mavirus virion (Born et al., 2018). MV14 is present in various metagenomic virophage sequences (Paez-Espino et al., 2019) and, based on synteny and protein localization, likely encodes an important virion component in members of the genus Mavirus. The replication/integration module consists of the rve-INT gene and at least one additional ORF coding for a primase/helicase and a DNA polymerase. Low-GC EMALEs encode a mavirus-related primase/helicase and protein-primed family B DNA polymerase (pPolB) (Figure 3, Figure 3—figure supplement 1). Mid-GC EMALEs, on the other hand, lack the pPolB gene and feature a longer primase/helicase ORF that may include a DNA polymerase domain similar to the helicase-polymerase fusion genes described in PLVs (Krupovic et al., 2016b). Other mavirus genes frequently found in EMALEs include MV19 (encoding a putative protease domain), and two genes of unknown function, MV08 and MV12. Interestingly, all mid-GC EMALEs encode a predicted tyrosine recombinase (YR) in addition to the rve-INT and thus possess two predicted enzymes for genome integration. YRs have been found in other virophages and likely catalyze integration into giant virus genomes (Desnues et al., 2012; Yutin et al., 2015). Notable genes unique to one EMALE type include a putative DNA methylase and a ribonucleotide reductase small subunit gene found in EMALE07. The Tlr6F protein encoded by EMALE types 1 + 2 is present in diverse MGEs, including other virophages, PLVs, and large DNA viruses of the phylum Nucleocytoviricota (Koonin and Krupovic, 2017; Stough et al., 2019). 
In general, genes were syntenic between EMALEs of the same type, whereas gene order was poorly conserved among EMALEs of different types, with the following exceptions: MCP was always preceded by PEN, and ATPase was always preceded by MV14, whereas the MV14-ATPase-PRO-PEN-MCP morphogenesis gene order as seen in mavirus was present only in EMALE types 4–7. EMALE02 represents an interesting case, as it shares 6–7 kb of its 5’ part (we chose the primase/helicase genes to mark the 5’ end of all EMALEs) with EMALE01, while the remaining 11 kb are not closely related to other EMALEs or virophages (Figure 3—figure supplement 2). Genes encoded in the latter region are mostly ORFans, with the exception of an MV12-like gene and divergent MCP and ATPase genes with remote similarity to PLVs (Bellas and Sommaruga, 2021) and adintoviruses (Starrett et al., 2021). EMALE02 may thus be the result of a recombination event that exchanged the canonical virophage morphogenesis module of EMALE01 with capsid genes of a PLV (Figure 3, dashed line). Overall, these observations support the notion that recombination and non-homologous gene replacement are important factors in virophage genome evolution (Yutin et al., 2013). Core gene conservation and non-homologous gene replacement in EMALEs To validate our classification scheme for EMALEs and to place them in a phylogenetic context to other virophages, we used maximum likelihood reconstruction on the core proteins MCP, PEN, ATPase, and PRO, as well as on rve-INT (Figure 4). In the resulting phylogenetic trees, EMALE core proteins formed monophyletic clades with mavirus and related sequences from environmental samples, thus significantly expanding the known diversity of the genus Mavirus. 
The environmental sequences that clustered with EMALE core proteins include a single amplified genome (SAG) from an uncultured chrysophyte (Castillo et al., 2019), the metagenomic Ace Lake Mavirus (ALM) (Zhou et al., 2013), and four additional metagenomes that were identified in a global survey of virophage sequences (Paez-Espino et al., 2019). The chrysophyte SAG is nearly identical to mavirus strain Spezl and indicates that the host range of mavirus extends beyond bicosoecids. The metagenomic sequences either clustered with one of the EMALE types, or branched separately from them, which suggests the existence of additional sub-groups (e.g. M590M2_1006461). Figure 4. Phylogenetic reconstruction of conserved EMALE proteins. Unrooted maximum likelihood trees were constructed from multiple sequence alignments of the four virophage core proteins major capsid protein (MCP), penton protein (PEN), ATPase, and protease (PRO), as well as of the retroviral integrase. Nodes with bootstrap values of 80% or higher are marked with dots. EMALEs are color-coded by type; cultured virophages are printed in bold. ALM, Ace Lake Mavirus; DSLV, Dishui Lake virophage; OLV, Organic Lake virophage; RVP, rumen virophage; TBE/TBH, Trout Bog Lake epi-/hypolimnion; YSLV, Yellowstone Lake virophage. Metagenomic sequences starting with Ga and M590 are derived from Paez-Espino et al., 2019. Within the Mavirus clade, EMALEs of a given type were monophyletic for each of the four core proteins, which corroborates their dot plot-based classification. It is worth noting that although EMALEs of types 5 and 6 are largely syntenic (Figure 3, Figure 3—figure supplement 1), they were clearly distinguishable in their phylogenetic signatures (Figure 4). 
A comparison of clade topologies revealed that even within the conserved morphogenesis module, individual proteins differed with regard to their neighboring clades, and low-GC and mid-GC EMALEs did not cluster separately from each other. These observations could suggest that the morphogenesis modules of different EMALE types diversified simultaneously and that adaptation of GC-content may occur rather quickly. In contrast, phylogenetic analysis of rve-INT proteins revealed separate clades for low-GC and mid-GC EMALEs (Figure 4). Each of these clades was affiliated with different cellular homologs that included MPEs and retroelements. Notably, the rve-INT genes of low-GC EMALEs were located near the 5’ end of the genomes, whereas in mid-GC EMALEs, they were located near the 3’ end (Figure 3, Figure 3—figure supplement 1). These observations suggest that EMALEs encode two different rve-INT versions, one specific for low-GC EMALEs that co-occurs with the pPolB and a shorter primase/helicase ORF, and one specific for mid-GC EMALEs that co-occurs with a longer primase/helicase ORF. The two integrase versions may have been acquired independently, or one version could have replaced the other during EMALE evolution. Such non-homologous gene replacement appears to have taken place among the primase/helicase genes, too, as previously noted for virophages in general (Yutin et al., 2013). EMALEs encode several different versions of primase/helicase genes with a degree of amino acid divergence that precluded their inclusion in a single multiple sequence alignment. The YR proteins encoded by EMALE types 5–8 formed a monophyletic clade and were part of a larger group of recombinases that included virophages from freshwater metagenomes, as well as microalgae and algal nucleocytoviruses (Figure 4—figure supplement 1). Cafeteria strains differ in their EMALE composition The four C. 
burkhardae strains displayed distinct EMALE signatures: strain BVI had the highest number of virophage elements with 13 complete and 42 partial EMALEs, whereas the other three strains had 6–7 complete and 20–22 partial EMALEs each (Figure 1C, Supplementary file 1). EMALE types 1, 3, 4, 5, and 6 were present in every host strain, EMALE07 was found in all strains except Cflag, and EMALE types 2 and 8 were detected in strains BVI and E4-10 only. We found no evidence for sequence-specific genome integration of EMALEs after inspecting the host DNA sequences that flanked EMALE integration sites, which confirms previous reports of mavirus integration (Fischer and Hackl, 2016). EMALEs were flanked by target site duplications (TSDs) that were predominantly 3–5 bp in length, although some were as short as 1 bp or as long as 9 bp (Supplementary file 1). By comparison, mavirus and MPEs generate 5–6 bp long TSDs upon integration (Fischer and Hackl, 2016; Pritham et al., 2007; Kapitonov and Jurka, 2006). To assess whether homologous EMALEs were found in identical loci in closely related host genomes, we conducted sequence similarity searches with the flanking regions of each of the 33 fully resolved EMALEs. Whenever these searches returned a homologous full or partial EMALE with at least one matching host flank, we considered the EMALE locus to be conserved in these host strains. We found varying degrees of conservation, with examples shown in Figure 3—figure supplement 3. In 11 cases, an EMALE insertion was conserved in at least two host strains (Supplementary file 1): three EMALE loci were shared by all four strains, four were shared by three strains, and another four were shared by two strains. Based on conserved EMALE loci, strains Cflag and RCC970 were most closely related with nine shared EMALE integrations, which is in line with phylogenetic and average nucleotide identity (ANI) analyses of these strains (Hackl et al., 2020). The four C. 
burkhardae genomes have ANIs of >99% and thus appear to differ mostly based on their content of EMALEs and other MGEs. The most parsimonious scenario for the origin of EMALEs that are located in identical loci in different host strains is that they derived from a single integration event. For instance, EMALE03 BVI_101 is orthologous to Cflag_017C and RCC970_016A (Figure 3—figure supplement 3C), which suggests that this element initially colonized the common ancestor of C. burkhardae strains BVI, Cflag, and RCC970. Further cases of redundant EMALEs are Cflag_017B & RCC970_016B (EMALE01) and BVI_029 & RCC970_095 (EMALE06). These elements may thus derive from relatively ancient integration events, whereas 18 of the 33 complete EMALEs represent integrations that were unique to a single host strain (Supplementary file 1). Strain BVI contained 10 of these 18 unique integrations, more than twice as many as any other strain. The genomic landscape around EMALE integration sites ranged from repeat-free flanking regions to complex host repeats (Figure 3—figure supplement 4). Of the 29 different integration sites represented by the 33 fully resolved EMALEs, 18 were located near repetitive host DNA (within 10 kb from the insertion site). These repeats, in addition to EMALE TIRs, multiple copies of the same EMALE type, and the putative heterozygosity of EMALE insertions, occasionally caused assembly problems, as illustrated in Figure 3—figure supplement 3. Next, we analyzed whether EMALE insertions interrupted coding sequences of the host. Fifteen integration sites were located within a predicted host gene (13 in exons, 2 in introns), four were found in predicted 3’ untranslated regions, and three were located in intergenic regions (Supplementary file 1). These data show that EMALE insertions may disrupt eukaryotic genes with potential negative consequences for the host. 
The apparent preference for integration in coding regions could be assembly related, driven by increased accessibility of euchromatin, or linked to host factors that could direct the rve-INT via its CHROMO domain (Gao et al., 2008). EMALEs are predicted to be functional and mobile Based on genomic features such as coding potential, ORF integrity, and host distribution, most EMALEs appear to be active MGEs. With the exception of EMALE08 and EMALE02, all endogenous Cafeteria virophages encode the canonical morphogenesis gene module consisting of MCP, PEN, ATPase, PRO, as well as MV14. EMALE02 likely encodes more distantly related capsid genes. Therefore, all EMALE types except EMALE08 should be autonomous for virion formation. In addition, all EMALEs contain at least one predicted enzyme for genome integration, an rve-INT in EMALE types 1–7 and a YR in EMALE types 5–8. EMALEs thus encode the enzymatic repertoire for colonizing new host genomes. Finally, the high variability of EMALE integration loci among otherwise closely related host strains strongly argues for ongoing colonization of natural Cafeteria populations by virophages. The genomic similarity to mavirus implies that EMALEs may also depend on a giant virus for activation and horizontal transmission. Shared regulatory sequences in virophages and their respective giant viruses suggest that the molecular basis of virophage activation lies in the recognition of virophage gene promoters by giant virus encoded transcription factors (Fischer and Suttle, 2011; Claverie and Abergel, 2009; Legendre et al., 2010). We therefore analyzed the 100 nt upstream regions of EMALE ORFs for conserved sequence motifs using MEME (Bailey et al., 2009). For all type 4 EMALEs, which include mavirus, we recovered the previously described mavirus promoter motif ‘TCTA’, flanked by AT-rich regions. 
This motif corresponds to the conserved late gene promoter in CroV (Fischer and Suttle, 2011; Fischer et al., 2010), thus possibly indicating that all type 4 EMALEs could be reactivated by CroV or close relatives. EMALEs of other types lacked the ‘TCTA’ motif, but contained putative promoter sequences that may be compatible with different giant viruses (Figure 3—figure supplement 5). MGEs are prone to various decay processes including pseudogenization, recombination, and truncation. Among the 33 fully resolved EMALEs are three truncated elements: Cflag_215 and RCC970_122 (both EMALE04), and BVI_005 (EMALE08) (Figure 3—figure supplement 1). Interestingly, even these shorter elements are flanked by TIRs, which must have regenerated after the truncation event. Whereas most EMALE ORFs appeared to be intact, as judged by comparison with homologous genes on syntenic elements, several EMALEs contained fragmented ORFs (e.g. ATPase and PEN genes in EMALE04 BVI_055B, Figure 3). To test whether premature stop codons may be the result of assembly artifacts, we amplified selected EMALEs by PCR and analyzed the products using Sanger sequencing. When we compared the Sanger assemblies with the Illumina/PacBio assemblies, we noticed that the latter contai" @default.
- W3208277265 created "2021-11-08" @default.
- W3208277265 creator A5015866218 @default.
- W3208277265 creator A5067784355 @default.
- W3208277265 creator A5076113894 @default.
- W3208277265 creator A5080341519 @default.
- W3208277265 creator A5086184788 @default.
- W3208277265 date "2021-09-24" @default.
- W3208277265 modified "2023-10-16" @default.
- W3208277265 title "Author response: Virophages and retrotransposons colonize the genomes of a heterotrophic flagellate" @default.
- W3208277265 doi "https://doi.org/10.7554/elife.72674.sa2" @default.
- W3208277265 hasPublicationYear "2021" @default.
- W3208277265 type Work @default.
- W3208277265 sameAs 3208277265 @default.
- W3208277265 citedByCount "1" @default.
- W3208277265 crossrefType "peer-review" @default.
- W3208277265 hasAuthorship W3208277265A5015866218 @default.
- W3208277265 hasAuthorship W3208277265A5067784355 @default.
- W3208277265 hasAuthorship W3208277265A5076113894 @default.
- W3208277265 hasAuthorship W3208277265A5080341519 @default.
- W3208277265 hasAuthorship W3208277265A5086184788 @default.
- W3208277265 hasBestOaLocation W32082772651 @default.
- W3208277265 hasConcept C104317684 @default.
- W3208277265 hasConcept C141231307 @default.
- W3208277265 hasConcept C2776028635 @default.
- W3208277265 hasConcept C4918238 @default.
- W3208277265 hasConcept C54355233 @default.
- W3208277265 hasConcept C59822182 @default.
- W3208277265 hasConcept C7029365 @default.
- W3208277265 hasConcept C70721500 @default.
- W3208277265 hasConcept C78458016 @default.
- W3208277265 hasConcept C86803240 @default.
- W3208277265 hasConceptScore W3208277265C104317684 @default.
- W3208277265 hasConceptScore W3208277265C141231307 @default.
- W3208277265 hasConceptScore W3208277265C2776028635 @default.
- W3208277265 hasConceptScore W3208277265C4918238 @default.
- W3208277265 hasConceptScore W3208277265C54355233 @default.
- W3208277265 hasConceptScore W3208277265C59822182 @default.
- W3208277265 hasConceptScore W3208277265C7029365 @default.
- W3208277265 hasConceptScore W3208277265C70721500 @default.
- W3208277265 hasConceptScore W3208277265C78458016 @default.
- W3208277265 hasConceptScore W3208277265C86803240 @default.
- W3208277265 hasLocation W32082772651 @default.
- W3208277265 hasOpenAccess W3208277265 @default.
- W3208277265 hasPrimaryLocation W32082772651 @default.
- W3208277265 hasRelatedWork W2002894090 @default.
- W3208277265 hasRelatedWork W2024251716 @default.
- W3208277265 hasRelatedWork W2070598597 @default.
- W3208277265 hasRelatedWork W2106760734 @default.
- W3208277265 hasRelatedWork W2110035093 @default.
- W3208277265 hasRelatedWork W2129682738 @default.
- W3208277265 hasRelatedWork W2152222332 @default.
- W3208277265 hasRelatedWork W2263391408 @default.
- W3208277265 hasRelatedWork W2981796144 @default.
- W3208277265 hasRelatedWork W3010826188 @default.
- W3208277265 isParatext "false" @default.
- W3208277265 isRetracted "false" @default.
- W3208277265 magId "3208277265" @default.
- W3208277265 workType "peer-review" @default.
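The listing above follows a regular shape, roughly `- SUBJECT PREDICATE OBJECT @GRAPH.`, which makes it easy to load into a dictionary. The following is a minimal sketch under that assumption. The regular expression and the name `parse_listing` are illustrative and not part of SemOpenAlex, and a literal that spans several physical lines (such as the abstract) would need to be joined into one string before parsing.

```python
import re
from collections import defaultdict

# Assumed line shape: "- SUBJECT PREDICATE OBJECT @GRAPH."
# DOTALL lets a multi-line literal match once it has been joined into one string.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.*) @(\S+)\.$', re.DOTALL)


def parse_listing(lines):
    """Group object values by predicate for each subject; the graph name is discarded."""
    result = defaultdict(lambda: defaultdict(list))
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip headers and anything that is not a listing row
        subject, predicate, obj, _graph = match.groups()
        # Strip the surrounding quotes from string literals.
        if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
            obj = obj[1:-1]
        result[subject][predicate].append(obj)
    return result


sample = [
    '- W3208277265 created "2021-11-08" @default.',
    '- W3208277265 creator A5015866218 @default.',
    '- W3208277265 creator A5067784355 @default.',
    '- W3208277265 type Work @default.',
]
work = parse_listing(sample)["W3208277265"]
print(work["creator"])
```

Keeping every predicate as a list preserves repeated properties such as `creator`, `hasConcept`, and `hasRelatedWork`, which occur several times for this work.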