Matches in SemOpenAlex for { <https://semopenalex.org/work/W4382561729> ?p ?o ?g. }
- W4382561729 endingPage "541" @default.
- W4382561729 startingPage "532" @default.
- W4382561729 abstract "Ribosomally synthesized and post-translationally modified peptides (RiPPs) from microorganisms show high chemical diversity and exhibit potent biological properties. The computational detection of novel classes of RiPPs is hampered by their short length and the lack of universally conserved genes. The high false-positive rate of class-independent computational detection approaches can be addressed by validation via mass spectrometry-based metabolomics. Microorganisms produce a vast array of low-molecular-weight metabolites known as natural products (NPs; see Glossary), also called ‘secondary metabolites’ or ‘specialized metabolites’. These molecules are not immediately involved in cell survival but often display potent biological activities, a property used for the development of numerous drugs [1. Newman D.J., Cragg G.M. Natural products as sources of new drugs over the nearly four decades from 01/1981 to 09/2019. J. Nat. Prod. 2020; 83: 770-803]. A recent large-scale survey estimated that only 3% of NP biosynthetic pathways encoded in bacterial genomes have been experimentally characterized [2. Gavriilidou A. et al. Compendium of specialized metabolite biosynthetic diversity encoded in bacterial genomes. Nat. Microbiol. 2022; 7: 726-735]. Therefore, microorganisms remain largely untapped sources for NP drug discovery. Among the different classes of microbial NPs, ribosomally synthesized and post-translationally modified peptides (RiPPs) have received special attention due to their exceptionally large biosynthetic diversity [3. Arnison P.G. et al. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat. Prod. Rep. 2013; 30: 108-160]. 
RiPPs are known for many interesting biological properties, including antibiotic, antiviral, and antineoplastic activities [4. Zhong G. et al. Recent advances in discovery, bioengineering, and bioactivity-evaluation of ribosomally synthesized and post-translationally modified peptides. ACS Bio. Med. Chem. Au. 2023; 3: 1-31]. For example, the recently described RiPP darobactin A (Figure 1A, structure 1) selectively kills Gram-negative bacteria by inhibition of the outer membrane protein BamA. This novel antibiotic mode of action, the first one since the 1960s, represents a promising avenue toward the development of new antibiotics [5. Lewis K. Platforms for antibiotic discovery. Nat. Rev. Drug Discov. 2013; 12: 371-387, 6. Imai Y. et al. A new antibiotic selectively kills Gram-negative pathogens. Nature. 2019; 576: 459-464, 7. Ritzmann N. et al. Monitoring the antibiotic darobactin modulating the β-barrel assembly factor BamA. Structure. 2022; 30: 350-359.e3, 8. Seyfert C.E. et al. Darobactins exhibiting superior antibiotic activity by cryo-EM structure guided biosynthetic engineering. Angew. Chem. Int. Ed. Engl. 2023; 62e202214094]. Growing interest in the scientific and commercial potential of RiPPs has led to the discovery of no fewer than 17 new classes of RiPPs between 2011 and 2020 [9. Skinnider M.A. et al. Genomic charting of ribosomally synthesized natural product chemical space facilitates targeted mining. Proc. Natl. Acad. Sci. U. S. A. 2016; 113: E6343-E6351, 10. Montalbán-López M. et al. New developments in RiPP discovery, enzymology and engineering. Nat. Prod. Rep. 2021; 38: 130-239]. It is generally believed that the currently known 40+ distinct classes of RiPPs [10. Montalbán-López M. 
et al. New developments in RiPP discovery, enzymology and engineering. Nat. Prod. Rep. 2021; 38: 130-239] are only the most widely distributed ones and that there is a large ‘hidden’ RiPP biosynthetic potential left to discover. The overwhelming majority of RiPP classes were discovered serendipitously: promising biological activity or an interesting signal in a metabolomics experiment was investigated, and the responsible molecules were isolated. Only after structural elucidation of the NP, followed by the genome sequencing of the producing organism, could the biosynthetic origin be elucidated [10. Montalbán-López M. et al. New developments in RiPP discovery, enzymology and engineering. Nat. Prod. Rep. 2021; 38: 130-239, 11. Zdouc M.M. et al. A biaryl-linked tripeptide from Planomonospora reveals a widespread class of minimal RiPP gene clusters. Cell Chem. Biol. 2021; 28: 733-739.e4, 12. Nanudorn P. et al. Atropopeptides are a novel family of ribosomally synthesized and posttranslationally modified peptides with a complex molecular shape. Angew. Chem. Int. Ed. Engl. 2022; 61e202208361]. Such ‘isolation-first’ strategies, also known as ‘grind and find’, carry the risk of rediscovering known metabolites, are resource-intensive, and have limited compatibility with modern high-throughput approaches. Therefore, computational methods for the detection and prioritization of biosynthetic pathways in genomics data have been developed. Predictions can be further validated by using metabolomics data, but automated data integration is not yet trivial. In this review, we first discuss the biosynthetic principles that complicate the detection of new classes of RiPPs by genome mining. 
We continue with an overview of recently developed generalist and RiPP-specialized software tools for automated integration of genomics and metabolomics data, then address current challenges, and finally highlight opportunities for further development. Canonically, the biosynthesis of microbial NPs is governed by a set of genes colocalized in the same genomic region, known as a biosynthetic gene cluster (BGC). RiPP BGCs follow this biosynthetic logic and consist of at least two components: first, one or more small structural genes encoding short precursor peptides, and second, one or more genes encoding precursor-peptide-modifying ‘tailoring’ enzymes. These two components alone can be sufficient to produce a mature product [11. Zdouc M.M. et al. A biaryl-linked tripeptide from Planomonospora reveals a widespread class of minimal RiPP gene clusters. Cell Chem. Biol. 2021; 28: 733-739.e4, 12. Nanudorn P. et al. Atropopeptides are a novel family of ribosomally synthesized and posttranslationally modified peptides with a complex molecular shape. Angew. Chem. Int. Ed. Engl. 2022; 61e202208361]. Additionally, accessory genes related to maturation, transport, autoresistance, or regulation are commonly colocalized in RiPP BGCs [3. Arnison P.G. et al. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat. Prod. Rep. 2013; 30: 108-160]. The RiPP precursor peptides consist of a core peptide, usually flanked by an N-terminal ‘leader’ peptide (Figure 1B). In some cases, a C-terminal recognition sequence (the ‘follower’) is present, either on its own or together with the ‘leader’ peptide [3. Arnison P.G. et al. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat. 
Prod. Rep. 2013; 30: 108-160]. After transcription and translation, the precursor peptide is modified by tailoring enzymes, which introduce post-translational modifications (PTMs). PTMs greatly expand the chemical space of proteinogenic amino acids, including the introduction of β- or D-amino acids, alterations to the peptide conformation, and additions of heteroatoms or other functional groups [10. Montalbán-López M. et al. New developments in RiPP discovery, enzymology and engineering. Nat. Prod. Rep. 2021; 38: 130-239]. RiPPs are grouped into classes (or families) based on shared structural and biosynthetic concepts. These range from ‘simple’ macrocyclization (e.g., the lasso-fold structure observed in lassopeptides) to complex biosynthetic cascades (e.g., thiopeptides, also known as pyritides) [3. Arnison P.G. et al. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat. Prod. Rep. 2013; 30: 108-160, 13. Kunakom S. et al. Cytochromes P450 involved in bacterial RiPP biosyntheses. J. Ind. Microbiol. Biotechnol. 2023; 50kuad005]. After modification by tailoring enzymes, the leader and/or recognition sequences flanking the core peptide are removed by proteolysis, resulting in the mature modified core peptide, which is eventually exported from the cell [3. Arnison P.G. et al. Ribosomally synthesized and post-translationally modified peptide natural products: overview and recommendations for a universal nomenclature. Nat. Prod. Rep. 2013; 30: 108-160]. The conserved architecture of colocalized genes in the BGCs of microbial NPs can be detected and annotated computationally by a strategy known as ‘genome mining’ [14. Ziemert N. et al. The evolution of genome mining in microbes – a review. Nat. 
Prod. Rep. 2016; 33: 988-1005, 15. Medema M.H. et al. Mining genomes to illuminate the specialized chemistry of life. Nat. Rev. Genet. 2021; 22: 553-571, 16. Biermann F. et al. Navigating and expanding the roadmap of natural product genome mining tools. Beilstein J. Org. Chem. 2022; 18: 1656-1671]. Most commonly, BGCs are detected by using hardcoded rulesets based on conserved ‘signature’ genes (e.g., antiSMASH [17. Medema M.H. et al. antiSMASH: rapid identification, annotation and analysis of secondary metabolite biosynthesis gene clusters in bacterial and fungal genome sequences. Nucleic Acids Res. 2011; 39: W339-W346, 18. Blin K. et al. antiSMASH 7.0: new and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic Acids Res. 2023; (Published online May 4, 2023. https://doi.org/10.1093/nar/gkad344)] or PRISM [19. Skinnider M.A. et al. Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM). Nucleic Acids Res. 2015; 43: 9645-9662, 20. Skinnider M.A. et al. Comprehensive prediction of secondary metabolite structure and biological activity from microbial genome sequences. Nat. Commun. 2020; 11: 6058]). Detected BGCs can be annotated by matching against experimentally characterized BGCs, using community resources such as MIBiG [21. Medema M.H. et al. Minimum Information about a biosynthetic gene cluster. Nat. Chem. Biol. 2015; 11: 625-631, 22. Terlouw B.R. et al. MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters. Nucleic Acids Res. 2023; 51: D603-D610]. 
Large databases of putatively detected BGCs are available for comparisons (e.g., antiSMASH-DB [23. Blin K. et al. The antiSMASH database version 3: increased taxonomic coverage and new query features for modular enzymes. Nucleic Acids Res. 2021; 49: D639-D643], IMG-ABC [24. Palaniappan K. et al. IMG-ABC v.5.0: an update to the IMG/Atlas of Biosynthetic Gene Clusters Knowledgebase. Nucleic Acids Res. 2020; 48: D422-D430]). On the basis of the observation that similar BGCs often produce similar compounds, BGCs can be further grouped into so-called gene cluster families (GCFs). In GCFs, annotations of identified BGCs can be propagated to their neighbors in the network, which allows one to formulate hypotheses about their encoded products [25. Navarro-Muñoz J.C. et al. A computational framework to explore large-scale biosynthetic diversity. Nat. Chem. Biol. 2020; 16: 60-68, 26. Doroghazi J.R. et al. A roadmap for natural product discovery based on large-scale genomics and metabolomics. Nat. Chem. Biol. 2014; 10: 963-968]. Furthermore, subcluster analysis can predict putative substructures of the encoded (unknown) metabolites [27. Del Carratore F. et al. Computational identification of co-evolving multi-gene modules in microbial biosynthetic gene clusters. Commun. Biol. 2019; 2: 83, 28. Louwen J.J.R. et al. iPRESTO: Automated discovery of biosynthetic sub-clusters linked to specific natural product substructures. PLoS Comput. Biol. 2023; 19e1010462]. Therefore, genome mining allows automated assessment of the ‘theoretical’ biosynthetic capacity encoded in a microbial genome (i.e., the ‘biosynthetic blueprint’) and its comparison with the existing body of knowledge [15. Medema M.H. et al. Mining genomes to illuminate the specialized chemistry of life. Nat. Rev. Genet. 
2021; 22: 553-571]. Genome mining is also suitable for the detection of RiPP BGCs: antiSMASH can detect at least 28 different classes of RiPPs [18. Blin K. et al. antiSMASH 7.0: new and improved predictions for detection, regulation, chemical structures and visualisation. Nucleic Acids Res. 2023; (Published online May 4, 2023. https://doi.org/10.1093/nar/gkad344)], whereas RiPP-PRISM can detect no fewer than 21 different classes [9. Skinnider M.A. et al. Genomic charting of ribosomally synthesized natural product chemical space facilitates targeted mining. Proc. Natl. Acad. Sci. U. S. A. 2016; 113: E6343-E6351]. In the antiSMASH database (version 3), the 14 most abundant classes of RiPPs amount to at least 44 000 predicted RiPP BGCs across publicly available bacterial, archaeal, and fungal reference genomes [23. Blin K. et al. The antiSMASH database version 3: increased taxonomic coverage and new query features for modular enzymes. Nucleic Acids Res. 2021; 49: D639-D643]. Once an RiPP class is described, the involved enzymatic machinery can easily be detected by gene homology-based approaches. However, genome mining for completely novel RiPP classes is much more challenging: because RiPP biosynthetic classes do not share universally conserved core enzymes or motif sequences, novel classes remain ‘invisible’ to rule-based genome mining tools. Furthermore, RiPP structural genes encoding precursor peptides can be extremely short: the smallest reported structural gene [bytA, encoding the biarylitide YYH (Figure 1A, structure 2) precursor] is only 18 base pairs long, making it also the shortest known coding gene [11. Zdouc M.M. et al. A biaryl-linked tripeptide from Planomonospora reveals a widespread class of minimal RiPP gene clusters. Cell Chem. Biol. 
2021; 28: 733-739.e4]. Considering all possible short open reading frames in a genome may lead to a prohibitively high number of potential candidates, including many false positives, whereas defining a minimal gene length for structural peptides may exclude novel classes of short RiPPs. To address the limitations of homology-dependent BGC detection, tools using alternative concepts for BGC detection were developed: besides tools using concepts applicable to all classes of microbial BGCs, such as ClusterFinder [29. Cimermancic P. et al. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell. 2014; 158: 412-421], EvoMining [30. Sélem-Mojica N. et al. EvoMining reveals the origin and fate of natural product biosynthetic enzymes. Microb. Genom. 2019; 5e000260], or DeepBGC [31. Hannigan G.D. et al. A deep learning genome-mining strategy for biosynthetic gene cluster prediction. Nucleic Acids Res. 2019; 47e110], a few tools have been designed specifically for the detection of novel RiPP BGCs. The tool DeepRiPP uses a deep-learning approach based on natural language processing (NLPPrecursor) to identify new RiPP precursor peptides linked to known classes [32. Merwin N.J. et al. DeepRiPP integrates multiomics data to automate discovery of novel ribosomally synthesized natural products. Proc. Natl. Acad. Sci. U. S. A. 2020; 117: 371-380]. Similarly, NeuRiPP uses a deep neural network architecture to recognize RiPP structural genes independent of their biosynthetic class [33. de Los Santos E.L.C. NeuRiPP: Neural network identification of RiPP precursor peptides. Sci. Rep. 2019; 9: 13406]. 
Another tool, decRiPPter, uses a support vector machine and a set of rules to differentiate putative RiPP precursor peptides from small noncoding genes [34. Kloosterman A.M. et al. Expansion of RiPP biosynthetic space through integration of pan-genomics and machine learning uncovers a novel class of lanthipeptides. PLoS Biol. 2020; 18e3001026]. A drawback of such homology-independent methods is their high rate of false-positive detection owing to the lack of indicative signature enzymes, which requires extensive manual follow-up validation [34. Kloosterman A.M. et al. Expansion of RiPP biosynthetic space through integration of pan-genomics and machine learning uncovers a novel class of lanthipeptides. PLoS Biol. 2020; 18e3001026]. One strategy to reduce false positives and to improve throughput in the discovery of novel classes of RiPPs is to validate predictions from genome mining via detection of products using liquid chromatography–tandem mass spectrometry (LC-MS/MS)-based metabolomics [35. van der Hooft J.J.J. et al. Linking genomics and metabolomics to chart specialized metabolic diversity. Chem. Soc. Rev. 2020; 49: 3297-3314]. In LC-MS/MS analysis, NPs are separated, ionized, and fragmented by collisional dissociation. In the resulting tandem mass (MS/MS) fragmentation spectra, individual fragments typically correspond to parts of the parent molecule structures (i.e., substructures). This makes MS/MS spectra useful for diagnostic purposes, such as the annotation of substructures and the identification of the chemical compound class [36. Niessen W.M.A. et al. Interpretation of MS-MS Mass Spectra of Drugs and Pesticides. John Wiley & Sons, 2017, 37. Beniddir M.A. et al. Advances in decomposing complex metabolite mixtures using substructure- and network-based computational metabolomics approaches. Nat. Prod. Rep. 
2021; 38: 1967-1993, 38. van der Hooft J.J.J. et al. Topic modeling for untargeted substructure exploration in metabolomics. Proc. Natl. Acad. Sci. U. S. A. 2016; 113: 13738-13743, 39. de Jonge N.F. et al. MS2Query: reliable and scalable MS2 mass spectra-based analogue search. Nat. Commun. 2023; 14: 1752, 40. Ernst M. et al. MolNetEnhancer: enhanced molecular networks by integrating metabolome mining and annotation tools. Metabolites. 2019; 9: 144, 41. Dührkop K. et al. Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra. Nat. Biotechnol. 2021; 39: 462-471]. MS/MS fragmentation spectra can also be considered as characteristic molecular fingerprints, with similar molecules usually showing similar MS/MS fragmentation. Modification-tolerant matching of spectra allows clustering of molecules into networks based on MS/MS spectral similarity [also known as ‘molecular families’ (MFs)], thereby organizing data and propagating annotations [42. Bandeira N. Spectral networks: a new approach to de novo discovery of protein sequences and posttranslational modifications. Biotechniques. 2007; 42: 687-695, 43. Watrous J. et al. Mass spectral molecular networking of living microbial colonies. Proc. Natl. Acad. Sci. U. S. A. 2012; 109: E1743-E1752, 44. Wang M. et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat. Biotechnol. 2016; 34: 828-837, 45. Gurevich A. et al. Increased diversity of peptidic natural products revealed by modification-tolerant database search of mass spectra. Nat. Microbiol. 2018; 3: 319-327]. 
Therefore, experimentally observed NPs can be annotated and ‘mapped’ back to BGCs to confirm initial predictions. This matching also allows one to prioritize BGCs that show expression over those that do not (many BGCs are ‘silent’ under laboratory conditions). Hence, genomic and metabolomic data are complementary in forming and confirming hypotheses and reducing false positives. Such integrated metabolomics and genomics data are generally referred to as paired omics datasets [35. van der Hooft J.J.J. et al. Linking genomics and metabolomics to chart specialized metabolic diversity. Chem. Soc. Rev. 2020; 49: 3297-3314]. In recent years, different tools for the processing and analysis of paired omics datasets have been developed [35. van der Hooft J.J.J. et al. Linking genomics and metabolomics to chart specialized metabolic diversity. Chem. Soc. Rev. 2020; 49: 3297-3314, 46. Soldatou S. et al. Linking biosynthetic and chemical space to accelerate microbial secondary metabolite discovery. FEMS Microbiol. Lett. 2019; 366fnz142, 47. Caesar L.K. et al. Metabolomics and genomics in natural products research: complementary tools for targeting new chemical entities. Nat. Prod. Rep. 2021; 38: 2041-2065, 48. Louwen J.J.R., van der Hooft J.J.J. Comprehensive large-scale integrative analysis of omics data to accelerate specialized metabolite discovery. mSystems. 2021; 6e0072621]. We first survey generalist tools that are also applicable to RiPP NPs, followed by tools that are specifically designed for the analysis of RiPPs (see overview in Table 1). 
We limit our discussion to tools that require both genomics and metabolomics data as input.
Table 1. Recently developed paired genomics and metabolomics software addressing ribosomally synthesized and post-translationally modified peptides, with several key factors to consider upon their use
Tool [latest version] | Approach | RiPP specific? | Open source | Free academic license? | Note | Refs
Ripp2Path [2016] | Feature-based | Yes | Yes | Yes | Part of Pep2Path package | [56. Medema M.H. et al. Pep2Path: automated mass spectrometry-guided genome mining of peptidic natural products. PLoS Comput. Biol. 2014; 10e1003822]
RippQuest [2014] | Feature-based | Yes | No | – | Superseded by MetaMiner | [57. Mohimani H. et al. Automated genome mining of ribosomal peptide natural products. ACS Chem. Biol. 2014; 9: 1545-1551]
MetaMiner [2019] | Feature-based | Yes | No | Yes | NPDtools package, GNPS website | [58. Cao L. et al. MetaMiner: a scalable peptidogenomics approach for discovery of ribosomal peptide natural products with blind modifications from microbial communities. Cell Syst. 2019; 9: 600-608.e4]
DeepRiPP [2021] | Feature-based | Yes | No | Yes (a) | – | [32. Merwin N.J. et al. DeepRiPP integrates multiomics data to automate discovery of novel ribosomally synthesized natural products. Proc. Natl. Acad. Sci. U. S. A. 2020; 117: 371-380]
Metabologenomics [2023] | Correlation-based | No | No | No | No public release of program | [26. Doroghazi J.R. et al. A roadmap for natural product discovery based on large-scale genomics and metabolomics. Nat. Chem. Biol. 2014; 10: 963-968, 63. Caesar L.K. et al. Correlative metabologenomics of 110 fungi reveals metabolite-gene cluster pairs. Nat. Chem. Biol. 2023; (Published online March 6, 2023. https://doi.org/10.1038/s41589-023-01276-8)]
NPLinker [2023] | Hybrid | No | Yes | Yes | Undergoing refactoring, see https://github.com/NPLinker/nplinker | [50. Hjörleifsson Eldjárn G. et al. Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions. PLoS Comput. Biol. 2021; 17e1008920]
NPOmix [2022] | Hybrid | No | Yes | Yes | Input must be similar to reference database; undergoing refactoring, https://github.com/tiagolbiotech/NPOmix_python | [52. Leão T.F. et al. NPOmix: a machine learning classifier to connect mass spectrometry fragmentation data to biosynthetic gene clusters. PNAS Nexus. 2022; 1gac257]
(a) Requires login and approval of extensive end user license agreement.
Generalist tools pair BGCs to MS/MS spectra by relying on information that is applicable to all biosynthetic classes [35. van der Hooft J.J.J. et al. Linking genomics and metabolomics to chart specialized metabolic diversity. Chem. Soc. Rev. 2020; 49: 3297-3314]. A common strategy is the analysis of presence–absence patterns of BGCs and MS/MS spectra associated with microbial strains, so-called strain-correlation-based approaches (Figure 2). BGCs and MS/MS spectra are first organized into GCFs and MFs, respectively, using different clustering tools. Therefore, GCFs and MFs can each be traced back to sets of strains, allowing the calculation of linking scores based on strain overlap [35. van der Hooft J.J.J. et al. Linking genomics and metabolomics to chart specialized metabolic diversity. Chem. Soc. Rev. 2020; 49: 3297-3314]. 
Such a generalist approach was first introduced under the name ‘metabologenomics’ by Doroghazi and colleagues, who matched GCFs to detected molecules using a point-based system relying on strain contribution, followed by manual verification of the putative links [26. Doroghazi J.R. et al. A roadmap for natural product discovery based on large-scale genomics and metabolomics. Nat. Chem. Biol. 2014; 10: 963-968]. Similarly, Duncan and colleagues applied ‘pattern-based genome mining’, which relied on a manual comparison of the presence–absence patterns of GCFs and MFs [49. Duncan K.R. et al. Molecular networking and pattern-based genome mining improves discovery of biosynthetic gene clusters and their products from Salinispora species. Chem. Biol. 2015; 22: 460-471]. Some other generalist tools use a ‘hybrid’ approach by combining both correlation- and feature-based concepts in pairing. One of them, NPLinker [50. Hjörleifsson Eldjárn G. et al. Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions. PLoS Comput. Biol. 2021; 17e1008920], expands and refines the scoring algorithm first introduced by the ‘metabologenomics’ approach [26. Doroghazi J.R. et al. A roadmap for natural product discovery based on large-scale genomics and metabolomics. Nat. Chem. Biol. 2014; 10: 963-968] and combines it with the feature-based IOKR score, which calculates binary molecule fingerprints from MS/MS fragmentation spectra and structures predicted from BGCs for improved pairing. Recently, NPLinker was enhanced by a new scoring function called ‘NPClassScore’, which uses chemical compound classes predicted from BGCs and MS/MS fragmentation patterns to eliminate a substantial number of false-positive BGC-MS/MS links [51. Louwen J.J.R. 
et al. Enhanced correlation-based linking of biosynthetic gene clusters to their metabolic products through chemical class matching. Microbiome. 2023; 11: 13]. Another hybrid tool is NPOmix, which uses a k-nearest neighbor-based classifier to compare similarity fingerprints calculated from the association of microbial strains to GCFs and MFs. NPOmix further uses information regarding predicted molecular substructures and biosynthetic class to supplement the classifier-based score [52. Leão T.F. et al. NPOmix: a machine learning classifier to connect mass spectrometry fragmentation data to biosynthetic gene clusters. PNAS Nexus. 2022; 1gac257]. Intuitive and generally applicable, these correlation-based concepts were used in manual or semiautomated fashion for the discovery of new RiPPs from known classes, such as the chymotrypsin inhibitor microviridin 1777 [53. Sieber S. et al. Microviridin 1777: a toxic chymotrypsin inhibitor discovered by a metabologenomic approach. J. Nat. Prod. 2020; 83: 438-446] or new congeners of the antibiotic siomycin [54.Zdouc M.M" @default.
- W4382561729 created "2023-06-30" @default.
- W4382561729 creator A5034845813 @default.
- W4382561729 creator A5075156076 @default.
- W4382561729 creator A5078965540 @default.
- W4382561729 date "2023-08-01" @default.
- W4382561729 modified "2023-10-18" @default.
- W4382561729 title "Metabolome-guided genome mining of RiPP natural products" @default.
- W4382561729 cites W1960974681 @default.
- W4382561729 cites W1990935073 @default.
- W4382561729 cites W2009257824 @default.
- W4382561729 cites W2013455629 @default.
- W4382561729 cites W2041537722 @default.
- W4382561729 cites W2043472907 @default.
- W4382561729 cites W2052591459 @default.
- W4382561729 cites W2069928158 @default.
- W4382561729 cites W2079228339 @default.
- W4382561729 cites W2101560381 @default.
- W4382561729 cites W2135639274 @default.
- W4382561729 cites W2137957056 @default.
- W4382561729 cites W2144803530 @default.
- W4382561729 cites W2162854463 @default.
- W4382561729 cites W2172592212 @default.
- W4382561729 cites W2187341651 @default.
- W4382561729 cites W2302501749 @default.
- W4382561729 cites W2410975458 @default.
- W4382561729 cites W2504691963 @default.
- W4382561729 cites W2528897472 @default.
- W4382561729 cites W2551876238 @default.
- W4382561729 cites W2793924581 @default.
- W4382561729 cites W2893437991 @default.
- W4382561729 cites W2895341279 @default.
- W4382561729 cites W2912880347 @default.
- W4382561729 cites W2955342277 @default.
- W4382561729 cites W2956419562 @default.
- W4382561729 cites W2967688728 @default.
- W4382561729 cites W2972473680 @default.
- W4382561729 cites W2981262779 @default.
- W4382561729 cites W2989562929 @default.
- W4382561729 cites W2990471452 @default.
- W4382561729 cites W2997918501 @default.
- W4382561729 cites W3003751929 @default.
- W4382561729 cites W3011366455 @default.
- W4382561729 cites W3024192869 @default.
- W4382561729 cites W3087348949 @default.
- W4382561729 cites W3096851604 @default.
- W4382561729 cites W3107629351 @default.
- W4382561729 cites W3108604517 @default.
- W4382561729 cites W3111861200 @default.
- W4382561729 cites W3117886555 @default.
- W4382561729 cites W3123030036 @default.
- W4382561729 cites W3131765257 @default.
- W4382561729 cites W3158208374 @default.
- W4382561729 cites W3165511231 @default.
- W4382561729 cites W3176450507 @default.
- W4382561729 cites W3194396730 @default.
- W4382561729 cites W3196903026 @default.
- W4382561729 cites W3204672378 @default.
- W4382561729 cites W3206284847 @default.
- W4382561729 cites W3210489033 @default.
- W4382561729 cites W4200419512 @default.
- W4382561729 cites W4206946384 @default.
- W4382561729 cites W4225416490 @default.
- W4382561729 cites W4226342948 @default.
- W4382561729 cites W4281688302 @default.
- W4382561729 cites W4290660717 @default.
- W4382561729 cites W4306409651 @default.
- W4382561729 cites W4309305948 @default.
- W4382561729 cites W4309976130 @default.
- W4382561729 cites W4310780836 @default.
- W4382561729 cites W4311640314 @default.
- W4382561729 cites W4312045084 @default.
- W4382561729 cites W4317811065 @default.
- W4382561729 cites W4319662820 @default.
- W4382561729 cites W4323320079 @default.
- W4382561729 cites W4327709669 @default.
- W4382561729 cites W4361292278 @default.
- W4382561729 cites W4368358469 @default.
- W4382561729 doi "https://doi.org/10.1016/j.tips.2023.06.004" @default.
- W4382561729 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/37391295" @default.
- W4382561729 hasPublicationYear "2023" @default.
- W4382561729 type Work @default.
- W4382561729 citedByCount "1" @default.
- W4382561729 crossrefType "journal-article" @default.
- W4382561729 hasAuthorship W4382561729A5034845813 @default.
- W4382561729 hasAuthorship W4382561729A5075156076 @default.
- W4382561729 hasAuthorship W4382561729A5078965540 @default.
- W4382561729 hasBestOaLocation W43825617291 @default.
- W4382561729 hasConcept C104317684 @default.
- W4382561729 hasConcept C135870905 @default.
- W4382561729 hasConcept C141231307 @default.
- W4382561729 hasConcept C151730666 @default.
- W4382561729 hasConcept C21565614 @default.
- W4382561729 hasConcept C2776608160 @default.
- W4382561729 hasConcept C41008148 @default.
- W4382561729 hasConcept C54355233 @default.
- W4382561729 hasConcept C60644358 @default.
- W4382561729 hasConcept C70721500 @default.