This lesson is still being designed and assembled (Pre-Alpha version)

Other Resources

Overview

Teaching: 10 min
Exercises: 10 min
Questions
  • What else can I do?

Objectives
  • First learning objective. (FIXME)

Other resources

We have seen how to conduct a genome mining project with some particular characteristics. However, each research project may require deviations from these workflows. You may need to use RNA-Seq data, to search only for specific domains instead of complete BGCs, or to combine metabolomics or proteomics with genomic data. Here we provide a list of resources that are also useful for genome mining projects.

Assemblers

First, since BGCs usually include genes with many repeat regions (mainly NRPSs and PKSs), genome assemblers cannot always reconstruct them and sometimes split a BGC into two contigs. To overcome this difficulty, you can use biosyntheticSPAdes to assemble your reads into complete BGCs. This algorithm is implemented within the SPAdes tool.
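As a minimal sketch of how such an assembly could be launched, the snippet below builds the command line for a paired-end run. It assumes that SPAdes is installed and that biosyntheticSPAdes mode is selected with the `--bio` flag of `spades.py`; the function name, file names and output directory are hypothetical placeholders, so check the SPAdes manual for your installed version before running it.

```python
def build_bio_spades_command(reads_1, reads_2, out_dir):
    """Return the argument list for a paired-end biosyntheticSPAdes run.

    Assumption: the installed SPAdes version supports the --bio mode.
    """
    return [
        "spades.py",
        "--bio",              # biosyntheticSPAdes mode
        "-1", str(reads_1),   # forward reads
        "-2", str(reads_2),   # reverse reads
        "-o", str(out_dir),   # output directory
    ]


# Hypothetical file names, used only for illustration.
command = build_bio_spades_command(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz", "bio_assembly"
)
print(" ".join(command))
```

The list can be passed to `subprocess.run(command, check=True)` once SPAdes is available on your PATH. Building the command as a list rather than a single string avoids quoting problems with unusual file names.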

Other genome mining programs

Depending on your interests, you can use alternative programs to search for BGCs in your genomes. The Secondary Metabolite Bioinformatics Portal lists and introduces many bioinformatic tools that pursue this goal. For example, CLUSEAN and NaPDoS detect PKS and NRPS domains, while BAGEL, RODEO and RiPPMiner are dedicated to finding ribosomally synthesized and post-translationally modified peptides (RiPPs). As a further step, ARTS prioritizes BGCs previously detected with antiSMASH and identifies drug targets from these data.

There are also tools and workflows dedicated to finding new biosynthetic systems (new types of BGCs or biosynthetic genes). Among them, as explained in previous lessons, EvoMining is a promising tool for detecting biosynthetic enzymes that may have evolved from performing core functions (‘central’ or ‘primary’ metabolism) to carrying out specialized functions (secondary metabolism). Likewise, ClusterFinder predicts uncharacterized BGCs in genomes through different algorithms.

Tools for metabolomics

You can also try to link diverse -omics data with genome predictions. SeMa-Trap uses RNA-Seq data to find co-expression patterns between certain genes and BGCs. Similarly, if you need to combine metabolomic or proteomic data with putative BGC products, you can use Pep2Path, MetaMiner or GNPS, among others.
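To make the idea of co-expression concrete, here is a toy sketch that flags genes whose expression profile across samples correlates strongly with that of a BGC. It is not SeMa-Trap's method; it only illustrates the concept with a plain Pearson correlation. The function names, the 0.9 threshold and the expression values are all made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return cov / (sd_x * sd_y)


def coexpressed_genes(bgc_profile, gene_profiles, threshold=0.9):
    """Return, sorted by name, genes correlated with the BGC profile."""
    return sorted(
        gene
        for gene, profile in gene_profiles.items()
        if pearson(bgc_profile, profile) >= threshold
    )


# Made-up expression values across four hypothetical samples.
bgc_profile = [1.0, 2.0, 3.0, 4.0]
gene_profiles = {
    "geneA": [2.0, 4.0, 6.0, 8.0],  # rises with the BGC
    "geneB": [4.0, 3.0, 2.0, 1.0],  # falls as the BGC rises
    "geneC": [1.0, 1.0, 1.0, 1.0],  # constant expression
}
print(coexpressed_genes(bgc_profile, gene_profiles))
```

With these toy values only geneA passes the threshold. Real analyses need normalized counts, many more samples and a correction for multiple testing.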

Phylogeny based tools

Finally, to prioritize genomes or species, autoMLST allows you not only to find strains or species that are phylogenetically close to your input genome but also to explore the biosynthetic potential of those relatives.

Index of novelty

Novelty Index

Visualize annotated genomes

CGView

Specialized databases

Massive studies

Carpentries Philosophy

A good lesson should be complete and transparent, and easy for any instructor to teach. Carpentries lessons are developed for the community, and you are now part of it. This lesson is still being developed, and we are sure that you can collaborate and help us improve it.

Key Points

  • First key point. Brief Answer to questions. (FIXME)