This lesson is still being designed and assembled (Pre-Alpha version)

Genome Mining in Prokaryotes

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop uses Data Carpentry’s approach to teach data management and analysis for genome mining research, including: best practices for organizing bioinformatics projects and data, use of command-line utilities, use of command-line tools to analyze sequence quality, use of RStudio and R libraries to compare diversity between samples, and connecting to and using cloud computing.

Prerequisites

FIX ME

Data

This workshop uses data from the experiment: Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pan genome”, by Hervé Tettelin, Vega Masignani, Michael J. Cieslewicz, Claire M et al.

All of the data used in this workshop can be downloaded from: DOI. More information about this data is available on the Data page.

Workshop Overview

Schedule

Setup Download files required for the lesson
00:00 1. Introduction to Genome Mining What is Genome Mining?
00:10 2. Secondary metabolite biosynthetic gene cluster identification How can I annotate known BGCs?
What kinds of analysis can antiSMASH perform?
Which file extensions does antiSMASH accept?
00:40 3. Genome Mining Databases Where can I find experimentally validated BGCs?
Where is information about all predicted BGCs?
01:05 4. BGC Similarity Networks How can I measure similarity between BGCs?
01:50 5. Homologous BGC Clusterization How can I identify Gene Cluster Families?
How can I predict the production of similar metabolites?
How can I cluster BGCs into groups that produce similar metabolites?
How can I compare the metabolic capability of different bacterial lineages?
02:40 6. Finding Variation on Genomic Vicinities How can I follow variation in genomic vicinities given a reference BGC?
Which gene families are the conserved part of a BGC family?
03:15 7. Evolutionary Genome Mining What is Evolutionary Genome Mining?
What kinds of BGCs can EvoMining find?
What do I need in order to run an evolutionary genome mining analysis?
03:55 8. GATOR-GC: Genomic Assessment Tool for Orthologous Regions and Gene Clusters What is GATOR-GC, and how does it differ from other BGC exploration tools?
How does GATOR-GC establish BGC boundaries using evolutionary principles?
What types of biosynthetic diversity can GATOR-GC identify?
What do I need to perform a targeted exploration using GATOR-GC?
04:55 9. Metabolomics workshop How can I evaluate the similarity between MS spectra?
06:25 10. Other Resources What else can I do?
06:45 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.