This lesson is in the early stages of development (Alpha version)

Pangenomics Workshop Overview

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time and with less pain. This workshop uses Data Carpentry’s approach to teach data management for pangenome analysis in prokaryotes, including best practices for organizing bioinformatics projects and data, using command-line tools to compare genetic diversity between genomes, organizing this diversity into the core and dispensable sets of a pangenome, and connecting to and using cloud computing. This workshop is designed to be taught over two full days of instruction.

Please note that the workshop materials for Pangenome Analysis in Prokaryotes are in “beta” development. These lessons are available for review and informal teaching experiences but are not yet part of The Carpentries’ official lesson offerings.

Would you be interested in teaching these materials? We have a Slack channel, where we will gladly help you!

Frequently Asked Questions

Read our FAQ to learn more about Data Carpentry’s Pangenomics Workshop, as an Instructor or a workshop host.

Getting Started

This lesson assumes that learners have no prior experience with the tools covered in the workshop. However, learners are expected to have some familiarity with biological concepts, including the concepts of prokaryotic genome and gene family. Participants should bring their laptops and plan to participate actively.

Data

This workshop uses data from the study “Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial ‘pan-genome’”, PNAS 2005, doi: 10.1073/pnas.0506758102. While studying the available genomes of S. agalactiae, Tettelin and collaborators discovered intra-species genome variation: strains of the same species differ in gene content, so one genome is not enough to describe the genetic repertoire of a species. All of the data used in this workshop can be downloaded from: DOI. More information about this data is available on the Data page.
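The idea that one genome cannot describe a species can be illustrated with a small sketch. The Python snippet below is a toy example, not the workshop's actual pipeline: the strain names and gene family labels are hypothetical, and the only assumption is that each genome has already been reduced to the set of gene families it carries. It splits the families into the core set (present in every strain) and the dispensable set (missing from at least one strain).

```python
# Toy presence/absence data: strain name -> set of gene families it carries.
# Strain and family names are hypothetical, not taken from the workshop data.
genomes = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam3", "fam5"},
}


def partition_pangenome(genomes):
    """Split gene families into pangenome, core, and dispensable sets."""
    gene_sets = list(genomes.values())
    pangenome = set().union(*gene_sets)    # families seen in any strain
    core = set.intersection(*gene_sets)    # families seen in every strain
    dispensable = pangenome - core         # families absent from at least one strain
    return pangenome, core, dispensable


pangenome, core, dispensable = partition_pangenome(genomes)
print("pangenome:", sorted(pangenome))
print("core:", sorted(core))
print("dispensable:", sorted(dispensable))
```

In this toy data set only `fam1` is shared by all three strains, so any single strain misses at least two of the five families in the pangenome.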

Workshop Overview

| Lesson | Overview | Estimated time |
| --- | --- | --- |
| Introduction to the Command Line for Pangenomics | Learn to navigate your file system, create, copy, move, and remove files and directories, and automate repetitive tasks using scripts and wildcards. | 03:45 hrs |
| Introduction to Python | Learn the basics of the Python language. | 02:45 hrs |
| Pangenome Analysis in Prokaryotes | Use command-line tools to download and annotate prokaryotic genomes. Learn pangenome analyses and visualizations. | 05:40 hrs |
| Topological Data Analysis for Pangenomics | Learn how to apply Topological Data Analysis to pangenome analysis. | 03:45 hrs |

Teaching Platform

This workshop is designed to be run on pre-imaged Amazon Web Services (AWS) instances. All the software and data used in the workshop are hosted on an Amazon Machine Image (AMI). If you want to run your own instance of the server used for this workshop, follow the directions in the Setup tab.

Citation