
Metagenomics Workshop Overview: Data

Features of the dataset

The dataset in Zenodo contains two files:

A compressed file with three samples from Cuatro Ciénegas sediments and their corresponding taxonomic assignment files.
MGRAST_MetaData_JP.xlsx: metadata about our three Cuatro Ciénegas samples, as accepted by MG-RAST.

The compressed file contains the following directories and files.

Figure: a tree structure showing the directories and files contained in the compressed file.

The directories are: hidden, data, mags and taxonomy.
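To check that an extracted copy of the dataset matches the layout described above, a directory tree can be printed with a few lines of Python. This is only a minimal sketch: the function name render_tree and the example path are hypothetical and are not part of the lesson materials.

```python
from pathlib import Path


def render_tree(root, prefix=""):
    """Return a list of text lines drawing the directory tree under root."""
    lines = []
    # Sort by name so the output is stable between runs.
    entries = sorted(Path(root).iterdir(), key=lambda p: p.name)
    for index, entry in enumerate(entries):
        is_last = index == len(entries) - 1
        connector = "└── " if is_last else "├── "
        lines.append(prefix + connector + entry.name)
        if entry.is_dir():
            # Keep a vertical bar under entries that still have siblings below.
            extension = "    " if is_last else "│   "
            lines.extend(render_tree(entry, prefix + extension))
    return lines


# Hypothetical usage; replace the path with wherever the dataset was extracted:
# print("\n".join(render_tree("path/to/extracted_dataset")))
```

Because iterdir() also returns entries whose names start with a dot, the hidden file in the hidden directory appears in this listing.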

Directory hidden

hidden contains a hidden file that is used in episode 03, Navigating Files and Directories, of the lesson Introduction to the Command Line for Metagenomics, where learners discover how to find hidden files.
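On Unix-like systems a file is hidden when its name begins with a dot, which is why a plain directory listing does not show it (in the shell, ls -a does). The same idea can be sketched in Python as follows; the function name find_hidden is hypothetical, and this is an illustration rather than the method taught in the lesson.

```python
from pathlib import Path


def find_hidden(directory):
    """Return the sorted names of entries in directory that start with a dot."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.name.startswith(".")
    )


# Hypothetical usage; the path depends on where the dataset was extracted:
# print(find_hidden("path/to/extracted_dataset/hidden"))
```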

Directory data

data contains four FASTQ files from two samples: JC1A and JP4D. These files are the input to the FastQC tool in episodes two and three of the lesson Data Processing and Visualization for Metagenomics: Assessing Read Quality, and Trimming and Filtering. In these episodes, learners remove low-quality nucleotides and prepare the files for assembly and taxonomic assignment.
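To make "low-quality nucleotides" concrete, the sketch below reads FASTQ records and counts the bases whose quality score falls under a threshold. It is not FastQC and does not reproduce its reports. It assumes four lines per record and Phred+33 quality encoding; the threshold of 20, the function names, and the two toy reads are all invented for illustration.

```python
import io


def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file (or a blank line) stops the iteration
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # separator line that starts with '+'
        quality = handle.readline().rstrip("\n")
        yield header, sequence, quality


def phred_scores(quality, offset=33):
    """Convert a quality string to integer scores, assuming Phred+33 encoding."""
    return [ord(char) - offset for char in quality]


def summarize(handle, min_quality=20):
    """Count reads, bases, and bases scoring below min_quality."""
    reads = bases = low = 0
    for _, _, quality in read_fastq(handle):
        scores = phred_scores(quality)
        reads += 1
        bases += len(scores)
        low += sum(1 for score in scores if score < min_quality)
    return {"reads": reads, "bases": bases, "low_quality_bases": low}


# Toy input, not taken from the JC1A or JP4D samples.
example = io.StringIO("@read1\nACGT\n+\nIII#\n@read2\nGG\n+\n!I\n")
print(summarize(example))
# prints {'reads': 2, 'bases': 6, 'low_quality_bases': 2}
```

For a real file, the same function can be given an open file handle. If the file is gzip-compressed, gzip.open(path, "rt") from the standard library returns a compatible text handle.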

Directory mags

mags contains the assembly of the JP4D sample.

Directory taxonomy

Since Kraken2 won’t be run in the lesson, this directory contains the taxonomic assignments obtained by running Kraken2 on the trimmed reads.
From these files, users can obtain biom files that will be the input for the analysis and visualization of abundance in R.

Finally, it contains a subdirectory with the taxonomic assignment of the first bin from sample JP4D.
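As a rough illustration of what the taxonomic assignment files carry, the sketch below parses a Kraken2-style report. It assumes the standard six-column, tab-separated report layout (percentage of reads in the clade, reads in the clade, reads assigned directly to the taxon, rank code, NCBI taxonomy ID, indented name); the files in this directory may differ from that layout. The function names and the toy report values are invented, and the sketch does not write a biom file.

```python
import io


def parse_kraken_report(handle):
    """Parse a six-column, tab-separated Kraken2-style report into dictionaries."""
    records = []
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(f"expected 6 tab-separated fields, got {len(fields)}")
        percent, clade_reads, direct_reads, rank, taxid, name = fields
        records.append({
            "percent": float(percent),
            "clade_reads": int(clade_reads),
            "direct_reads": int(direct_reads),
            "rank": rank.strip(),
            "taxid": int(taxid),
            "name": name.strip(),  # drop the indentation that encodes tree depth
        })
    return records


def reads_by_rank(records, rank_code):
    """Map taxon name to clade read count for one rank code (for example 'P')."""
    return {r["name"]: r["clade_reads"] for r in records if r["rank"] == rank_code}


# Toy report with invented counts; it is not output from the workshop samples.
TOY_REPORT = (
    " 10.00\t10\t10\tU\t0\tunclassified\n"
    " 90.00\t90\t0\tR\t1\troot\n"
    " 60.00\t60\t5\tP\t1224\t    Proteobacteria\n"
    " 30.00\t30\t2\tP\t1239\t    Firmicutes\n"
)
records = parse_kraken_report(io.StringIO(TOY_REPORT))
print(reads_by_rank(records, "P"))
# prints {'Proteobacteria': 60, 'Firmicutes': 30}
```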

Introduction to the dataset


This workshop uses data from an environmental experiment: Genomic adaptations in information processing underpin trophic strategy in a whole-ecosystem nutrient enrichment experiment, by Jordan G. Okie et al. (2020). In this research, the authors compared the microbial community in its natural, oligotrophic, phosphorus-deficient environment, a pond in the Cuatro Ciénegas Basin (CCB), with the same microbial community under a fertilization treatment.

All of the data used in this workshop can be downloaded from Zenodo through the dataset's DOI.


Okie, J. G., Poret-Peterson, A. T., Lee, Z. M. P., Richter, A., Alcaraz, L. D., Eguiarte, L. E., Siefert, J. L., Souza, V., Dupont, C. L., & Elser, J. J. (2020). Genomic adaptations in information processing underpin trophic strategy in a whole-ecosystem nutrient enrichment experiment. eLife, 9.