This lesson is still being designed and assembled (Pre-Alpha version)

EukRef Pipeline

The PR2 database was initiated in 2010 within the framework of the BioMarks project, building on work carried out over the previous ten years in the Plankton Group of the Station Biologique of Roscoff. Its aim is to provide a reference database of carefully annotated 18S rRNA sequences using eight taxonomic fields (from kingdom to species). At present it contains about 184,000 sequences. Metadata fields are available for many sequences, including geolocation, whether the sequence originates from a culture or a natural sample, and host type. The annotation of PR2 is performed by experts on each taxonomic group. One very important project in this respect is EukRef, which has recently decided to merge its efforts with PR2. EukRef has built bioinformatics pipelines that have been used during three workshops dedicated to specific taxonomic groups. For example, part of the ciliate annotation originates from the first EukRef workshop.
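
To make the eight-field annotation concrete, here is a minimal Python sketch that splits a delimited lineage string into one entry per rank. It is an illustration under stated assumptions, not the PR2 or EukRef tooling: the text above only says that there are eight fields running from kingdom to species, so the six intermediate rank names, the semicolon delimiter, the function name `parse_lineage`, and the example lineage are all assumptions chosen for illustration.

```python
from typing import Dict

# ASSUMPTION: only "kingdom" and "species" come from the text above; the six
# intermediate rank names are illustrative and may differ between releases.
RANKS = [
    "kingdom", "supergroup", "division", "class",
    "order", "family", "genus", "species",
]


def parse_lineage(lineage: str, sep: str = ";") -> Dict[str, str]:
    """Split a delimited lineage string into a rank -> taxon name mapping.

    Raises ValueError when the number of fields does not match RANKS, so that
    incomplete annotations are flagged instead of silently shifted.
    """
    # Drop surrounding whitespace and any leading/trailing separator.
    names = [name.strip() for name in lineage.strip().strip(sep).split(sep)]
    if len(names) != len(RANKS):
        raise ValueError(f"expected {len(RANKS)} fields, got {len(names)}")
    return dict(zip(RANKS, names))


# Hypothetical lineage written for this example, not copied from the database.
example = (
    "Eukaryota;Alveolata;Ciliophora;Spirotrichea;"
    "Euplotia;Euplotidae;Euplotes;Euplotes_vannus"
)
record = parse_lineage(example)
print(record["kingdom"], record["species"])
```

Rejecting lineages that do not have exactly eight fields is a deliberate choice in this sketch: a missing rank would otherwise shift every lower-rank name into the wrong field.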


This lesson is intended to be used by microbial ecologists or genomicists at the doctoral level or above.


Setup Download files required for the lesson
00:00 1. Getting Started Key question (FIXME)
00:00 2. Retrieve an Initial Set of Sequences and Cluster Key question (FIXME)
00:00 3. Building Initial Alignment Key question (FIXME)
00:00 4. Build an Initial Tree Key question (FIXME)
00:00 5. Download Databases Key question (FIXME)
00:00 6. Retrieve All Sequences That Belong To Your Clade Key question (FIXME)
00:00 7. Build an Alignment with the Reference Sequences Key question (FIXME)
00:00 8. Build RAxML Trees and Clean Up Key question (FIXME)
00:00 9. Visualize Your Tree and Remove Errant Sequences Key question (FIXME)
00:00 10. Build Reference Tree Key question (FIXME)
00:00 11. Annotation Key question (FIXME)
00:00 12. Getting Started Key question (FIXME)
00:00 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.