This lesson is still being designed and assembled (Pre-Alpha version)

Visualize Your Tree and Remove Errant Sequences

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • Key question (FIXME)

Objectives
  • First learning objective. (FIXME)

Download your tree e.g. “RAxML_bestTree.clade”. Open tree in FigTree and annotate taxa with either “remove” for taxa you wish to remove, or you can modify/add to the existing name. You can only annotate taxa, NOT nodes or clades. You can select entire clades to annotate all taxa at once. Annotated nodes cannot be collapsed (annotations within the collapsed node will not be reported). “Save as” the tree with annotations for example “RAxML_bestTree.clade_annotated.tre”, which is a nexus file.

(NOTE: Add screenshots illustrating how to do the above in FigTree)

Inputs:

-t RAxML_bestTree.clade_annotated.tre -m metadata.txt

python eukref_treeparse.py -t RAxML_bestTree.clade_annotated.tre -m metadata.txt

Output: metadata_out.txt

The “metadata_out.txt” will be your tab-delimited metadata file with “remove” in the last column for the sequences you want to remove. You can sort your metadata_out.txt file in excel to get all of the accessions of sequences that should be removed from your tree. Save these accessions to be removed in a file called accessions_remove.txt.

Inputs:

-i current_DB.clustered.fasta -s accessions_remove.txt

python eukref_filter_fasta.py -i current_DB.clustered.fasta -s accessions_remove.txt -o current_DB.cleaned.fasta

Output:

current_DB.cleaned.fasta

Go back to Step 7 to align and build a new tree, repeat until you are happy with your current set of sequences.

Key Points

  • First key point. Brief Answer to questions. (FIXME)