There are several pieces of software you need to install before the workshop. Although installation help will be provided at the workshop, we recommend that you install (or at least download) these tools beforehand, particularly because Anaconda Python is a very large download.
Lesson Data Files
The files used in this lesson can be downloaded:
Once downloaded, please extract to the directory you wish to work in for all the
hands-on exercises. If you are working on the Linux or Mac command-line, you can
do something like the following. We assume you have downloaded the file to ~/Downloads:
cd
mkdir workflow-workshop
cd workflow-workshop
mv ../Downloads/workflow-engines-lesson.tar.gz .
tar xf workflow-engines-lesson.tar.gz
On Windows, please use the Windows Explorer to create a directory and extract the Zip file into it.
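If you would rather use one method on every operating system, the extraction step can also be done from Python. The following is a minimal sketch, not part of the lesson materials: the function name extract_lesson is hypothetical, and it relies on the standard library's shutil.unpack_archive, which chooses the archive format from the file extension (.tar.gz or .zip).

```python
import shutil
from pathlib import Path


def extract_lesson(archive, workdir):
    """Unpack a .tar.gz or .zip archive into workdir, creating it if needed.

    Hypothetical helper for illustration; returns the names of the
    top-level entries found in workdir after extraction.
    """
    workdir = Path(workdir).expanduser()
    workdir.mkdir(parents=True, exist_ok=True)
    # unpack_archive infers the format from the archive's file extension
    shutil.unpack_archive(str(Path(archive).expanduser()), str(workdir))
    return sorted(entry.name for entry in workdir.iterdir())
```

For example, `extract_lesson("~/Downloads/workflow-engines-lesson.tar.gz", "~/workflow-workshop")` would mirror the commands above, except that the archive is left in the download directory rather than moved.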
Solutions for most episodes can be found in the .solutions directory inside the code download.
A requirements.txt file is included in the download. As described below, this can be used to install the required Python packages.
Python 3 / Anaconda
You will need Python 3.6 or later to use the sample files, since they rely on some relatively new Python features. Instructions are given here for installing Anaconda, although any Python distribution that meets this version requirement will work.
- Visit the Anaconda download page
- Select your operating system (Windows, macOS, or Linux).
- Download the Python 3 64-bit graphical installer.
- After the download completes, run the installer to install Anaconda.
Follow the Installation Guide
If you need more detailed guidance, then please follow the Anaconda Installation guide.
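Whichever Python you end up with, you can confirm that it meets the 3.6 requirement stated above. The following is a small sketch; the function name python_is_new_enough is made up for this example.

```python
import sys

MINIMUM = (3, 6)  # the minimum version stated in this lesson


def python_is_new_enough(version=None, minimum=MINIMUM):
    """Return True if a (major, minor, ...) version tuple meets the minimum."""
    if version is None:
        version = sys.version_info
    # Compare only major and minor; tuples compare element by element
    return tuple(version[:2]) >= tuple(minimum)


status = "OK" if python_is_new_enough() else "too old for this lesson"
print("Python", sys.version.split()[0], status)
```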
Updating Anaconda
- Once Anaconda is installed, it is a good idea to update it.
- The article Keeping Anaconda Up To Date is a good guide to updating Anaconda after it is installed.
- It boils down to opening an Anaconda terminal and running the command:
conda update --all
Updating is not essential for this course
If you would rather skip the update, you can; the lesson works without it.
Additional Libraries
The example code also requires the matplotlib and numpy libraries. They are installed by default with Anaconda, but if you are using a different Python you may need to install them manually. Using pip, the command would be:
pip install --user numpy matplotlib
The example files download also contains a requirements.txt file that can be used to specify the required packages for a Python virtual environment. There are many guides to this process online, including the official documentation.
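To see whether the two libraries named above are already available in your Python, you can ask the import system without actually importing them. This is a minimal sketch; missing_packages is a hypothetical helper name, and importlib.util.find_spec is part of the standard library.

```python
import importlib.util


def missing_packages(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# numpy and matplotlib are the libraries this section says the examples need
print("Missing:", missing_packages(["numpy", "matplotlib"]) or "none")
```

If the list is not empty, the pip command above (or the requirements.txt route) should install whatever is missing.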
Snakemake
Once you have Python 3 available, you need to install the Snakemake library.
Anaconda
You can install Snakemake at an Anaconda prompt:
conda install -c bioconda -c conda-forge snakemake-minimal
Note
At the time of writing, the snakemake conda package was not installing correctly; however, the snakemake-minimal package was working. This lesson does not require any features beyond those included with the minimal install.
Vanilla Python
You can install Snakemake with:
pip install --user snakemake
If you used the supplied requirements.txt to create a Python virtual environment then snakemake should already be installed. For more information, please refer to the Snakemake installation documentation.
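After either install route, it is worth checking that the snakemake command is on your PATH and reports a version. The sketch below assumes only that the tool accepts a --version flag, which snakemake does; the helper name tool_version is hypothetical.

```python
import shutil
import subprocess


def tool_version(command, flag="--version"):
    """Return the text a command prints for its version flag.

    Returns None if the command cannot be found on PATH.
    """
    executable = shutil.which(command)
    if executable is None:
        return None
    result = subprocess.run(
        [executable, flag],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # some tools print the version to stderr
        universal_newlines=True,   # decode output as text (works on Python 3.6)
    )
    return result.stdout.strip()


print("snakemake:", tool_version("snakemake") or "not found on PATH")
```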
Windows-Specific Instructions
Some of the commands used in this lesson assume that some common Linux commands
are available. These include rm and tar. These commands are not available
by default on Windows systems. While there are many solutions, including using
the Windows Subsystem for Linux (WSL), a simple approach that we have tested is
to install Git for Windows. This installs a lightweight command-line environment
that contains the required Linux commands and can be configured to use Anaconda
Python.
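A quick way to find out whether your current shell already provides these commands is to look them up on PATH. This is a sketch only; missing_commands is a hypothetical helper, and the list passed to it simply repeats the commands named above.

```python
import shutil


def missing_commands(names):
    """Return the command names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


# rm and tar are the commands this section says the lesson relies on
print("Missing commands:", missing_commands(["rm", "tar"]) or "none")
```

Run it from the terminal you plan to use for the lesson, since PATH can differ between an Anaconda prompt and Git Bash.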
First, download and install the Windows Git client.
Next, make sure you have installed Anaconda as described above.
The final, and critical, step is to tell Git Bash where Anaconda is installed, so that you can use Anaconda Python from Git Bash. First, open an Anaconda terminal and identify the installation location of Anaconda with this command (if the which command is not available, please ensure you have installed Git Bash with all default options):
which conda
Take note of the result. It will typically be something like /c/Users/YourUserName/Anaconda3/Scripts/conda for user installs, and /c/Users/YourUserName/AppData/Local/Continuum/Anaconda3/Scripts/conda for system-wide installs.
Now you can add Anaconda to your Git Bash $PATH:
- Open Git Bash from the Start Menu.
- Using the location of Anaconda you identified in the previous step, run the following command:
export PATH=$PATH:~/AppData/Local/Continuum/Anaconda3/Scripts
  - You can replace /c/Users/YourUserName with ~
  - You should not include the trailing /conda
- Run the following command:
source activate
Now you are good to go!
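The two path adjustments described in the steps above (dropping the trailing /conda and shortening the home directory to ~) can be sketched as a small function. The name path_entry_from_conda is made up for illustration, and the paths are the placeholder examples from this page, not real locations.

```python
import posixpath


def path_entry_from_conda(conda_path, home):
    """Turn the reported location of conda into the directory to add to PATH."""
    scripts_dir = posixpath.dirname(conda_path)  # drop the trailing /conda
    home = home.rstrip("/")
    # Shorten the home directory prefix to ~ when the path lies inside it
    if scripts_dir == home or scripts_dir.startswith(home + "/"):
        scripts_dir = "~" + scripts_dir[len(home):]
    return scripts_dir


entry = path_entry_from_conda(
    "/c/Users/YourUserName/Anaconda3/Scripts/conda",  # placeholder from the text
    "/c/Users/YourUserName",
)
print("export PATH=$PATH:" + entry)
```

With the placeholder values shown, this prints `export PATH=$PATH:~/Anaconda3/Scripts`.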
Making it Permanent
You will need to repeat the previous three steps every time you open a new Git Bash shell. If you want to make these changes permanent, you can use the following commands in Git Bash to add the changes to $PATH to your .bashrc file (remember to substitute the actual location of your Anaconda installation):
cd
echo 'export PATH=$PATH:~/AppData/Local/Continuum/Anaconda3/Scripts' >> .bashrc
echo 'source activate' >> .bashrc
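Note that you must use single quotes, not double quotes: inside double quotes the shell would expand $PATH before echo writes the line, so .bashrc would store the current value of $PATH instead of the literal text.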
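Because >> always appends, running those commands a second time adds duplicate lines to .bashrc. One way to avoid that is to append a line only when it is not already present. The following is a sketch under stated assumptions: append_once is a hypothetical helper, and it writes Unix line endings explicitly because bash can misread lines that end with a Windows carriage return.

```python
from pathlib import Path


def append_once(path, line):
    """Append line to a text file unless an identical line already exists.

    Returns True if the line was written, False if it was already present.
    """
    path = Path(path)
    text = path.read_text() if path.exists() else ""
    if line in text.splitlines():
        return False
    # Start on a fresh line if the file does not end with a newline
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    # newline="\n" stops Python on Windows from writing \r\n line endings
    with path.open("a", newline="\n") as handle:
        handle.write(prefix + line + "\n")
    return True
```

For instance, `append_once(Path.home() / ".bashrc", "source activate")` would add the line at most once, assuming Git Bash uses the same home directory that Python reports. Python does not expand $PATH inside ordinary strings, so the export line can be passed as it appears above.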