Snakemake and the Cluster

Last updated on 2023-08-02 | Edit this page

Overview

Questions

  • How can we express a one-task cluster operation in Snakemake?

Objectives

  • Write a Snakefile that executes a job on the cluster
  • Use MPI options to ensure the job runs in parallel

Snakemake and the Cluster


Snakemake has provisions for operating on an HPC cluster.

Various command-line arguments can be provided to tell Snakemake not to run things locally, but do run things via the queuing system instead.

In this lesson, we will repeat the first module, running the admahl code on the cluster, but will use snakemake to make it happen.

Write a cluster Snakemake rule file


Open your favorite editor, do the thing. Specify resources. Provide command line arguments to do the cluster operations by hand.

Run Snakemake


Throw the switch!

Challenge

How can you control the degree of parallelism of your cluster task?

Use the “mpi” option in the resource block of the Snakemake rule, and specify the number of tasks. This will be mapped to the -n argument of the equivalent sbatch command.

Key Points

  • Snakemake rule files can submit cluster jobs.
  • There are a lot of options.