Amdahl Parallel Runs

Last updated on 2023-08-02

Overview

Questions

  • How can we collect data on Amdahl run times?

Objectives

  • Collect systematic data on the runtime of the amdahl code

Systematic Data Collection


Using what we have learned so far, including Snakemake profiles and rules, we will now compose a Snakefile that runs the Amdahl example code over a range of parallel widths. This workflow will generate the data we will use in the next module to demonstrate the diminishing returns of increasing parallelism.

Write a File


Compose a Snakefile that runs the amdahl code once for each parallel width and records the runtime of each run.

We can put the widths in a list and iterate over them. We will use the profile generated previously to ensure that the jobs run on the cluster.
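The list-and-iterate idea can be sketched in plain Python before it goes into a Snakefile. This is only an illustration under stated assumptions: the widths, the `runs/` directory, the file naming pattern, and the helper `target_files` are all hypothetical and not taken from the lesson.

```python
# Hypothetical parallel widths; choose values that suit your cluster.
WIDTHS = [1, 2, 4, 8, 16]

# Hypothetical naming scheme: one output file per parallel width.
PATTERN = "runs/amdahl_p{width}.json"


def target_files(widths, pattern=PATTERN):
    """Return one output path per parallel width, in the order given."""
    return [pattern.format(width=w) for w in widths]


if __name__ == "__main__":
    # Print the files the workflow would be asked to produce.
    for path in target_files(WIDTHS):
        print(path)
```

Inside a Snakefile, Snakemake's `expand()` helper should build the same list, for example `expand(PATTERN, width=WIDTHS)`. That list could serve as the input of a top-level target rule, with a second rule using a `{width}` wildcard to produce each file.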

Run Snakemake


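Throw the switch! Run Snakemake with the profile generated previously, so that the run for each width is submitted to the cluster.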

Challenge

Our example varies a single parameter, the parallelism. How would you generalize this to arbitrary parameters?

Any set of parameter values we actually run is finite, so one option is to generate a flat list of every combination and iterate over it. Alternatively, keep one list per parameter and use nested loops, one level per parameter.
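One possible way to build the flat list of combinations is the standard library's `itertools.product`. This is a minimal sketch: the parameter names `width` and `fraction`, their values, and the helper `combinations` are hypothetical and are not claimed to match the amdahl program's options.

```python
from itertools import product

# Hypothetical parameter grid: each key maps to the values to try.
PARAMS = {
    "width": [1, 2, 4],
    "fraction": [0.5, 0.8],
}


def combinations(params):
    """Return a flat list of dicts, one per combination of parameter values."""
    names = list(params)
    value_lists = [params[name] for name in names]
    # product() yields every tuple of values, varying the last parameter fastest.
    return [dict(zip(names, values)) for values in product(*value_lists)]


if __name__ == "__main__":
    for combo in combinations(PARAMS):
        print(combo)
```

The result is equivalent to writing one nested loop per parameter, but adding a parameter only requires adding a key to the dictionary.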

Key Points

  • A relatively compact Snakefile can collect systematic runtime data over a range of parallel widths.