This lesson is being piloted (Beta version)

Parallel Programming in Python: Glossary

Key Points

Introduction
  • Programs are parallelizable if you can identify independent tasks.

  • To make programs scalable, you need to chunk the work.

  • Parallel programming often triggers a redesign; we use different patterns.

  • Doing work in parallel does not always give a speed-up.
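
The points about independent tasks and chunking can be sketched as follows. This is a minimal illustration, not the lesson's own code: the helper names `chunked`, `sum_of_squares` and `parallel_sum_of_squares` are hypothetical, and a thread pool is assumed only to keep the sketch short.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, n_chunks):
    """Split a list into at most n_chunks contiguous, roughly equal parts."""
    size, rest = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `rest` chunks get one extra element.
        stop = start + size + (1 if i < rest else 0)
        if stop > start:  # skip empty chunks when there are few items
            chunks.append(items[start:stop])
        start = stop
    return chunks


def sum_of_squares(chunk):
    """Independent task: needs nothing but its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, n_chunks=4):
    """Map the independent task over the chunks, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        partial_sums = pool.map(sum_of_squares, chunked(items, n_chunks))
        return sum(partial_sums)


numbers = list(range(1000))
total = parallel_sum_of_squares(numbers)
print(total)
```

Because `sum_of_squares` is pure Python, the threads here take turns holding the GIL, so this version is unlikely to run faster than a plain loop. That is one concrete case of parallel work not giving a speed-up.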

Measuring performance
  • It is often non-trivial to understand performance.

  • Memory is just as important as speed.

  • Measuring is knowing.
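
One way to measure both run time and memory with the standard library is sketched below. The helper `measure` and the two example functions are hypothetical names chosen for this illustration; the lesson may use different tools.

```python
import time
import tracemalloc


def measure(func, *args):
    """Return (result, elapsed_seconds, peak_bytes) for a single call of func."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return result, elapsed, peak


def squares_with_list(n):
    # Builds the full list in memory before summing.
    return sum([x * x for x in range(n)])


def squares_with_generator(n):
    # Produces one value at a time, so far less memory is held at once.
    return sum(x * x for x in range(n))


list_result, list_time, list_peak = measure(squares_with_list, 100_000)
gen_result, gen_time, gen_peak = measure(squares_with_generator, 100_000)
print(f"list:      {list_time:.4f} s, peak {list_peak} bytes")
print(f"generator: {gen_time:.4f} s, peak {gen_peak} bytes")
```

Tracing allocations slows the code down, so the timings from this sketch are only indicative. For careful timing, measure time and memory in separate runs.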

Accelerators: vectorized NumPy and Numba
  • Always profile your code to see which parallelization method works best.

  • Vectorized algorithms are both a blessing and a curse.

  • Numba can help you speed up code.
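
As a small illustration of what "vectorized" means, the sketch below computes the same sum of squares with an explicit Python loop and with a single NumPy call. The function names are hypothetical and NumPy is assumed to be installed; no timings are claimed here, since the advice above is to profile on your own data.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Element-by-element loop executed by the Python interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_vectorized(values):
    """Same computation expressed as one array operation."""
    return float(np.dot(values, values))


data = np.arange(1000, dtype=np.float64)
print(sum_of_squares_loop(data), sum_of_squares_vectorized(data))
```

The vectorized form is compact, but not every algorithm can be written as whole-array operations, and expressions such as `(values * values).sum()` allocate a temporary array, which is part of why vectorization is both a blessing and a curse.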

Dask abstractions: delays
  • Use abstractions to keep programs manageable.

Threading and Multiprocessing
  • If we want the most efficient parallelism on a single machine, we need to circumvent the GIL.

  • If your code releases the GIL, threading will be more efficient than multiprocessing.
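
A minimal sketch of threading with code that releases the GIL: CPython's `hashlib` releases the GIL while hashing buffers larger than 2047 bytes, so the threads below can overlap. The function name `digest` and the buffer sizes are assumptions made for this example.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(data: bytes) -> str:
    """Hash one buffer; the heavy work happens in C with the GIL released."""
    return hashlib.sha256(data).hexdigest()


# Four independent 1 MB buffers, each filled with a different byte value.
buffers = [bytes([i]) * 1_000_000 for i in range(4)]

sequential = [digest(b) for b in buffers]

with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(digest, buffers))

print(threaded == sequential)
```

Threads share memory, so the buffers are not copied between workers. With multiprocessing, the same data would have to be serialized and sent to each worker process.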

Dask abstractions: bags
  • Use abstractions to keep programs manageable.

Snakemake
  • Snakemake is ideal for specifying workflows with large granularity.

  • Dependencies are automatically resolved; Snakemake is ‘demand driven’ programming.

  • You may freely intermix shell commands with Python code.

  • Results are persistent: you can resume computations when needed.

Exercise: Photo Mosaic
  • You sometimes have to try different strategies to find out what works best.

Exercise: Mandelbrot fractals
  • You sometimes have to try different strategies to find out what works best.

Dynamic programming
  • FIXME

Calling external C and C++ libraries from Python
  • Multiple options are available for calling external C and C++ libraries; the best choice depends on the complexity of your problem.

  • There is an extra compile and link step, but execution is typically much faster than in pure Python.

  • The GIL can also be circumvented when calling these libraries.

  • Numba might also offer you the speedup you want with even less effort.
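
One of the available options is the standard-library `ctypes` module, sketched below by calling `strlen` from the C standard library. This assumes a POSIX system (Linux or macOS), where `ctypes.CDLL(None)` exposes the symbols already loaded into the Python process; the wrapper name `c_strlen` is hypothetical, and a real project would load its own compiled library instead.

```python
import ctypes

# Assumption: POSIX only. Passing None gives access to symbols already
# linked into the running process, including the C standard library.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Return the byte length of text as computed by the C function."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("parallel"))
```

Functions called through `ctypes.CDLL` release the GIL for the duration of the call, which is how C code reached this way can run alongside other Python threads.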

Asyncio fundamentals
  • Programs are parallelizable if you can identify independent tasks.
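
For asyncio, independent tasks are coroutines that can wait at the same time. The sketch below uses `asyncio.sleep` as a stand-in for real I/O; the names `fetch` and `main` and the delays are assumptions for illustration only.

```python
import asyncio
import time


async def fetch(name, delay):
    """Simulate an independent I/O-bound task that waits for `delay` seconds."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    # gather schedules all three coroutines at once and keeps their order.
    results = await asyncio.gather(
        fetch("a", 0.1),
        fetch("b", 0.1),
        fetch("c", 0.1),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f} s")
```

The three waits overlap, so the total is close to 0.1 s rather than 0.3 s. Everything still runs on a single thread, so this helps with waiting on I/O, not with CPU-bound work.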

Glossary

FIXME