- Programs are parallelizable if you can identify independent
tasks.
- To make programs scalable, you need to chunk the work.
- Parallel programming often triggers a redesign; we use different
patterns.
- Doing work in parallel does not always give a speed-up.
- Understanding performance is often non-trivial.
- Memory is just as important as speed.
- To measure is to know.
- Always profile your code to see which parallelization method works
best.
- Vectorized algorithms are both a blessing and a curse.
- Numba can help you speed up code.
- If we want the most efficient parallelism on a single machine, we
need to work around the GIL.
- If your code releases the GIL, threading will be more efficient than
multiprocessing.
- If your code holds on to the GIL, part of it is still being executed by
the Python interpreter, and you are wasting precious compute time!
- We can change the strategy by which a computation is evaluated.
- Nothing is computed until we run `compute()`.
- With delayed evaluation Dask knows which jobs can be run in
parallel.
- Call `compute` only once at the end of your program to get the best
results.
- Use abstractions to keep programs manageable.
- Actually making code faster is not always straightforward.
- Easy one-liners can get you 80% of the way.
- Writing clean and modular code often makes parallelization easier
later on.
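The first points above — identify independent tasks, then chunk the work to scale — can be sketched with the standard library alone. The chunk size and the worked function here are illustrative stand-ins, not part of the lesson:

```python
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    """Each chunk is an independent task: no chunk needs another's result."""
    return sum(x * x for x in chunk)

def chunked(data, size):
    """Chunk the work so it can be spread over any number of workers."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

if __name__ == "__main__":
    data = list(range(100_000))
    with ProcessPoolExecutor() as pool:
        total = sum(pool.map(partial_sum, chunked(data, 10_000)))
    assert total == sum(x * x for x in data)
```

Because the partial sums are independent, the same chunks could just as well be handed to threads, processes, or a cluster scheduler.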
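"To measure is to know" can be put into practice with the standard library's `timeit` module (with `cProfile` as the next step for finding hot spots); the function being timed here is just a stand-in:

```python
import timeit

def square_sum(n):
    """A stand-in workload; replace with the code you want to compare."""
    return sum(x * x for x in range(n))

# Time the candidate before and after any parallelization attempt;
# only the measurement tells you whether the change was a speed-up.
elapsed = timeit.timeit(lambda: square_sum(10_000), number=100)
print(f"100 runs took {elapsed:.3f} s")
```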
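The "blessing and a curse" of vectorization can be illustrated with NumPy (assuming it is available): the vectorized form is concise and runs in compiled code, but an expression like `x * x` allocates a full-size temporary array first, which is why memory is just as important as speed:

```python
import numpy as np

n = 1_000_000
x = np.arange(n, dtype=np.float64)

# Blessing: one readable line, and the loop runs in compiled code.
vectorized = (x * x).sum()

# Curse: `x * x` above allocated a second million-element array before
# summing.  A formulation without the large temporary:
no_temporary = np.dot(x, x)

print(f"sum of squares: {vectorized:.6g}")
```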
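The GIL points can be demonstrated with the standard library: pure-Python CPU-bound work holds the GIL, so threads cannot run it in parallel and processes win; when the heavy lifting releases the GIL (I/O, NumPy, many C extensions), threads are the cheaper choice. The workload below is only a stand-in:

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    """Pure Python: holds the GIL, so threads serialize on this work."""
    return sum(i * i for i in range(n))

def timed(executor_cls, n_tasks=4, n=500_000):
    start = time.perf_counter()
    with executor_cls(max_workers=n_tasks) as pool:
        list(pool.map(cpu_bound, [n] * n_tasks))
    return time.perf_counter() - start

if __name__ == "__main__":
    # Expect the process pool to be faster here: each worker process has
    # its own interpreter and therefore its own GIL.
    print(f"threads:   {timed(ThreadPoolExecutor):.2f} s")
    print(f"processes: {timed(ProcessPoolExecutor):.2f} s")
```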
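The delayed-evaluation points can be illustrated with a toy stand-in for Dask's `delayed` (this is a sketch of the idea, not Dask's implementation): building the graph only records which tasks feed into which, and nothing runs until `compute()` is called once at the end:

```python
class Delayed:
    """Toy delayed value: records a function and its inputs, runs nothing yet."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def compute(self):
        # Recursively evaluate dependencies; independent branches are
        # exactly the tasks a scheduler like Dask could run in parallel.
        resolved = [a.compute() if isinstance(a, Delayed) else a
                    for a in self.args]
        return self.func(*resolved)

def delayed(func):
    return lambda *args: Delayed(func, *args)

add = delayed(lambda a, b: a + b)
mul = delayed(lambda a, b: a * b)

# No work happens while the graph is built...
graph = add(mul(2, 3), mul(4, 5))
# ...everything runs in one go at the end:
print(graph.compute())  # 26
```

The two `mul` tasks do not depend on each other, which is precisely what lets a real scheduler run them in parallel.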