- Programs are parallelizable if you can identify independent
tasks.
- To make programs scalable, you need to chunk the work.
- Parallel programming often triggers a redesign; we use different
patterns.
- Doing work in parallel does not always give a speed-up.
- Understanding performance is often non-trivial.
- Memory is just as important as speed.
- To measure is to know.
- Always profile your code to see which parallelization method works
best.
- Vectorized algorithms are both a blessing and a curse.
- Numba can help you speed up code.
- If we want the most efficient parallelism on a single machine, we
need to work around the GIL.
- If your code releases the GIL, threading will be more efficient than
multiprocessing.
- If your code holds on to the GIL, part of it is still being executed by
the Python interpreter, and you are wasting precious compute time!
- We can change the strategy by which a computation is evaluated.
- Nothing is computed until we run `compute()`.
- With delayed evaluation Dask knows which jobs can be run in
parallel.
- Call `compute` only once at the end of your program to get the best
results.
- Use abstractions to keep programs manageable.
- Actually making code faster is not always straightforward.
- Easy one-liners can get you 80% of the way.
- Writing clean and modular code often makes parallelization easier
later on.
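The first points above — identify independent tasks, then chunk the work to scale — can be sketched with the standard library alone. The chunk size and the worked function here are illustrative stand-ins, not part of the lesson:

```python
from concurrent.futures import ProcessPoolExecutor

def partial_sum(chunk):
    """Each chunk is an independent task: no chunk needs another's result."""
    return sum(x * x for x in chunk)

def chunked(data, size):
    """Chunk the work so it can be spread over any number of workers."""
    for i in range(0, len(data), size):
        yield data[i:i + size]

if __name__ == "__main__":
    data = list(range(100_000))
    with ProcessPoolExecutor() as pool:
        total = sum(pool.map(partial_sum, chunked(data, 10_000)))
    assert total == sum(x * x for x in data)
```

Because the partial sums are independent, the same chunks could just as well be handed to threads, processes, or a cluster scheduler.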
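"To measure is to know" can be put into practice with the standard library's `timeit` module (with `cProfile` as the next step for finding hot spots); the function being timed here is just a stand-in:

```python
import timeit

def square_sum(n):
    """A stand-in workload; replace with the code you want to compare."""
    return sum(x * x for x in range(n))

# Time the candidate before and after any parallelization attempt;
# only the measurement tells you whether the change was a speed-up.
elapsed = timeit.timeit(lambda: square_sum(10_000), number=100)
print(f"100 runs took {elapsed:.3f} s")
```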
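The "blessing and a curse" of vectorization can be illustrated with NumPy (assuming it is available): the vectorized form is concise and runs in compiled code, but an expression like `x * x` allocates a full-size temporary array first, which is why memory is just as important as speed:

```python
import numpy as np

n = 1_000_000
x = np.arange(n, dtype=np.float64)

# Blessing: one readable line, and the loop runs in compiled code.
vectorized = (x * x).sum()

# Curse: `x * x` above allocated a second million-element array before
# summing.  A formulation without the large temporary:
no_temporary = np.dot(x, x)

print(f"sum of squares: {vectorized:.6g}")
```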
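The GIL points can be demonstrated with the standard library: pure-Python CPU-bound work holds the GIL, so threads cannot run it in parallel and processes win; when the heavy lifting releases the GIL (I/O, NumPy, many C extensions), threads are the cheaper choice. The workload below is only a stand-in:

```python
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def cpu_bound(n):
    """Pure Python: holds the GIL, so threads serialize on this work."""
    return sum(i * i for i in range(n))

def timed(executor_cls, n_tasks=4, n=500_000):
    start = time.perf_counter()
    with executor_cls(max_workers=n_tasks) as pool:
        list(pool.map(cpu_bound, [n] * n_tasks))
    return time.perf_counter() - start

if __name__ == "__main__":
    # Expect the process pool to be faster here: each worker process has
    # its own interpreter and therefore its own GIL.
    print(f"threads:   {timed(ThreadPoolExecutor):.2f} s")
    print(f"processes: {timed(ProcessPoolExecutor):.2f} s")
```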
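The delayed-evaluation points can be illustrated with a toy stand-in for Dask's `delayed` (this is a sketch of the idea, not Dask's implementation): building the graph only records which tasks feed into which, and nothing runs until `compute()` is called once at the end:

```python
class Delayed:
    """Toy delayed value: records a function and its inputs, runs nothing yet."""
    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def compute(self):
        # Recursively evaluate dependencies; independent branches are
        # exactly the tasks a scheduler like Dask could run in parallel.
        resolved = [a.compute() if isinstance(a, Delayed) else a
                    for a in self.args]
        return self.func(*resolved)

def delayed(func):
    return lambda *args: Delayed(func, *args)

add = delayed(lambda a, b: a + b)
mul = delayed(lambda a, b: a * b)

# No work happens while the graph is built...
graph = add(mul(2, 3), mul(4, 5))
# ...everything runs in one go at the end:
print(graph.compute())  # 26
```

The two `mul` tasks do not depend on each other, which is precisely what lets a real scheduler run them in parallel.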