Summary and Schedule
This is the GPU programming lesson.
Do you want to teach this lesson?
This material is open source and freely available. Are you planning to use it in your teaching? Send us an email at training@esciencecenter.nl. We would love to help you prepare to teach the lesson, and to receive feedback on how it could be further improved based on your experience in the workshop.
| Duration | Episode | Key questions |
| --- | --- | --- |
| | Setup Instructions | Download files required for the lesson |
| 00h 00m | 1. Introduction | What is a Graphics Processing Unit? Can a GPU be used for anything other than graphics? For which kinds of tasks are GPUs faster than CPUs? |
| 00h 15m | 2. Using your GPU with CuPy | How can I run NumPy code on the GPU? How can I copy NumPy arrays to the GPU? How do I reliably measure the execution time of GPU code? |
| 04h 45m | 3. Accelerate your Python code with Numba | How can I run my own Python functions on the GPU? |
| 05h 45m | 4. A Better Look at the GPU | How does a GPU work? |
| 06h 05m | 5. Your First GPU Kernel | How do I identify data parallelism in my code? How do I write a GPU program? What is CUDA? How are CUDA threads organised into blocks and grids? |
| 07h 15m | 6. Registers, Global, and Local Memory | What are registers? How do I share data between host and GPU? What are the differences between the memory spaces available in CUDA? What happens when a thread uses more variables than there are available registers? |
| 08h 00m | 7. Shared Memory and Synchronization | Is there a way to share data between threads of the same block? Can threads inside a block wait for other threads? |
| 08h 55m | 8. Constant Memory | Is there a way to have a read-only cache in CUDA? |
| 09h 35m | 9. Concurrent access to the GPU | Is it possible to concurrently execute more than one kernel on a single GPU? |
| 10h 15m | Finish | |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
Programming environment
The GPU programming lesson can be taught using Jupyter Notebook, a programming environment that runs in a web browser. For this to work we need a reasonably up-to-date browser. The current versions of the Chrome, Safari and Firefox browsers are all supported.
In case you do not have any GPU available on your laptop, a good alternative is to use Google Colab.
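If you are unsure whether your laptop has a usable NVIDIA GPU, a quick heuristic is to look for the `nvidia-smi` tool, which ships with the NVIDIA driver. This is only a sketch, not a definitive check of a working CUDA setup:

```python
import shutil

# nvidia-smi is installed together with the NVIDIA driver; if it is
# on the PATH, an NVIDIA GPU driver is most likely present.
if shutil.which("nvidia-smi") is None:
    print("No NVIDIA driver tools found; consider using Google Colab.")
else:
    print("NVIDIA driver tools found; a local setup should work.")
```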
Local setup
To set up locally, there are two alternatives, depending on how you installed Python:

- use `pip` if you installed Python normally, through your OS's package manager or app store;
- use `conda` or `mamba` if you installed the conda distribution of Python.

If you don't have Python installed, we recommend starting with Miniforge. Miniforge sets conda-forge as the default channel and provides the alternative package manager `mamba`, which is considerably faster than `conda` and makes the user experience significantly smoother.
Whichever alternative applies to you, the first step is to create an isolated environment for the workshop, so that you do not interfere with your existing setup. You can then install all the dependencies for the workshop within this environment. In the Python ecosystem, such isolated environments are known as virtual environments.
Using pip
To create a virtual environment using `pip`, you need to install the `virtualenv` package using your OS's package manager (it may have an alternate name such as `python-virtualenv` or `python3-virtualenv`). After you have done this, you can follow the steps below:
```bash
cd /path/to/workshop/dir
python3 -m virtualenv --prompt gpu-workshop venv
source venv/bin/activate
pip install -U pip  # update pip to the latest version
pip install cupy-cuda12x numba jupyterlab matplotlib scipy astropy
```
We are installing the precompiled CuPy wheel built against CUDA 12. This is always faster to install, but if you want to use a custom CUDA installation you can run `pip install cupy` instead. Also note that if you want the CUDA compiler `nvcc`, you have to install the CUDA Toolkit manually; however, this is not required to follow the workshop. More information can be found in the CuPy documentation.
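Once the installation has finished, a quick sanity check is to run a tiny computation with CuPy. The snippet below is only a sketch: the broad `except` lets it degrade gracefully on machines where CuPy or a usable GPU is unavailable.

```python
try:
    import cupy as cp

    squares = cp.arange(5) ** 2            # computed on the GPU
    result = cp.asnumpy(squares).tolist()  # copy the result back to the host
    print("CuPy works:", result)
except Exception as err:                   # no CuPy, or no usable GPU
    result = None
    print("CuPy check failed:", err)
```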
Using conda or mamba
`conda` and `mamba` have built-in support for virtual environments. You can create and populate a new virtual environment with:

```bash
mamba create -n gpu-workshop python=3.11
mamba activate gpu-workshop
mamba install cupy numba jupyterlab matplotlib scipy astropy
```

If you are using `conda`, simply replace `mamba` with `conda` in the commands above.
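Whether you used pip or conda/mamba, you can check from inside the activated environment that all the workshop dependencies are importable. This is a small stdlib-only sketch; the names below are the import names of the packages installed above (note that `cupy-cuda12x` is imported as `cupy`).

```python
import importlib.util

# Import names of the packages installed in the steps above.
required = ["cupy", "numba", "jupyterlab", "matplotlib", "scipy", "astropy"]
missing = [name for name in required if importlib.util.find_spec(name) is None]
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All workshop dependencies found.")
```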
Starting a Jupyter server
Now you can start your Jupyter server with the command `jupyter lab`, which will open a tab with Jupyter in your default browser. If you do not want Jupyter to open a browser tab automatically, use `jupyter lab --no-browser` instead: this will print a URL in your terminal, which you can then open in the browser of your choice.