This lesson is being piloted (Beta version)

GPU Programming


The participant should:

  • be familiar with Python
  • be comfortable working in Jupyter
  • have the ability to read and understand C code


Optional:

  • knowledge of NumPy
  • familiarity with high-performance computing concepts


Setup    Download files required for the lesson

00:00    1. Introduction
  • What is a Graphics Processing Unit?
  • Can a GPU be used for anything other than graphics?
  • Are GPUs faster than CPUs?

00:15    2. Using your GPU with CuPy
  • How can I increase the performance of code that uses NumPy?
  • How can I copy NumPy arrays to the GPU?

02:15    3. Accelerate your Python code with Numba
  • How can I run my own Python functions on the GPU?

03:15    4. A Better Look at the GPU
  • How does a GPU work?

03:35    5. Your First GPU Kernel
  • How can I parallelize a Python application on a GPU?
  • How do I write a GPU program?
  • What is CUDA?

04:45    6. Registers, Global, and Local Memory
  • What are registers?
  • How can data be shared between the host and the GPU?
  • Which memory is accessible to threads and thread blocks?

05:30    7. Shared Memory and Synchronization
  • Is there a way to share data between threads of the same block?
  • Can threads inside a block wait for other threads?

06:25    8. Constant Memory
  • Is there a way to have a read-only cache in CUDA?

07:05    9. Concurrent access to the GPU
  • Is it possible to concurrently execute more than one kernel on a single GPU?

07:45    Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.