Instructor Notes

Setup before the lesson


The required Python packages for this lesson often cause installation issues, so it is advisable to organize a pre-workshop setup session where learners can test their installation and get help with any problems.

Installations on learners’ devices have the advantage of lowering the threshold to continue with the material beyond the workshop. Note, though, that this lesson can also be taught in a cloud environment such as Google Colab or MyBinder. This can serve as a backup environment if local installations fail. Some cloud environments offer the possibility to run the code on a GPU, which significantly reduces the runtime of deep learning code.
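
To check whether a GPU is actually available in a given environment, you can for example run the snippet below (this assumes TensorFlow, the framework used in this lesson, is installed, as it is on Google Colab):

PYTHON

import tensorflow as tf

# An empty list means no GPU is visible to TensorFlow in this environment
print(tf.config.list_physical_devices("GPU"))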

Deep learning workflow


The episodes are quite long, because they cover a full cycle of the deep learning workflow. It really helps to structure your teaching by making it clear where you are in the 10-step deep learning workflow. You can, for example, use headers in your notebook for each of the steps in the workflow.

Episode 3: Monitor the training process


When episode 3 is taught on a different day than episode 2, it is very useful to start with a recap of episode 2. The Key Points of episode 2 can be reiterated, and you can go through the code of the previous session (without actually running it). This will help learners in the big exercise on creating a neural network.

If learners have not downloaded the data yet, they can also load the data directly from Zenodo (instead of downloading and saving it first):

PYTHON

import pandas as pd

data = pd.read_csv("https://zenodo.org/record/5071376/files/weather_prediction_dataset_light.csv?download=1")

The following exercises work well to do in groups / break-out rooms:

- Split the data into training, validation, and test sets (see the sketch after this list)
- Create the neural network. Note that this is a fairly challenging exercise, but learners should be able to do it based on their experience in episode 2 (see also the remark about the recap above).
- Predict the labels for both the training and test sets and compare them to the true values
- Try to reduce the degree of overfitting by lowering the number of parameters
- Create a similar scatter plot for a reasonable baseline
- Open question: what could be next steps to further improve the model?

All other exercises are small and can be done individually.
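
For the data-splitting exercise, a minimal sketch of one possible solution is shown below. It assumes the data is loaded in a pandas DataFrame data as above; the split fractions and random seed are illustrative, not prescribed by the lesson:

PYTHON

from sklearn.model_selection import train_test_split

# Hold out 20% as the test set, then take 25% of the remainder
# (i.e. 20% of the full dataset) as the validation set
train_val_data, test_data = train_test_split(data, test_size=0.2, random_state=42)
train_data, val_data = train_test_split(train_val_data, test_size=0.25, random_state=42)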

Presentation slides


There are no official presentation slides for this workshop, but this material does include some example slides from occasions when this course was taught at different institutions. These slides can be found in the slides folder.

Introduction


BREAK

This is a good time for switching instructor and/or a break.



Classification by a neural network using Keras


Instructor Note

This episode really aims to go through the whole process once, as quickly as possible. In episode 3 we will expand on all the concepts that are only lightly introduced in episode 2. Some concepts, like monitoring the training progress, optimization, and the learning rate, are explained in detail in episode 3. It is good to stress this a few times, because learners will usually have a lot of questions, like: ‘Why don’t we normalize our features?’ or ‘Why do we choose the Adam optimizer?’. It can be a good idea to park some of these questions for discussion in episodes 3 and 4.



BREAK

This is a good time for switching instructor and/or a break.





Monitor the training process


Copy-pasting code

In this episode we first introduce a simple approach to the problem, then iterate on it a few times, step by step working towards a more complex solution. Unfortunately, this involves reusing the same code over and over again, each time only slightly adapted.

To avoid too much typing, it can help to copy-paste code from higher up in the notebook. Be sure to make it clear where you are copying from and what you are actually changing in the copied code. It can for example help to add a comment to the lines that you added.
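
For example, after copy-pasting a model definition from earlier in the notebook, the added line can be marked like this (a hypothetical sketch; the input dimension and layer sizes are illustrative):

PYTHON

from tensorflow import keras

inputs = keras.Input(shape=(89,))
hidden = keras.layers.Dense(100, activation="relu")(inputs)
hidden = keras.layers.Dense(50, activation="relu")(hidden)  # added this layer
outputs = keras.layers.Dense(1)(hidden)
model = keras.Model(inputs=inputs, outputs=outputs)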



BREAK

This is a good time for switching instructor and/or a break.





Advanced layer types


Framing the classification task

The sample images from the dataset, shown below, provide a good opportunity to lead a discussion with learners about the nature of the images and the classification task we will be training a model to perform. For example, although each image can be assumed to include the object it is labelled with, not all images are images of that object, i.e. the object may be one of several present in the image. This makes the task of the classifier more difficult, as does the more culturally diverse set of objects present in the images, but both of these properties make the trained model more robust. After training, we can consider ourselves to be asking the model “which of these ten objects is present in this image?”, as opposed to e.g. “which of these ten objects is this an image of?”



Demonstrate searching for existing architectures

At this point it can be nice to apply the callout box above and demonstrate searching for state-of-the-art implementations. If you google ‘large CNN image classification Keras implementation’, one of the top search results links to an example from the Keras documentation for a small version of the Xception model.

It can be a nice learning opportunity to go through the notebook and show that the learners should already be familiar with a lot of the syntax (for example the Conv2D, Dense, and BatchNormalization layers, the Adam optimizer, and the deep learning workflow). You can show that even though the model is much deeper, the input and output layers are still the same. The aim is to demonstrate that what we are learning really is the basis for more complex models, and that you do not need to reinvent the wheel.
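
To underline this, a minimal sketch of the shared structure is shown below (the image size and number of classes are illustrative, not taken from the Xception example). No matter how many blocks are stacked in between, the model starts with the same kind of Input layer and ends with the same kind of Dense output layer:

PYTHON

from tensorflow import keras

inputs = keras.Input(shape=(32, 32, 3))                    # input layer, as in our simple CNN
x = keras.layers.Conv2D(32, (3, 3), activation="relu")(inputs)
# ... a deeper model such as Xception simply stacks many more blocks here ...
x = keras.layers.Flatten()(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)  # output layer, as in our simple CNN
model = keras.Model(inputs=inputs, outputs=outputs)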



BREAK

This is a good time for switching instructor and/or a break.



Comparison with a network with only dense layers

The callout box below compares the CNN approach with a network with only dense layers. Depending on the available time, the discussion can be extended in as much depth as you like. You have several options:

  1. It can be used as a good recap exercise. The exercise question is then: ‘How does this simple CNN compare to a neural network with only dense layers? Implement a dense neural network and compare its performance to that of the CNN.’ This will take 30-45 minutes and might shift the focus away from CNNs.
  2. You can demonstrate it (no typing along), just to show what the network would look like and to make the comparison (see the sketch after this list).
  3. You can just mention that a simple network with only dense layers reaches 18% accuracy, considerably worse than our simple CNN.
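
For option 2, a minimal sketch of what such a dense-only network could look like (the input shape, layer sizes, and number of classes are illustrative):

PYTHON

from tensorflow import keras

inputs = keras.Input(shape=(32, 32, 3))
x = keras.layers.Flatten()(inputs)                 # no convolutions, just flatten the pixels
x = keras.layers.Dense(128, activation="relu")(x)
x = keras.layers.Dense(64, activation="relu")(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)
model = keras.Model(inputs=inputs, outputs=outputs)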


Transfer learning


Outlook


Instructor Note

You don’t have to use this project as an example. It works best to use a suitable deep learning project that you know well and are passionate about.