Instructor Notes
Setup before the lesson
The required Python packages for this lesson often result in installation issues, so it is advisable to organize a pre-workshop setup session where learners can show their installation and get help with any problems.
Installations on learners’ devices have the advantage of lowering the threshold to continue with the material beyond the workshop. Note, though, that this lesson can also be taught on a cloud environment such as Google Colab or MyBinder. This can serve as a backup environment if local installations fail. Some cloud environments offer the possibility to run the code on a GPU, which significantly reduces the runtime of deep learning code.
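For instance, a minimal check (assuming the TensorFlow backend used elsewhere in this lesson) to see whether a GPU is visible in the environment, e.g. on a Colab GPU runtime:

```python
# Check whether TensorFlow can see a GPU (e.g. on Google Colab with a GPU runtime).
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
print("GPUs available:", gpus)  # an empty list means training will run on the CPU
```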
Deep learning workflow
The episodes are quite long because they cover a full cycle of the deep learning workflow. It really helps to structure your teaching by making it clear where you are in the 10-step deep learning workflow. You can, for example, use headers in your notebook for each step of the workflow.
Episode 3: Monitor the training process
When episode 3 is taught on a different day than episode 2, it is very useful to start with a recap of episode 2. The Key Points of episode 2 can be reiterated, and you can go through the code of the previous session (without actually running it). This will help learners in the big exercise on creating a neural network.
The following exercises work well to do in groups / break-out rooms:
- Split the data into training, validation, and test sets (see the sketch after this list)
- Create the neural network. Note that this is a fairly challenging exercise, but learners should be able to do it based on their experience from episode 2 (see also the remark about the recap above).
- Predict the labels for both the training and test set and compare them to the true values
- Try to reduce the degree of overfitting by lowering the number of parameters
- Create a similar scatter plot for a reasonable baseline
- Open question: what could be next steps to further improve the model?

All other exercises are small and can be done individually.
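As a quick reference for the first group exercise, a minimal sketch of a three-way split is shown below; the placeholder data, variable names, and the roughly 70/15/15 proportions are illustrative, not prescribed by the lesson.

```python
# Illustrative three-way split using scikit-learn; the dataset here is random
# placeholder data and the proportions are an example, not the lesson's exact values.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(100, 5)   # placeholder features
y = np.random.rand(100)      # placeholder target

# First split off the test set, then split the remainder into train and validation.
X_trainval, X_test, y_trainval, y_test = train_test_split(X, y, test_size=0.15, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.15 / 0.85, random_state=0)
```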
Presentation slides
There are no official presentation slides for this lesson, but the material does include some example slides from occasions when this course was taught at different institutions. These slides can be found in the slides folder.
Introduction
BREAK
This is a good time for switching instructor and/or a break.
Classification by a neural network using Keras
Instructor Note
It is good to stress the goal of this episode a few times, because learners will usually have a lot of questions like: ‘Why don’t we normalize our features?’ or ‘Why do we choose the Adam optimizer?’. It can be a good idea to park some of these questions for discussion in episodes 3 and 4.
BREAK
This is a good time for switching instructor and/or a break.
BREAK
This is a good time for switching instructor and/or a break.
Monitor the training process
Copy-pasting code
In this episode we first introduce a simple approach to the problem, then iterate on it a few times, step by step working towards a more complex solution. Unfortunately, this involves reusing the same code over and over again, each time only slightly adapted.
To avoid too much typing, it can help to copy-paste code from higher up in the notebook. Be sure to make it clear where you are copying from and what you are actually changing in the copied code. It can for example help to add a comment to the lines that you added.
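As a hedged illustration of that advice (the model below is hypothetical, not the lesson’s exact architecture), a copied-and-adapted cell could look like this:

```python
# Copied from the model definition higher up in the notebook;
# only the commented lines differ from the original cell.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(10, activation="relu"),  # changed: fewer nodes to reduce overfitting
    keras.layers.Dense(5, activation="relu"),   # changed: fewer nodes to reduce overfitting
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
```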
BREAK
This is a good time for switching instructor and/or a break.
BREAK
This is a good time for switching instructor and/or a break.
Advanced layer types
Framing the classification task
The sample images from the dataset, shown below, provide a good opportunity to lead a discussion with learners about the nature of the images and the classification task we will be training a model to perform. For example, although the images can all be assumed to include the object they are labelled with, not all images are of those objects, i.e. the object may be one of several present in the image. This makes the classifier’s task more difficult, as does the more culturally diverse set of objects in the images, but both of these properties make the trained model more robust. After training, we can consider ourselves to be asking the model “which of these ten objects is present in this image?”, as opposed to, e.g., “which of these ten objects is this an image of?”
Demonstrate searching for existing architectures
At this point it can be nice to apply the above callout box and demonstrate searching for state-of-the-art implementations. If you google ‘large CNN image classification Keras implementation’, one of the top search results links to an example from the Keras documentation for a small version of the Xception model.
It can be a nice learning opportunity to go through the notebook and show that the learners should already be familiar with a lot of the syntax (for example the Conv2D, Dense, and BatchNormalization layers, the Adam optimizer, and the deep learning workflow). You can show that even though the model is much deeper, the input and output layers are still the same. The aim is to demonstrate that what we are learning is really the basis for more complex models, and that you do not need to reinvent the wheel.
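For instance, a schematic convolutional block in the style of such an implementation (this is an illustration, not the actual Keras Xception example) uses only layers the learners have already seen:

```python
# Schematic convolutional model using familiar layers; an illustration only,
# not the Xception example from the Keras documentation.
from tensorflow import keras

inputs = keras.Input(shape=(32, 32, 3))
x = keras.layers.Conv2D(32, (3, 3), activation="relu")(inputs)
x = keras.layers.BatchNormalization()(x)
x = keras.layers.Conv2D(64, (3, 3), activation="relu")(x)
x = keras.layers.BatchNormalization()(x)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
```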
BREAK
This is a good time for switching instructor and/or a break.
Comparison with a network with only dense layers
The callout box below compares the CNN approach with a network with only dense layers. Depending on time, the discussion that follows can be extended to whatever depth you like. You have several options:
- It can be used as a good recap exercise. The exercise question is then: ‘How does this simple CNN compare to a neural network with only dense layers? Implement a dense neural network and compare its performance to that of the CNN.’ This will take 30-45 minutes and might shift the focus away from CNNs.
- You can demonstrate it yourself (no typing along), just to show what the network would look like and make the comparison (see the sketch after this list).
- You can just mention that a simple network with only dense layers reaches 18% accuracy, considerably worse than our simple CNN.
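If you go with the demonstration or the brief mention, a minimal sketch of such a dense-only network is shown below; the input shape and layer sizes are illustrative rather than the lesson’s exact values.

```python
# Illustrative dense-only network for comparison with the CNN; the input shape
# and layer sizes are examples, not the lesson's exact values.
from tensorflow import keras

inputs = keras.Input(shape=(64, 64, 3))
x = keras.layers.Flatten()(inputs)          # flatten the image into one long vector
x = keras.layers.Dense(100, activation="relu")(x)
x = keras.layers.Dense(50, activation="relu")(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)

model_dense = keras.Model(inputs=inputs, outputs=outputs)
model_dense.compile(optimizer="adam",
                    loss="categorical_crossentropy",
                    metrics=["accuracy"])
```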
Do a live demo instead of live coding
You might want to demonstrate this section on hyperparameter tuning instead of doing live coding. The goal is to show that hyperparameter tuning can be done easily with keras_tuner, not to memorize all the exact syntax of how to do it. This will probably save you half an hour of participants typing out code that is already familiar to them. In addition, on really slow machines running the grid search could take more than 10 minutes.
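If you do demonstrate it, a minimal sketch of the idea is given below; it assumes a recent keras_tuner version (GridSearch; RandomSearch works similarly), and the model, search space, and data variable names are placeholders rather than the lesson’s exact code.

```python
# Minimal hyperparameter-tuning sketch with keras_tuner; the search space,
# model, and variable names are illustrative, not the lesson's exact code.
import keras_tuner
from tensorflow import keras

def build_model(hp):
    # hp.Int defines which values of the hyperparameter the search will try
    units = hp.Int("units", min_value=16, max_value=64, step=16)
    model = keras.Sequential([
        keras.layers.Dense(units, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

tuner = keras_tuner.GridSearch(build_model, objective="val_accuracy")
# With training and validation data defined (placeholder names):
# tuner.search(X_train, y_train, epochs=5, validation_data=(X_val, y_val))
# best_model = tuner.get_best_models(num_models=1)[0]
```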
Transfer learning
Outlook
Instructor Note
You don’t have to use this project as an example. It works best to use a suitable deep learning project that you know well and are passionate about.