Introduction


Figure 1

An infographic showing the relationship between AI, ML, NN and DL. NN are methods used in DL, which is a subset of ML algorithms, which in turn falls under the umbrella of AI
Image credit: Tukijaaliwa, CC BY-SA 4.0, via Wikimedia Commons, original source

Figure 2

A diagram of a single artificial neuron combining inputs and weights using an activation function.
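
A single neuron can be sketched in a few lines of code. The example below is a minimal, illustrative sketch (the input values, weights, bias, and the choice of a sigmoid activation are assumptions, not values from the figure): each input is multiplied by its weight, the results are summed together with a bias, and the total is passed through the activation function.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation: squashes any real number into the range (0, 1)."""
    return 1 / (1 + np.exp(-z))

# Illustrative inputs, weights and bias for one artificial neuron
inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.8, 0.1, -0.4])
bias = 0.2

# Weighted sum of the inputs plus the bias, passed through the activation
output = sigmoid(np.dot(inputs, weights) + bias)
print(output)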

Figure 3

A diagram of a three-layer neural network with an input layer, one hidden layer, and an output layer.
Image credit: Glosser.ca, CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0), via Wikimedia Commons, original source

Figure 4

A diagram of a neural network with 2 inputs, 2 hidden layer neurons, and 1 output.
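
The small network in Figure 4 can also be written out as a plain numpy forward pass. The weights, biases, and the ReLU activation below are illustrative assumptions, not values taken from the figure.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

# A minimal sketch of the network in Figure 4: 2 inputs, a hidden layer of
# 2 neurons, and 1 output neuron (all numbers are illustrative).
x = np.array([0.5, 1.5])                 # the 2 inputs
W_hidden = np.array([[0.2, -0.3],
                     [0.4, 0.1]])        # weights of the 2 hidden neurons
b_hidden = np.array([0.1, -0.1])
W_output = np.array([0.7, -0.5])         # weights of the single output neuron
b_output = 0.05

hidden = relu(x @ W_hidden + b_hidden)   # hidden layer activations
output = hidden @ W_output + b_output    # network output
print(output)
```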

Figure 5

Plot of the sigmoid function
A. Sigmoid activation function

Figure 6

Plot of the ReLU function
B. ReLU activation function

Figure 7

Plot of the Identity function
C. Identity (or linear) activation function
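
The three activation functions shown in Figures 5-7 can be defined in a few lines each; a minimal sketch:

```python
import numpy as np

# Sketches of the three activation functions shown in Figures 5-7.
def sigmoid(z):
    return 1 / (1 + np.exp(-z))     # squashes values into (0, 1)

def relu(z):
    return np.maximum(0, z)         # zero for negative inputs, linear otherwise

def identity(z):
    return z                        # passes the input through unchanged

z = np.linspace(-5, 5, 11)
print(sigmoid(z), relu(z), identity(z), sep="\n")
```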

Figure 8

An example of a deep neural network

Figure 9

Line plot comparing the squared error loss function with the Huber loss function (delta = 1), showing that the cost of a prediction error is equal for both functions when y_true - y_pred is between -1 and 1; beyond that range the Huber loss rises linearly as y_true diverges further from y_pred, as opposed to quadratically for the squared error function.
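
A rough sketch of the two loss functions is given below, using a common definition of the Huber loss (conventions differ on whether the quadratic part carries a factor of 1/2, so the exact scaling in the plot may differ).

```python
import numpy as np

# A sketch of the two loss functions compared in Figure 9, with delta = 1.
# The Huber loss is quadratic for small errors and linear once the absolute
# error exceeds delta, so large errors are penalised far less harshly than
# by the squared error loss.
def squared_error(error):
    return error ** 2

def huber(error, delta=1.0):
    small = np.abs(error) <= delta
    return np.where(small,
                    0.5 * error ** 2,
                    delta * (np.abs(error) - 0.5 * delta))

errors = np.linspace(-3, 3, 13)   # y_true - y_pred
print(squared_error(errors))
print(huber(errors))
```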

Figure 10

A graph showing an exponentially decreasing loss over the first 1500 epochs of training an example network.

Classification by a neural network using Keras


Figure 1

Illustration of the three species of penguins found in the Palmer Archipelago, Antarctica: Chinstrap, Gentoo and Adelie
Artwork by @allison_horst

Figure 2

Illustration of the beak dimensions called culmen length and culmen depth in the dataset
Artwork by @allison_horst

Figure 3

Pair plot showing the separability of the three species of penguin for combinations of dataset attributes
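
A pair plot like Figure 3 can be produced with seaborn. The sketch below uses seaborn's built-in copy of the penguins dataset, whose column names (e.g. bill instead of culmen) may differ slightly from the lesson's own copy.

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Pairwise scatter plots of all numeric penguin features, coloured by species
penguins = sns.load_dataset("penguins").dropna()
sns.pairplot(penguins, hue="species")
plt.show()
```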

Figure 4

Pair plot showing the separability of the two sexes of penguin for combinations of dataset attributes

Figure 5

Training loss curve of the neural network, depicting an exponential decrease in loss before plateauing from around 10 epochs
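
A curve like Figure 5 is typically drawn from the history object returned by Keras's fit(). The sketch below uses random stand-in data with the same shape as the problem (four features, three classes); it only illustrates where the per-epoch loss values come from, not the lesson's actual model.

```python
import numpy as np
from tensorflow import keras
import matplotlib.pyplot as plt

# Random stand-in data: 4 features, 3 classes (illustrative only)
X = np.random.rand(200, 4)
y = keras.utils.to_categorical(np.random.randint(3, size=200), num_classes=3)

model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(10, activation="relu"),
    keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
history = model.fit(X, y, epochs=100, verbose=0)

# The per-epoch loss stored in the history gives the training loss curve
plt.plot(history.history["loss"])
plt.xlabel("epoch")
plt.ylabel("loss")
plt.show()
```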

Figure 6

(Optional exercise) Something went wrong here during training. What could be the problem, and how do you see that in the training curve? Also compare the range on the y-axis with the previous training curve.

A very jittery training curve with the loss value jumping back and forth between 2 and 4. The range of the y-axis is from 2 to 4, whereas in the previous training curve it was from 0 to 2. The loss seems to decrease a little bit, but not as much as in the previous plot, where it dropped to almost 0. The final loss is somewhere around 2.

Figure 7

Confusion matrix of the test set with high accuracy for Adelie and Gentoo classification and no correctly predicted Chinstrap
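
A confusion matrix like Figure 7 can be computed and drawn in a few lines; in the sketch below, y_true and y_pred are made-up labels standing in for the test-set species and the network's predictions.

```python
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Made-up true and predicted species labels (illustrative only)
species = ["Adelie", "Chinstrap", "Gentoo"]
y_true = ["Adelie", "Gentoo", "Chinstrap", "Adelie", "Gentoo", "Chinstrap"]
y_pred = ["Adelie", "Gentoo", "Adelie", "Adelie", "Gentoo", "Gentoo"]

# Rows are the true classes, columns the predicted classes
matrix = confusion_matrix(y_true, y_pred, labels=species)
sns.heatmap(matrix, annot=True, xticklabels=species, yticklabels=species)
plt.xlabel("predicted")
plt.ylabel("true")
plt.show()
```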

Monitor the training process


Figure 1

18 European locations in the weather prediction dataset

Figure 2

Plot of the loss as a function of the weights. Through gradient descent the global loss minimum is found
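
The idea behind Figure 2 can be shown with a toy one-dimensional example: starting from an arbitrary weight, repeatedly stepping in the direction of the negative gradient moves the weight towards the minimum of the loss. The loss surface and learning rate below are illustrative choices.

```python
# Toy gradient descent on a one-dimensional loss surface
def loss(w):
    return (w - 3) ** 2            # minimum at w = 3

def gradient(w):
    return 2 * (w - 3)             # derivative of the loss

w = 0.0                            # arbitrary starting weight
learning_rate = 0.1
for step in range(50):
    w -= learning_rate * gradient(w)

print(w, loss(w))                  # w ends up close to 3, loss close to 0
```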

Figure 3

Plot of the RMSE over epochs for the trained model that shows a decreasing error metric

Figure 4

Scatter plot between predictions and true sunshine hours in Basel on the train set showing a narrow spread

Figure 5

Scatter plot between predictions and true sunshine hours in Basel on the test set showing a wide spread

Figure 6

Scatter plot of predicted vs true sunshine hours in Basel for the test set, where today's sunshine hours are used as the baseline prediction for tomorrow's sunshine hours
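
The baseline behind Figure 6 can be computed directly: use today's sunshine hours as the "prediction" for tomorrow and measure the RMSE of that naive forecast. In the sketch below, the sunshine array is made-up data standing in for the Basel sunshine-hours column.

```python
import numpy as np

# Made-up daily sunshine hours for Basel (illustrative only)
sunshine = np.array([6.1, 7.3, 2.0, 0.5, 9.8, 8.4, 3.2])

y_true = sunshine[1:]        # tomorrow's observed sunshine hours
y_baseline = sunshine[:-1]   # today's sunshine used as the prediction

# Root-mean-square error of the naive "tomorrow equals today" baseline
rmse = np.sqrt(np.mean((y_true - y_baseline) ** 2))
print(rmse)
```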

Figure 7

Plot of RMSE vs epochs for the training set and the validation set which depicts a divergence between the two around 10 epochs.
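
One common way to react to the divergence seen in Figure 7 is early stopping. Below is a sketch of Keras's EarlyStopping callback; the monitored metric and patience are illustrative choices.

```python
from tensorflow import keras

# Stop training once the validation loss has not improved for 10 epochs
early_stopping = keras.callbacks.EarlyStopping(
    monitor="val_loss",         # watch the validation loss
    patience=10,                # allow 10 epochs without improvement
    restore_best_weights=True,  # roll back to the best epoch
)
# The callback would then be passed to fit(), e.g.:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=200, callbacks=[early_stopping])
```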

Figure 8

Plot of RMSE vs epochs for the training set and the validation set with similar performance across the two sets.

Figure 9

Plot of RMSE vs epochs for the training set and the validation set displaying similar performance across the two sets.

Figure 10

Output of plotting sample

Figure 11

Scatter plot between predictions and true sunshine hours for Basel on the test set

Figure 12

Scatter plot of predictions and true number of sunshine hours

Figure 13

Screenshot of the TensorBoard interface
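
An interface like the one in Figure 13 is produced by logging the training run and then launching TensorBoard. The log directory name below is an arbitrary choice.

```python
from tensorflow import keras

# Log the training run to a directory that TensorBoard can read
tensorboard = keras.callbacks.TensorBoard(log_dir="logs/weather_run")
# Pass the callback to fit(), e.g.:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=100, callbacks=[tensorboard])
# Then launch the interface from a terminal with:
#   tensorboard --logdir logs
```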


Advanced layer types


Figure 1

A 5 by 5 grid of 25 sample images from the Dollar Street 10 dataset. Each image is labelled with a category, for example: 'street sign' or 'soap dispenser'.

Figure 2

Example of a convolution matrix calculation
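
The calculation sketched in Figure 2 can be worked through by hand in code: a small kernel slides over the image, and each output value is the sum of the element-wise products between the kernel and the image patch underneath it (the cross-correlation convention used by convolutional layers). The image and kernel values below are illustrative.

```python
import numpy as np

# A tiny 4x4 "image" and a 3x3 kernel (a simple vertical edge detector)
image = np.array([[1, 2, 0, 1],
                  [0, 1, 3, 1],
                  [2, 0, 1, 2],
                  [1, 1, 0, 0]])
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]])

# Slide the kernel over every valid position and sum the element-wise products
out_size = image.shape[0] - kernel.shape[0] + 1
feature_map = np.zeros((out_size, out_size))
for i in range(out_size):
    for j in range(out_size):
        patch = image[i:i + 3, j:j + 3]
        feature_map[i, j] = np.sum(patch * kernel)

print(feature_map)                 # a 2x2 map of extracted features
```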

Figure 3

Convolution example on an image of a cat to extract features

Figure 4

Plot of training accuracy and validation accuracy vs epochs for the trained model

Figure 5

Plot of training loss and validation loss vs epochs for the trained model

Figure 6

Plot of training accuracy and validation accuracy vs epochs for a model with only dense layers

Figure 7

Plot of training accuracy and validation accuracy vs epochs for the trained model

Figure 8

A sketch of a neural network with and without dropout
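
Dropout as illustrated in Figure 8 is added in Keras by placing Dropout layers between the dense layers; during training each Dropout layer randomly silences a fraction of the neurons feeding into the next layer. The layer sizes and the rate of 0.5 below are illustrative choices, not the lesson's exact model.

```python
from tensorflow import keras

# A small dense network with dropout between the hidden layers
model = keras.Sequential([
    keras.layers.Input(shape=(64,)),
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dropout(0.5),      # randomly drop 50% of activations while training
    keras.layers.Dense(100, activation="relu"),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation="softmax"),
])
model.summary()
```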

Figure 9

Plot of training accuracy and validation accuracy vs epochs for the trained model

Figure 10

Plot of validation loss vs dropout rate used in the model. The validation loss varies between 2.3 and 2.0 and is lowest with a dropout_rate of 0.9

Transfer learning


Figure 1

Training history for training the pre-trained model. The training accuracy slowly rises from 0.2 to 0.9 over 20 epochs. The validation accuracy starts higher at 0.25, but reaches a plateau around 0.64. The final validation accuracy reaches 64%, which is a huge improvement over the 30% accuracy we reached with the simple convolutional neural network that we built from scratch in the previous episode.
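
The setup behind Figure 1 reuses a network pre-trained on a large dataset as a frozen feature extractor and trains only a small new classification head. The sketch below is one way to do this in Keras; the choice of base model, input size, and number of classes are illustrative assumptions, not necessarily those of the episode.

```python
from tensorflow import keras

# Load a pre-trained base model without its classification head and freeze it
base_model = keras.applications.DenseNet121(
    include_top=False, weights="imagenet", input_shape=(64, 64, 3))
base_model.trainable = False        # keep the pre-trained weights fixed

# Add a small trainable classification head on top of the frozen base
inputs = keras.Input(shape=(64, 64, 3))
x = base_model(inputs, training=False)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```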


Outlook