Summary and Setup

A workshop teaching concepts and skills required for researchers to develop and validate artificial intelligence models.

Target Audience


Our primary audience is graduate students and early-career researchers who have, or expect to have, data and want to begin applying machine learning (ML), deep learning (DL), and other AI methods to extract insights. We also hope to help research group leaders, educators, and others who want to expand their understanding of these technologies so they can better advise other people.

Learning Objectives


By the end of the workshop, learners will be able to…

  1. Define common terms encountered in artificial intelligence, including deep learning, machine learning, and large language models.
  2. Summarise the difference between supervised and unsupervised methods, and the kinds of tasks these different methods are suited to.
  3. Discuss how experimental design and choices made when data is collected can influence the quality and evaluation of a machine learning model.
  4. Prepare data for use in a machine learning application, through normalisation, labelling, and other pre-processing steps.
  5. Train machine learning and deep learning models for regression and classification tasks.
  6. Compare popular metrics for evaluating the quality of a model, and apply them.
  7. Identify common issues with a model including bias and overfitting.

Lessons


Note: the curriculum for this workshop is in early but active development. We recommend the Introduction to Deep Learning lesson in The Carpentries Lab if you want to start learning similar skills right away.

Lesson | Overview
Connecting Key Concepts in Machine Learning | Build understanding of key concepts in machine learning and artificial intelligence, describe relationships between these concepts, and connect them to the research context.
Preparing Data for Machine Learning | Organise, clean, label, and format data so that it is ready to be used in the training and validation of a model.
Developing and Evaluating a Deep Learning Model | Use supervised learning to train random forest and neural network models that can predict the similarity or categories of input data. Apply methods to validate the models you develop and estimate the quality of their predictions.

Data Sets


Download the data zip file and unzip it to your Desktop.
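
If you prefer to unzip the archive from Python rather than through your file manager, the following is a minimal sketch using only the standard library. The helper name `unzip_to` is hypothetical, and the file name and download location in the usage note are assumptions, since this page does not state them.

```python
import zipfile
from pathlib import Path


def unzip_to(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names.

    unzip_to is a hypothetical helper written for this sketch; it is not part
    of the workshop materials.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir)
    # Create the target folder if it does not exist yet.
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

For example, assuming the archive was saved as `data.zip` in your Downloads folder (both the name and the location are placeholders), `unzip_to(Path.home() / "Downloads" / "data.zip", Path.home() / "Desktop")` would extract it to your Desktop and return the list of extracted file names.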

Software Setup


Discussion

Details

Setup for different systems can be presented in dropdown menus via a spoiler tag. They will join to this discussion block, so you can give a general overview of the software used in this lesson here and fill out the individual operating systems (and potentially add more, e.g. online setup) in the solutions blocks.

Windows: Use PuTTY

macOS: Use Terminal.app

Linux: Use Terminal