Instructor Notes

Thank you for teaching High dimensional statistics with R! We hope you enjoy delivering the lesson. This page contains additional information for instructors.

The materials for each episode are self-contained and can be found through the episode links on the home page.

In previous deliveries, the lesson was taught in four sessions, each comprising 2 hours and 50 minutes (170 minutes) of teaching and exercise time. We also advise allowing around 40 minutes of additional time for breaks. The recommended timings for each session are as follows:

Session 1:

  • Introduction to high-dimensional data (episode 1): 30 minutes of teaching time, 20 minutes for exercises (total: 50 minutes).
  • Regression with many outcomes (episode 2): 70 minutes of teaching time, 50 minutes for exercises (total: 120 minutes).

Session 2:

  • Regularised regression (episode 3): 110 minutes of teaching time, 60 minutes for exercises (total: 170 minutes).

Session 3:

  • Principal component analysis (episode 4): 90 minutes of teaching time, 40 minutes for exercises (total: 130 minutes).
  • Factor analysis (episode 5): 30 minutes of teaching time, 10 minutes for exercises (total: 40 minutes).

Session 4:

  • K-means (episode 6): 60 minutes of teaching time, 20 minutes for exercises (total: 80 minutes).
  • Hierarchical clustering (episode 7): 70 minutes of teaching time, 20 minutes for exercises (total: 90 minutes).

This lesson was designed for learners with good familiarity with R programming and foundational knowledge of statistics. To manage cognitive load, it may be necessary to omit later episodes, especially if some learners have difficulty with the programming techniques shown here or with foundational statistical concepts. In this case, we recommend omitting episode 5 (Factor analysis) and one or both of the clustering episodes: episode 6 (K-means) and episode 7 (Hierarchical clustering).

Introduction to high-dimensional data


Regression with many outcomes


Regularised regression


Principal component analysis


Factor analysis


K-means


Hierarchical clustering