This lesson is still being designed and assembled (Pre-Alpha version)

Simple linear regression for public health

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This lesson was designed for researchers interested in working with public health data in R, but may be of interest to researchers in other fields as well.

This lesson introduces simple linear regression. The episodes cover the concept of simple linear regression; its use with different types of predictor variable (a single continuous variable, a single two-level factor, and a single factor with more than two levels); post-hoc comparisons between groups; the assessment of model fit; the assessment and importance of model assumptions; and making predictions.

Getting started

To get started, see the instructions on the Setup page. There you will learn how to obtain the data and packages used in this lesson.


This lesson does not require a formal background in statistics.

This lesson requires:

  • Working copies of R and RStudio. See here for installation instructions.
  • An understanding of how to use the Tidyverse packages to summarise and manipulate data in RStudio. See these episodes on data handling and data manipulation.
  • An understanding of how to use the ggplot2 package to plot data in RStudio. See this episode on data visualisation.
  • An understanding of the concepts covered in the statistical thinking for public health lesson.


Setup Download files required for the lesson
00:00 1. An introduction to linear regression ABC
00:10 2. Linear regression with one continuous explanatory variable GHI
00:20 3. Linear regression with a two-level factor explanatory variable GHI
00:30 4. Linear regression with a multi-level factor explanatory variable GHI
00:40 5. Assessing simple linear regression model fit and assumptions GHI
00:50 6. Making predictions from a simple linear regression model GHI
01:00 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.