This lesson is being piloted (Beta version)
If you teach this lesson, please tell the authors and provide feedback by opening an issue in the source repository.

Logistic regression for public health

Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This lesson was designed for researchers interested in working with public health data in R, but may be of interest to researchers in other fields as well.

This lesson provides an introduction to binary logistic regression, a model for a binary outcome variable, i.e. a variable that can take only one of two values. The episodes cover binary response variables, the uses and equation of logistic regression, fitting and evaluating logistic regression models with one continuous explanatory variable, making predictions, and assessing model fit and assumptions.
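
As a rough preview of the equation the episodes build up to, the model with one continuous explanatory variable can be sketched as follows. The symbols are assumptions chosen for illustration and may differ from the notation used in the episodes: $p$ is the probability of success, $x$ is the explanatory variable, and $\beta_0$ and $\beta_1$ are the intercept and slope.

$$
\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x
\qquad\Longleftrightarrow\qquad
p = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}
$$

Under this form, a one-unit increase in $x$ adds $\beta_1$ to the log odds, which is the same as multiplying the odds $p/(1-p)$ by $e^{\beta_1}$.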

Getting started

To get started, see the instructions in the Setup page. There you will learn how to obtain the data and packages used in this lesson.

Prerequisites

This lesson does not require a formal background in statistics.

This lesson requires:

Schedule

Setup: Download files required for the lesson

00:00 1. An introduction to binary response variables
    How can we calculate probabilities of success and failure?
    How do we interpret the expectation of a binary variable?
    How can we calculate and interpret the odds?
    How can we calculate and interpret the log odds?

01:00 2. An introduction to logistic regression
    In what scenario is a logistic regression model useful?
    How is the logistic regression model expressed in terms of the log odds?
    How is the logistic regression model expressed in terms of the probability of success?
    What is the effect of the explanatory variable in terms of the odds?

02:30 3. Logistic regression with one continuous explanatory variable
    How can we visualise the relationship between a binary response variable and a continuous explanatory variable in R?
    How can we fit a logistic regression model in R?
    How can we interpret the output of a logistic regression model in terms of the log odds in R?
    How can we interpret the output of a logistic regression model in terms of the multiplicative change in the odds of success in R?
    How can we visualise a logistic regression model in R?

03:20 4. Making predictions from a logistic regression model
    How can we calculate predictions from a logistic regression model manually?
    How can we calculate predictions from a logistic regression model in R?

04:00 5. Assessing logistic regression fit and assumptions
    How can we interpret McFadden’s $R^2$ and binned residual plots?
    What are the assumptions of logistic regression?

05:15 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.