This lesson is still being designed and assembled (Pre-Alpha version)

Statistical Inference for Biology: Setup


R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis. To interact with R, we use RStudio.

  1. Install the latest version of R from CRAN.

  2. Install the latest version of RStudio. Choose the free RStudio Desktop version for Windows, Mac, or Linux.

  3. Start RStudio and install the tidyverse, a collection of packages that work together for everyday use in data science. You can install it from the Console or from the RStudio Packages tab.
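For the Console route in step 3, a minimal sketch is shown here. It assumes an internet connection and that R can reach a CRAN mirror; RStudio may ask you to choose one.

```r
# Install the tidyverse collection of packages from CRAN.
# This can take several minutes the first time.
install.packages("tidyverse")
```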


Make sure that the installation was successful by loading the tidyverse. Do this in the Console, or check the box next to tidyverse in the RStudio Packages tab.
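Loading from the Console might look like the following. The call to packageVersion() is an optional extra check, not part of the lesson's required steps.

```r
# Load the tidyverse; an error here means the installation did not succeed.
library(tidyverse)

# Optional: show which version was installed.
packageVersion("tidyverse")
```

If the library loads without an error, the installation worked. Startup messages about attached packages and conflicts are normal.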


Also install and load the downloader, UsingR, and rafalib packages, following the same procedure that you used for the tidyverse.
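Applied to these three packages, the same Console procedure could be sketched as below. It assumes all three are installed from CRAN, as the tidyverse was.

```r
# Install the three additional packages in one call.
install.packages(c("downloader", "UsingR", "rafalib"))

# Load each package to confirm that it installed correctly.
library(downloader)
library(UsingR)
library(rafalib)
```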

Data files and project organization

  1. Make a new folder on your Desktop called inference. Move into this new folder.

  2. Create a data folder to hold the data, a scripts folder to house your scripts, and a results folder to hold results.

Alternatively, you can complete steps 1 and 2 from the R console.
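One way to carry out steps 1 and 2 from the R console is sketched here. The path ~/Desktop is an assumption about where your Desktop lives, so adjust project_dir (a name chosen for this example) if your system differs.

```r
# Assumption: the Desktop is at ~/Desktop. On Windows, "~" often points to
# the Documents folder instead, so edit this path if needed.
project_dir <- file.path("~", "Desktop", "inference")

# Step 1: create the inference folder and move into it.
dir.create(project_dir, recursive = TRUE, showWarnings = FALSE)
setwd(project_dir)

# Step 2: create folders for data, scripts, and results.
for (folder in c("data", "scripts", "results")) {
  dir.create(folder, showWarnings = FALSE)
}
```

Setting showWarnings = FALSE keeps the commands quiet if a folder already exists, so the block can be rerun safely.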


Please download the following files and place them in your data folder. You can download each file from its URL below and move it into the data folder as you would any other downloaded file.

Alternatively, you can copy and paste the following into the R console to download the data.

download.file(url = "", destfile = "data/femaleMiceWeights.csv")

download.file(url = "", destfile = "data/femaleControlsPopulation.csv")
download.file(url = "", destfile = "data/mice_pheno.csv")
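As an optional check that is not part of the original instructions, you could confirm that the three files landed in the data folder. This sketch assumes your working directory is the inference folder; data_files and downloaded are names chosen for this example.

```r
# Build the expected paths for the three data files.
data_files <- file.path("data", c("femaleMiceWeights.csv",
                                  "femaleControlsPopulation.csv",
                                  "mice_pheno.csv"))

# TRUE means the file is present; FALSE means it still needs downloading.
downloaded <- file.exists(data_files)
names(downloaded) <- data_files
print(downloaded)
```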