This lesson introduces open data science using R, RStudio, and GitHub so you can work with data in an open, reproducible, and collaborative way. “Open data science” means that methods, data, and code are available so that others can access, reuse, and build on them without much fuss. Here you will learn a workflow with R, RStudio, git, and GitHub. Much of this lesson is based on a publication by Lowndes et al. (2017) in Nature Ecology & Evolution, “Our path to better science in less time using open data science tools”, and on the Ocean Health Index book - Introduction to Open Data Science.
This is going to be fun, because learning these open data science tools and good practices is empowering! This training material is written (and continuously updated) so that you can use it for self-paced learning or to teach an in-person workshop where the instructor live codes. Either way, you should do everything hands-on on your own computer as you learn.
Before you begin, be sure you are all set up (see below).
Before you start
- Before the training, please make sure you have completed the Setup instructions.
- There are two options for installing the necessary software and packages:
  - (Option 1) The preferred option is to install everything using a custom Docker image built for this training, available from Docker Hub.
  - (Option 2) Alternatively, you can install everything manually. Follow the Setup instructions to do so.
- Please read the workshop Code of Conduct to make sure this workshop stays welcoming for everybody.
- Get comfortable: if you’re not in a physical workshop, set yourself up with two screens if possible. You will be working in RStudio on your own computer while following this tutorial at the same time.