This lesson is still being designed and assembled (Pre-Alpha version)

Data Science for Practicing Clinicians: Setup

Pre-requisites for Data Science for Doctors.

Data Science for Doctors teaching is hands-on, so participants need to bring their own laptops to ensure the proper setup of tools for an efficient workflow. In exchange, you will leave with a working version of R and the skills to use it.

During the course we will use RStudio Cloud. We encourage you to start a new project from the template we have already prepared for you. You can then download this project to your own computer to use as an exemplar for future work.

We recommend you install the programmes below before the course begins.

Please note all of the requirements specified below are free. Get in contact if you are about to hand over any money when setting up the pre-requisites, because you shouldn’t be.

Pre-course Objectives

Steps

1. Charge your laptop

We’ll provide a power bar, but it will be easier if you begin with a fully charged laptop. Don’t forget to bring both your laptop and its power cord, as you will be using the laptop continuously throughout the two days of the course.

Please do not come with an iPad. It is not sufficient for the course.

2. Install R and RStudio

We’re going to get you up and running in R, which is a powerful, free computer language that is particularly well suited to data science and statistics. It is favoured the world over for:

Download and install R from here

Download and install RStudio. This is a nice interface for R, and the easiest way to use it. Download it here. There should be an ‘installer’ for your operating system.

Operating system specific instructions and links are detailed below

Windows

Install R by downloading and running this .exe file from CRAN. Also, please install the RStudio IDE.

Mac OS X

Install R by downloading and running this .pkg file from CRAN. Also, please install the RStudio IDE.

Linux

You can download the binary files for your distribution from CRAN. Or you can use your package manager (e.g. for Debian/Ubuntu run sudo apt-get install r-base and for Fedora run sudo dnf install R, or sudo yum install R on older releases). Also, please install the RStudio IDE.
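
If you are on Linux or Mac OS X and are comfortable opening a terminal, you can check that the install worked before the course. The following is a minimal sketch that only tests whether the R and Rscript commands are available on your PATH. The helper name check_tool is our own, made up for illustration; it is not part of R or RStudio.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not part of R or RStudio).
# It reports whether a command with the given name can be found on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}

# R and Rscript are both installed as part of a standard R installation.
check_tool R
check_tool Rscript
```

If R is reported as found, running R --version in the same terminal prints the installed version. On Windows, simply open RStudio instead: the console pane shows the R version in its startup message.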

3. Download and install a spreadsheet program

This only applies if you don’t already have Excel or similar installed.

You are likely to interact with data via a spreadsheet, as it is the most common method for people to store data. To interact with spreadsheets, we can use LibreOffice, Microsoft Excel, Apple Numbers, Gnumeric, OpenOffice.org, or other programs. Commands may differ a bit between programs, but general ideas for thinking about spreadsheets are the same.

For this lesson, if you don’t have a spreadsheet program already, you can use LibreOffice. It’s a free, open source spreadsheet program.

4. Join our group on Slack

Slack is a geeky version of WhatsApp, but much better for coding and collaborating. We will use it to share files. You can use it to chat, ask questions, share material, and provide feedback. You should receive an email invite to our group before the course.

5. Choose the dataset that you want to bring on the course

The final workshop of this course allows you to practice on your own data. You can either come in with your own dataset, and let us help you get started, or you can use the synthetic CC-HIC dataset provided.

If you have problems with installing software, there is a small amount of setup time allocated for this at the beginning of the course. Bring your laptop, and we will be happy to help.

See you there!