This lesson is in the early stages of development (Alpha version)

Statistical Inference for Biology

More data are better than less data, right? When the data are interpreted with sound analytical skills, the answer can be yes. Without these skills, analysts can be fooled by patterns in “big data” that appear by chance. This lesson presents statistical skills and knowledge that help data analysts in the life sciences avoid some of the most common pitfalls of big data. Lesson material is derived from the HarvardX Biomedical Data Science series, part of which is published as the book Data Analysis for the Life Sciences (Irizarry & Love, 2016).

Prerequisites

This lesson assumes basic skills in the R statistical programming language and the RStudio integrated development environment.

To get started, follow the directions in the Setup tab to get access to the required software and data for this workshop.

Schedule

Setup
    Download files required for the lesson
00:00  1. Introduction
    What is statistical inference?
    Why do biomedical researchers need to learn statistics now?
00:05  2. Inference
    What does inference mean?
    Why do we need p-values and confidence intervals?
    What is a random variable?
    What exactly is a distribution?
02:00  3. Populations, Samples and Estimates
    What is a parameter from a population?
    What are sample estimates?
    How can we use sample estimates to make inferences about population parameters?
02:00  4. Central Limit Theorem and the t-distribution
    What is a parameter from a population?
02:00  5. Central Limit Theorem in practice
    How is the CLT used in practice?
02:00  6. t-tests in practice
    How are t-tests used in practice?
02:00  7. Confidence Intervals
    What is a confidence interval?
02:00  8. Power Calculations
    What is statistical power?
    How is power calculated?
02:00  9. Monte Carlo simulation
    How are Monte Carlo simulations used in practice?
02:00  10. Permutations
    ?
02:00  11. Association tests
    ?
02:00  12. Exploratory Data Analysis
    ?
02:00  13. Plots to avoid
    ?
02:00  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.

This lesson was funded by NIH grant R25GM123516 (Churchill) and The Jackson Laboratory Director's Innovation Fund (McClatchy & Churchill).