Summary and Setup
In the last few years, the profiling of a large number of genome-wide features in individual cells has become routine. Consequently, a plethora of tools for the analysis of single-cell data has been developed, making it hard to understand the critical steps in the analysis workflow and the best methods for each objective of one’s study.
This Carpentries-style tutorial aims to provide a solid foundation in using Bioconductor tools for single-cell RNA-seq (scRNA-seq) analysis by walking through various steps of typical workflows using example datasets.
This tutorial is based on the online book “Orchestrating Single-Cell Analysis with Bioconductor” (OSCA), published in 2020 and continuously updated by many contributors from the Bioconductor community. Like the book, this tutorial is aimed both at experimental biologists who want to analyze their own data and at bioinformaticians who are approaching single-cell data for the first time.
Prerequisites
- Familiarity with R/Bioconductor, such as the Introduction to data analysis with R and Bioconductor lesson.
- Familiarity with multivariate analysis and dimensionality reduction, such as Chapter 7 of the book Modern Statistics for Modern Biology by Holmes and Huber.
- Familiarity with the biology of gene expression and scRNA-seq, such as the review article A practical guide to single-cell RNA-sequencing by Haque et al.
If you use materials from this lesson in published research, please cite:
Amezquita RA, Lun ATL, Becht E, Carey VJ, Carpp LN, Geistlinger L, Marini F, Rue-Albrecht K, Risso D, Soneson C, Waldron L, Pagès H, Smith ML, Huber W, Morgan M, Gottardo R, Hicks SC. Orchestrating single-cell analysis with Bioconductor. Nature Methods, 2020. doi: 10.1038/s41592-019-0654-x
R and RStudio
You need to install R and RStudio from the links provided. They are separate downloads and installations. R is a programming language and the collection of software that implements that language. RStudio is a graphical integrated development environment (IDE) that makes using R easier and more interactive. You need to install R before you install RStudio. After installing both programs, you will need to install some R packages from within RStudio. There are additional platform-specific details in the Introduction to Bioconductor module.
Package installation
After installing R and RStudio, you need to install some packages that will be used during the workshop. Package installation is covered later in the course, where the following commands are explained. For now, simply start RStudio by double-clicking its icon and enter these commands:
R
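# BiocManager installs Bioconductor packages; remotes is needed for GitHub packages
install.packages(c("BiocManager", "remotes"))
# Install the workshop packages; "user/repo" entries are installed from GitHub
BiocManager::install(c("AUCell", "batchelor", "BiocStyle",
                       "CuratedAtlasQueryR", "DropletUtils", "duckdb",
                       "EnsDb.Mmusculus.v79", "MouseGastrulationData",
                       "scDblFinder", "Seurat", "lgeistlinger/SeuratData",
                       "SingleR", "TENxBrainData", "zellkonverter"),
                     Ncpus = 4)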
Ncpus sets the number of parallel processes used when installing packages from source; adjust it as needed for your machine.
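Once the installation has finished, it can be useful to check that nothing was skipped. The following is a minimal sketch, not part of the original lesson, that uses only base R: it suggests a value for Ncpus and lists any workshop packages that are not yet installed. The variable names (`ncpus`, `pkgs`, `not_installed`) are illustrative, and the sketch assumes that the GitHub package `lgeistlinger/SeuratData` installs under the name `SeuratData`.

```r
# Suggest a value for Ncpus: leave one core free, and fall back to 1
# if the number of cores cannot be determined (detectCores() may return NA).
n_cores <- parallel::detectCores()
ncpus <- if (is.na(n_cores)) 1L else max(1L, n_cores - 1L)
message("Suggested Ncpus: ", ncpus)

# Packages requested in the installation command above.
# Assumption: "lgeistlinger/SeuratData" installs as "SeuratData".
pkgs <- c("AUCell", "batchelor", "BiocStyle", "CuratedAtlasQueryR",
          "DropletUtils", "duckdb", "EnsDb.Mmusculus.v79",
          "MouseGastrulationData", "scDblFinder", "Seurat", "SeuratData",
          "SingleR", "TENxBrainData", "zellkonverter")

# Compare against the packages found in the current R library paths.
not_installed <- setdiff(pkgs, rownames(installed.packages()))

if (length(not_installed) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Not yet installed: ", paste(not_installed, collapse = ", "))
}
```

If some packages are reported as missing, re-running the BiocManager::install() command above for just those names is usually enough. BiocManager::valid() can additionally report packages that are out of date for your Bioconductor version.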