Summary and Schedule
In the last few years, the profiling of a large number of genome-wide features in individual cells has become routine. Consequently, a plethora of tools for the analysis of single-cell data has been developed, making it hard to understand the critical steps in the analysis workflow and the best methods for each objective of one’s study.
This Carpentries-style tutorial aims to provide a solid foundation in using Bioconductor tools for single-cell RNA-seq (scRNA-seq) analysis by walking through various steps of typical workflows using example datasets.
This tutorial is based on the online book “Orchestrating Single-Cell Analysis with Bioconductor” (OSCA), which was published in 2020 and is continuously updated by many contributors from the Bioconductor community. Like the book, this tutorial aims to be of interest both to experimental biologists who want to analyze their own data and to bioinformaticians who are new to single-cell data.
Prerequisites
- Familiarity with R/Bioconductor, for example from the Introduction to data analysis with R and Bioconductor lesson.
- Familiarity with multivariate analysis and dimensionality reduction, as covered in Chapter 7 of the book Modern Statistics for Modern Biology by Holmes and Huber.
- Familiarity with the biology of gene expression and scRNA-seq, as covered in the review article A practical guide to single-cell RNA-sequencing by Haque et al.
If you use materials from this lesson in published research, please cite:
Amezquita RA, Lun ATL, Becht E, Carey VJ, Carpp LN, Geistlinger L, Marini F, Rue-Albrecht K, Risso D, Soneson C, Waldron L, Pagès H, Smith ML, Huber W, Morgan M, Gottardo R, Hicks SC. Orchestrating single-cell analysis with Bioconductor. Nature Methods, 2020. doi: 10.1038/s41592-019-0654-x
Setup Instructions | Download files required for the lesson | |
Duration: 00h 00m | 1. Introduction to Bioconductor and the SingleCellExperiment class |
What is Bioconductor? How is single-cell data stored in the Bioconductor ecosystem? What is a SingleCellExperiment object? |
Duration: 00h 30m | 2. Exploratory data analysis and quality control |
How do I examine the quality of single-cell data? What data visualizations should I use during quality control in a single-cell analysis? How do I prepare single-cell data for analysis? |
Duration: 01h 15m | 3. Cell type annotation |
How can we identify groups of cells with similar expression profiles? How can we identify genes that drive separation between these groups of cells? How can we leverage reference datasets and known marker genes to annotate cell types in new datasets? |
Duration: 02h 00m | 4. Multi-sample analyses |
How can we integrate data from multiple batches, samples, and studies? How can we identify differentially expressed genes between experimental conditions for each cell type? How can we identify changes in cell type abundance between experimental conditions? |
Duration: 02h 45m | 5. Working with large data |
How do we work with single-cell datasets that are too large to fit in memory? How do we speed up single-cell analysis workflows for large datasets? How do we convert between popular single-cell data formats? |
Duration: 02h 57m | 6. Accessing data from the Human Cell Atlas (HCA) | How can we obtain single-cell reference maps from the Human Cell Atlas? |
Duration: 03h 27m | Finish |
The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.
R and RStudio
R and RStudio are separate downloads and installations. R is a programming language together with the software that implements it. RStudio is a graphical integrated development environment (IDE) that makes using R easier and more interactive. You need to install R before you install RStudio. After installing both programs, you will need to install some R packages from within RStudio. If you need to install R and/or RStudio, there are platform-specific installation instructions in the Introduction to Bioconductor module.
Package installation
After installing R and RStudio, you need to install some packages that will be used during the workshop. Package installation is covered during the course, where these commands are explained. For now, simply start RStudio by double-clicking its icon and enter these commands:
R
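# Install the two helper packages from CRAN.
install.packages(c("BiocManager", "remotes"))

# Install the workshop packages. The entry containing "/" is a GitHub
# repository, which BiocManager installs with the help of remotes.
BiocManager::install(c("AUCell", "batchelor", "BiocStyle",
                       "CuratedAtlasQueryR", "DropletUtils",
                       "EnsDb.Mmusculus.v79", "MouseGastrulationData",
                       "scDblFinder", "Seurat", "lgeistlinger/SeuratData",
                       "SingleR", "TENxBrainData", "zellkonverter"),
                     Ncpus = 4)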
You can adjust Ncpus as needed for your machine.
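Once installation finishes, it can be useful to confirm that every package is actually available before the workshop starts. The following is a minimal sketch rather than part of the official setup: `missing_packages()` is a hypothetical helper written for this example, and it assumes that the GitHub repository `lgeistlinger/SeuratData` installs under the package name `SeuratData`.

```r
# Packages requested by the installation commands.
# Assumption: "lgeistlinger/SeuratData" installs as the package "SeuratData".
pkgs <- c("AUCell", "batchelor", "BiocStyle", "CuratedAtlasQueryR",
          "DropletUtils", "EnsDb.Mmusculus.v79", "MouseGastrulationData",
          "scDblFinder", "Seurat", "SeuratData", "SingleR",
          "TENxBrainData", "zellkonverter")

# Hypothetical helper: return the names of packages that are not installed.
# system.file(package = p) returns "" when package p cannot be found,
# so this check does not need to load any package.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs,
                      function(p) nzchar(system.file(package = p)),
                      logical(1))
  pkgs[!installed]
}

still_missing <- missing_packages(pkgs)
if (length(still_missing) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Not yet installed: ", paste(still_missing, collapse = ", "))
}
```

If any packages are reported as not yet installed, rerunning `BiocManager::install()` with just those names is usually the simplest next step.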