This page provides instructions for setting up the computational environment needed to run the preprocessing pipelines and analysis tutorials. The workflow requires both R and Python packages.
If you’re primarily interested in running the tutorials on pre-processed data (fast), you only need the R dependencies. If you also want to preprocess the data from scratch (slow), you will need to complete the full R + Python setup.
All R dependencies can be installed with a single setup script. The package requirements are largely identical for both COVID-19 and MERFISH analyses.
Run this R script once to install all required packages:
    # For COVID-19 scRNA-seq analysis
    source("R/covid/R_Package_Installation_Pipeline.R")
An equivalent installation script is provided for the MERFISH colitis analysis.
Note: Both scripts install the same core packages with identical versions. Running either one will set you up for both analyses.
From CRAN (15 packages):
- dplyr, tidyr, tibble
- Matrix
- ggplot2, cowplot, patchwork, GGally, ggridges, pheatmap, circlize, viridis
- reticulate, remotes, here
From Bioconductor (14 packages):
- SingleCellExperiment, SpatialExperiment, HDF5Array, DelayedArray, BiocParallel, BiocSingular
- scran, scater, SingleR
- ComplexHeatmap
- biomaRt, zellkonverter, MerfishData
- scDiagnostics (devel v1.5.1)
From GitHub (2 packages):
- Seurat (v5.3.1.0 - specific version for Azimuth compatibility)
- Azimuth (reference-based annotation)
After running the script, verify everything is installed correctly:
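One way to run such a check from the command line, for example on a cluster login node, is sketched below. It assumes `Rscript` is on your PATH; the package subset in `PKGS` and the helper name `check_r_packages` are illustrative choices, not part of the repository.

```shell
# Sketch: report which of a few key R packages can be loaded.
# PKGS is an illustrative subset of the packages listed above; extend as needed.
PKGS="Seurat Azimuth SingleCellExperiment SpatialExperiment scDiagnostics zellkonverter"

# check_r_packages is a hypothetical helper name used only in this example.
check_r_packages() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found on PATH" >&2
    return 1
  fi
  for pkg in $PKGS; do
    # requireNamespace() returns FALSE (rather than raising an error) when a
    # package is absent, so the exit status is 0 only if the package loads.
    if Rscript -e "quit(status = as.integer(!requireNamespace('$pkg', quietly = TRUE)))" >/dev/null 2>&1; then
      echo "ok       $pkg"
    else
      echo "MISSING  $pkg"
    fi
  done
}

check_r_packages || echo "could not check R packages" >&2
```

Any package reported as MISSING can then be installed individually from within R.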
Only needed if you run CellTypist or scVI/scArches annotation from scratch.
This is the recommended approach for the complete preprocessing pipeline.
Important: scVI/scArches is computationally intensive and is designed to run on GPU hardware. We performed all analyses on GPU nodes (NVIDIA L40S GPUs on HMS O2 cluster). CPU-only execution will be significantly slower.
Step 1: Create the conda environment
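A minimal sketch of this step, assuming it is run from the repository root where `environment-scvi.yml` lives. The environment name `scvi-env` matches the conda commands used elsewhere on this page; the helper name `create_env_if_missing` is hypothetical. Creation is skipped if the environment already exists, so the snippet is safe to re-run.

```shell
# Sketch: create scvi-env once; skip creation if it already exists.
ENV_NAME="scvi-env"               # name used by the conda commands on this page
ENV_FILE="environment-scvi.yml"   # located in the repository root

# create_env_if_missing is a hypothetical helper name used only in this example.
create_env_if_missing() {
  if ! command -v conda >/dev/null 2>&1; then
    echo "conda not found on PATH" >&2
    return 1
  fi
  # 'conda env list' prints the environment name in the first column.
  if conda env list | awk '{print $1}' | grep -qx "$ENV_NAME"; then
    echo "$ENV_NAME already exists; nothing to do"
  else
    conda env create -f "$ENV_FILE"
  fi
}

create_env_if_missing || echo "environment was not created" >&2
```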
Step 2: Activate it when running Python scripts
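In an interactive shell, `conda activate scvi-env` is usually enough. Inside a non-interactive script such as a batch job, conda's shell functions generally have to be loaded first. The sketch below shows one common pattern, assuming a standard conda installation layout; the final import check is an illustrative addition.

```shell
# Sketch: activate scvi-env inside a non-interactive script.
ENV_NAME="scvi-env"

if command -v conda >/dev/null 2>&1; then
  # 'conda activate' relies on shell functions that non-interactive shells
  # do not load automatically, so source conda's setup script first.
  . "$(conda info --base)/etc/profile.d/conda.sh"
  if conda activate "$ENV_NAME"; then
    # Quick sanity check that scvi-tools is importable in the active environment.
    python -c "import scvi; print('scvi-tools', scvi.__version__)" \
      || echo "scvi-tools is not importable in $ENV_NAME" >&2
  else
    echo "could not activate $ENV_NAME; has it been created?" >&2
  fi
else
  echo "conda not found on PATH" >&2
fi
```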
The environment-scvi.yml file is in the repository root and includes:
- scvi-tools and scarches (GPU-enabled annotation tools)
- scanpy (data analysis)
GPU Access: Ensure your compute environment has GPU access. Example SLURM submission on O2:
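A hedged sketch of what a GPU job script could look like. The partition name, resource amounts, time limit, and the script name `your_scvi_script.py` are placeholders rather than values from this project; check your cluster's documentation for the correct partition and limits. The snippet only writes the job script to disk so it can be reviewed before submission.

```shell
# Sketch: write a GPU job script for later submission with sbatch.
# All resource values below are assumptions; adjust them for your cluster.
JOB_SCRIPT="run_scvi.sbatch"

cat > "$JOB_SCRIPT" <<'EOF'
#!/bin/bash
#SBATCH --job-name=scvi-annotation
# Assumed partition name; replace with your cluster's GPU partition.
#SBATCH --partition=gpu
# Request a single GPU.
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=04:00:00

# Load conda's shell functions, then activate the environment.
source "$(conda info --base)/etc/profile.d/conda.sh"
conda activate scvi-env

# Record which GPU the job received.
nvidia-smi

# Hypothetical script name; replace with the annotation script you want to run.
python your_scvi_script.py
EOF

echo "Wrote $JOB_SCRIPT; submit it with: sbatch $JOB_SCRIPT"
```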
To use CellTypist separately (CPU-compatible), run the Python environment setup script:
The environmentSetupCellTypist() function:
- Creates the celltypist_env conda environment
- Installs scanpy, pandas, numpy, celltypist
If you prefer manual setup:
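A manual setup along these lines might look as follows. The environment name and package list come from the description above; the Python version is an assumption, and the setup function may pin different versions.

```shell
# Sketch of a manual CellTypist setup.
ENV_NAME="celltypist_env"
PY_VERSION="3.10"   # assumed; use a version supported by celltypist
PACKAGES="scanpy pandas numpy celltypist"

if command -v conda >/dev/null 2>&1; then
  conda create -y -n "$ENV_NAME" "python=$PY_VERSION"
  # 'conda run' executes pip inside the new environment without activating it.
  # $PACKAGES is left unquoted on purpose so each name becomes its own argument.
  conda run -n "$ENV_NAME" python -m pip install $PACKAGES
else
  echo "conda not found on PATH" >&2
fi
```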
R package installation failure:
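- Make sure system build tools are installed (Rtools on Windows, build-essential on Linux, Xcode on Mac)
- Install the failing package individually: BiocManager::install("package_name")
Python environment issues:
- Check that conda is installed: conda --version
- Clear the conda cache: conda clean --all
- Recreate the environment: conda env remove -n scvi-env && conda env create -f environment-scvi.yml
- Check GPU availability: nvidia-smi (if running on HPC cluster, check with module list)
reticulate can’t find Python: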
In R, explicitly set the Python path:
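Within R this is typically done with `reticulate::use_python()` or `reticulate::use_condaenv()` before any Python code is run. A shell-side alternative, sketched here, is to export the `RETICULATE_PYTHON` environment variable before starting R, which reticulate reads when it initialises Python. The choice of `celltypist_env` is an assumption; substitute whichever environment you want R to use.

```shell
# Sketch: point reticulate at a specific interpreter via RETICULATE_PYTHON.
ENV_NAME="celltypist_env"   # assumed target environment

# Ask the environment's own Python for its executable path.
if command -v conda >/dev/null 2>&1 \
   && PYTHON_BIN="$(conda run -n "$ENV_NAME" python -c 'import sys; print(sys.executable)')"; then
  export RETICULATE_PYTHON="$PYTHON_BIN"
  echo "RETICULATE_PYTHON=$RETICULATE_PYTHON"
  # Optionally confirm from R which interpreter reticulate picks up.
  if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'reticulate::py_config()' || echo "reticulate check failed" >&2
  fi
else
  echo "could not resolve the Python interpreter for $ENV_NAME" >&2
fi
```

The variable must be set in the same shell session that launches R (or in your shell profile) to take effect.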