Workshop website: https://phipsonlab.github.io/single_cell_workshop/
Single-cell RNA sequencing (scRNA-seq) has revolutionised our ability to study gene expression at the resolution of individual cells, enabling the discovery of novel cell types and providing insights into the cellular composition of complex tissues. This workshop provides a comprehensive introduction to the computational analysis of scRNA-seq data using R and Bioconductor.
We analyse single-nucleus RNA-sequencing (snRNA-seq) data from human heart tissue across three developmental stages: foetal, young, and adult. The dataset comes from Sim et al. (2021, Circulation), a study of sex-specific control of human heart maturation.
This workshop is designed for researchers and students who:
- Have basic familiarity with R programming (data manipulation, plotting)
- Are interested in single-cell transcriptomics analysis
- Want to understand best practices for scRNA-seq data processing
No prior experience with single-cell analysis or Bioconductor is required. All concepts are introduced from first principles with detailed explanations.
| Resource | Minimum | Recommended |
|---|---|---|
| RAM | 8 GB | 16 GB |
| Disk space | 5 GB free | 10 GB free |
| R version | 4.3+ | 4.5.2 |
| RStudio | 2023.06+ | Latest |
Supported platforms: Windows 10/11, macOS 12+ (Intel and Apple Silicon), Ubuntu 22.04+ / equivalent Linux. A C/C++ build toolchain is required on each platform — Module 0 walks through the install (Rtools45 on Windows, Xcode Command Line Tools on macOS, build-essential + dev headers on Linux).
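
Before starting Module 0, a quick pre-flight check can confirm that the basics above are in place. The sketch below uses base R only; the helper name `check_environment()` is hypothetical, and looking for `make` on the `PATH` is only a rough proxy for a working build toolchain (an assumption, not the check Module 0 performs).

```r
# Minimal pre-flight check using base R only.
# The default threshold mirrors the "Minimum" R version in the table above.
check_environment <- function(min_r = "4.3.0") {
  # Compare the running R version against the minimum.
  r_ok <- getRversion() >= min_r
  # Assumption: finding `make` on the PATH suggests build tools are installed.
  make_ok <- unname(nzchar(Sys.which("make")))
  c(r_version_ok = r_ok, build_tools_found = make_ok)
}

status <- check_environment()
print(status)
```

If `build_tools_found` is `FALSE`, that does not necessarily mean the toolchain is missing (it may simply not be on the `PATH`), so treat it as a prompt to follow the Module 0 install steps rather than as a definitive diagnosis.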
| Module | Topic | Duration |
|---|---|---|
| Module 1 | Quality Control | 45 min |
| Break | | 10 min |
| Module 2 | Normalisation & Integration | 50 min |
| Break | | 10 min |
| Module 3 | Cell Type Annotation | 20 min |
| Module 4 | Differential Expression | 55 min |
| Wrap-up & Q&A | | 10 min |
| Module | Topic | Duration |
|---|---|---|
| Module 5 | Continuous Phenotyping with Φ-Space | 45 min |
| Module 6 | Pseudotime Trajectory Analysis | 45 min |
| Module 7 | Cell-specific Co-expression Networks (NeighbourNet) | 45 min |
By the end of this workshop, participants will be able to:
- Load and explore 10X Genomics scRNA-seq data in R using Seurat
- Calculate and interpret per-cell quality control metrics
- Apply appropriate filtering thresholds to remove low-quality cells
- Normalise data using SCTransform and correct batch effects with Harmony
- Perform graph-based clustering and visualise results with UMAP
- Annotate cell types using canonical marker genes
- Understand the pseudoreplication problem in single-cell differential expression
- Perform statistically rigorous differential expression analysis using pseudobulk methods
- Analyse cell type composition changes using propeller
- Replace hard cell-type labels with continuous Φ-Space phenotype scores
- Fit and compare pseudotime trajectories (slingshot, DPT) on multiple embeddings
- Build cell-specific co-expression meta-networks with NeighbourNet
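
To make the pseudobulk idea in the objectives above concrete: cells from the same sample are not independent replicates, so pseudobulk methods sum counts across cells within each sample and then treat the samples as the replicates. The sketch below illustrates only the aggregation step on simulated toy data in base R; the object names are hypothetical, and it is not the workshop's own code, which works on the annotated Seurat object and passes the aggregated counts to edgeR and limma.

```r
set.seed(1)

# Toy data (simulated): 4 genes x 12 cells, with 4 cells from each of 3 samples.
counts <- matrix(
  rpois(48, lambda = 5),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:12))
)
sample_id <- rep(c("s1", "s2", "s3"), each = 4)

# rowsum() sums rows by group, so transpose to cells x genes,
# sum within each sample, then transpose back to genes x samples.
pseudobulk <- t(rowsum(t(counts), group = sample_id))

dim(pseudobulk)
```

The result has one column per sample rather than one per cell, which is the unit at which the downstream statistical test is applied.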
The workshop uses snRNA-seq data from human heart tissue (Sim et al., 2021):
| Group | Samples | Age Range | Description |
|---|---|---|---|
| Foetal | 3 | 19-20 weeks | Developing heart |
| Young | 3 | 4-14 years | Postnatal maturation |
| Adult | 3 | 35-42 years | Mature heart |
Total: 9 samples, ~47,000 nuclei after quality control
| Analysis Step | Method | Package |
|---|---|---|
| Quality control | Per-cell metrics, filtering | Seurat |
| Normalisation | SCTransform v2 | Seurat, glmGamPoi |
| Batch correction | Harmony | harmony |
| Dimensionality reduction | PCA, UMAP | Seurat |
| Clustering | Louvain algorithm | Seurat |
| Cell type annotation | Marker-based (manual) | Seurat |
| Differential expression | Pseudobulk + limma-voom | edgeR, limma |
| Composition analysis | propeller | speckle |
| Soft annotation | PLS on reference atlas | PhiSpace |
| Pseudotime | Principal curves + diffusion pseudotime | slingshot, destiny |
| Co-expression networks | Cell-specific networks + meta-networks | NeighbourNet |
Please complete setup at least one day before the workshop.
- Clone or download this repository (Windows users: clone to a short path like `C:\workshop\` to avoid the 260-character path limit).
- Open `single_cell_workshop.Rproj` in RStudio.
- Follow Module 0: Setup from start to finish.
The setup runs as a single unified flow that covers both sessions:
- Step 2: System build tools (Rtools45 on Windows, Xcode CLT on macOS, `build-essential` on Linux). Required because a few packages compile from source.
- Step 3: R packages. `renv::restore()` for the locked core, then `BiocManager::install(...)` + `remotes::install_github(...)` for the extras (`PhiSpace`, `NeighbourNet`, `slingshot`, `destiny`, `scater`, `ComplexHeatmap`).
- Step 4: Workshop data from Zenodo (~420 MB).
Total time: roughly 20–40 minutes depending on whether the GitHub-only packages need to compile from source.
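
The package step above could be sketched roughly as follows. This is an illustration, not the Module 0 code: the helper names `missing_packages()` and `install_extras()` are hypothetical, and the Bioconductor/GitHub split follows the package table in this README. The install function is defined but deliberately not called, since it needs network access and the `renv`, `BiocManager`, and `remotes` packages.

```r
# Extras named in the setup description, grouped by source.
bioc_extras   <- c("slingshot", "destiny", "scater", "ComplexHeatmap")
github_extras <- c(PhiSpace = "jiadongm/PhiSpace",
                   NeighbourNet = "meiosis97/NeighbourNet")

# Hypothetical helper: return the packages that are not installed.
# system.file() returns "" when a package cannot be found.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                      logical(1))
  unname(pkgs[!installed])
}

# Hypothetical wrapper around the three install calls; run it manually.
install_extras <- function() {
  renv::restore()  # locked core packages from renv.lock

  bioc_missing <- missing_packages(bioc_extras)
  if (length(bioc_missing) > 0) BiocManager::install(bioc_missing)

  gh_missing <- missing_packages(names(github_extras))
  if (length(gh_missing) > 0) {
    remotes::install_github(unname(github_extras[gh_missing]))
  }
}

# Report what is still missing without installing anything.
print(missing_packages(c(bioc_extras, names(github_extras))))
```

Checking for missing packages first avoids recompiling the GitHub-only packages when they are already present, which is the slow part of the setup.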
A separate Zenodo record hosts pre-computed checkpoints so you can start at any module boundary — useful for skipping straight to a particular technique, or for starting Session 2 without first running Session 1. Each file replaces the output of one or more upstream modules:
| File | Lets you skip |
|---|---|
| `01_qc_filtered.rds` | Module 1 |
| `02_integrated_clustered.rds` | Modules 1 + 2 |
| `03_annotated.rds` | Modules 1 + 2 + 3 |
| `afternoonSession.zip` | All of Session 1 (start at Module 5) |
`afternoonSession.zip` contains the Session 2 (Modules 5, 6, and 7) inputs and
intermediate results. Download and unzip it; the files land in `data/` and
`results/` per the instructions in Module 0.
The download chunk lives in Module 0, "Optional: Backup Checkpoints".
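
Once a checkpoint has been downloaded, resuming from it amounts to reading the `.rds` file in place of running the upstream modules. The sketch below is a hedged illustration: `load_checkpoint()` is a hypothetical helper, and the default `data` directory is an assumption, so use whichever folder Module 0 actually places the file in.

```r
# Hypothetical helper: read a checkpoint, failing early with a clear message.
# Assumption: checkpoints sit in `data/`; adjust `dir` to match Module 0.
load_checkpoint <- function(file, dir = "data") {
  path <- file.path(dir, file)
  if (!file.exists(path)) {
    stop("Checkpoint not found: ", path,
         ". See Module 0, 'Optional: Backup Checkpoints'.")
  }
  readRDS(path)
}

# Example (not run here): start Module 3 from the Modules 1 + 2 checkpoint.
# seu <- load_checkpoint("02_integrated_clustered.rds")
```

Failing early on a missing file gives a clearer error than the one `readRDS()` raises on its own, and points back to the download step.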
The core packages are pinned in renv.lock for reproducibility. The
afternoon-session extras are installed at the latest Bioconductor 3.22 /
GitHub HEAD versions (see Module 0 Step 3b).
| Package | Source | Package | Source |
|---|---|---|---|
| R 4.5.2 | renv.lock | Bioconductor 3.22 | renv.lock |
| Seurat 5.4.0 | renv.lock | edgeR 4.8.2 | renv.lock |
| SeuratObject 5.3.0 | renv.lock | limma 3.66.0 | renv.lock |
| harmony 1.2.4 | renv.lock | speckle 1.10.0 | renv.lock |
| glmGamPoi 1.22.0 | renv.lock | | |
| ComplexHeatmap | Bioconductor | slingshot | Bioconductor |
| destiny | Bioconductor | scater | Bioconductor |
| PhiSpace | GitHub (jiadongm/PhiSpace) | NeighbourNet | GitHub (meiosis97/NeighbourNet) |
| Module | Topic | Description |
|---|---|---|
| Module 0 | Setup | Environment setup and data download |
| Module 1 | Quality Control | QC metrics, cell filtering |
| Module 2 | Integration | Normalisation, batch correction, clustering |
| Module 3 | Annotation | Marker genes and cell type assignment |
| Module 4 | DE Analysis | Pseudobulk DE and composition analysis |
| Module | Topic | Description |
|---|---|---|
| Module 5 | Continuous Phenotyping with Φ-Space | Soft cell-type + stage scores via PLS on a reference atlas |
| Module 6 | Pseudotime Trajectory Analysis | Slingshot and DPT on PCA and Φ-Space embeddings |
| Module 7 | Cell-specific Co-expression Networks | NeighbourNet meta-networks from maturation-associated targets |
If you use materials from this workshop, please cite:
Original dataset:
Sim CB, Phipson B, Ziemann M, et al. Sex-Specific Control of Human Heart Maturation by the Progesterone Receptor. Circulation. 2021;143(10):1614-1628. doi:10.1161/CIRCULATIONAHA.120.051921
This workshop was developed by the Phipson Lab using data from the Porrello and Hewitt laboratories. We thank the original authors for making their data publicly available.
This project is licensed under the MIT License - see the LICENSE file for details.