Add cell cycle scoring to QC subworkflow #266

Merged
nictru merged 9 commits into nf-core:dev from tlebchan:nfch2026-ru-team-cell-cycle
Mar 14, 2026
Contributor

@tlebchan tlebchan commented Mar 13, 2026

Implements per-cell cell cycle scoring as a QC step, placing it after doublet detection within the QUALITY_CONTROL subworkflow. Scores are stored as obs columns and merged into the final h5ad via the existing FINALIZE_QC_ANNDATAS mechanism, making them available as covariates in downstream steps (e.g. --S_score,G2M_score).

Implementation details

New module: SCANPY_CELLCYCLE

  • Wraps sc.tl.score_genes_cell_cycle() from Scanpy (Tirosh et al. 2015 gene sets, same as Seurat)
  • Outputs a .pkl of obs columns (S_score, G2M_score, phase) that flows into FINALIZE_QC_ANNDATAS alongside cell type annotations — the module does not rewrite the main h5ad
  • Also emits an intermediate .h5ad for inspection during development
  • Handles datasets where gene symbols are stored in a var column rather than the index
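The phase column follows from the two scores. A minimal, dependency-free sketch of the rule that Scanpy's score_genes_cell_cycle applies: a cell is called S by default, G2M when its G2M score exceeds its S score, and G1 when both scores are negative. The function name assign_phase, the barcodes, and the score values are illustrative assumptions and are not taken from the module.

```python
def assign_phase(s_score: float, g2m_score: float) -> str:
    """Return "G1", "S" or "G2M" for one cell, given its two scores."""
    # Both scores negative: the cell expresses neither program, so call it G1.
    if s_score < 0 and g2m_score < 0:
        return "G1"
    # Otherwise the higher score wins; ties fall back to S.
    if g2m_score > s_score:
        return "G2M"
    return "S"


# Hypothetical per-cell scores keyed by barcode (illustrative values only).
scores = {
    "cell_a": (0.42, -0.10),
    "cell_b": (-0.05, 0.31),
    "cell_c": (-0.20, -0.15),
}

# Per-cell columns in the shape described above: S_score, G2M_score, phase.
obs_columns = {
    barcode: {"S_score": s, "G2M_score": g2m, "phase": assign_phase(s, g2m)}
    for barcode, (s, g2m) in scores.items()
}
```

Keeping only these three columns per cell is what lets the result be merged into the final h5ad later without rewriting the expression matrix.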

Gene lists as bundled assets

  • Human and mouse S-phase and G2M gene lists are stored as plain-text files under assets/cell_cycle_genes/, making them transparent, auditable, and version-controlled
  • Power users can supply custom lists via --s_genes / --g2m_genes (e.g. for rat or zebrafish homologs)
  • Species is selected via --species (default: human); the pipeline resolves the correct asset files automatically
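The resolution order described above (a custom list wins, otherwise the bundled list for the selected species) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline's code: the file naming scheme `<species>_<set>.txt`, the one-symbol-per-line format, and the helper names resolve_gene_list and parse_gene_list are all hypothetical.

```python
from pathlib import Path
from typing import Optional

# Directory named in the PR description; the file names inside it are assumed.
ASSET_DIR = Path("assets/cell_cycle_genes")
SUPPORTED_SPECIES = ("human", "mouse")


def resolve_gene_list(species: str, gene_set: str, custom: Optional[str] = None) -> Path:
    """Pick a gene list file: a custom path wins, else the bundled one."""
    if custom:
        return Path(custom)
    if species not in SUPPORTED_SPECIES:
        raise ValueError(f"no bundled gene lists for species {species!r}")
    return ASSET_DIR / f"{species}_{gene_set}.txt"


def parse_gene_list(text: str) -> list[str]:
    """Parse one gene symbol per line, skipping blank lines (assumed format)."""
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Under these assumptions, `resolve_gene_list("mouse", "s_genes")` points at the bundled mouse file, while passing `custom="my_s_genes.txt"` bypasses the species lookup entirely, which is what makes rat or zebrafish lists usable.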

Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| --species | human | Selects bundled gene lists; human or mouse |
| --cell_cycle_scoring | true | Enable/disable the step |
| --s_genes | (auto from species) | Path to custom S-phase gene list |
| --g2m_genes | (auto from species) | Path to custom G2M gene list |

Usage

# human (default)
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results

# mouse
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results --species mouse

# custom gene lists
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results \
    --s_genes my_s_genes.txt --g2m_genes my_g2m_genes.txt

# skip cell cycle scoring
nextflow run nf-core/scdownstream --input samplesheet.csv --outdir results \
    --cell_cycle_scoring false

Testing

  • Module tests (modules/local/scanpy/cellcycle/tests/) cover human scoring and stub run
  • Subworkflow tests (subworkflows/local/quality_control/tests/) updated with the 3 new inputs; a dedicated "Should run with cell cycle scoring" test added with cell_cycle_scoring = true
  • All tests pass locally with --profile docker,test

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs?
  • If necessary, also make a PR on the nf-core/scdownstream branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@tlebchan tlebchan requested a review from nictru as a code owner March 13, 2026 14:54
# Versions

versions = {
"${task.process}": {"python": platform.python_version(), "scanpy": sc.__version__}
Collaborator

You did not pin the Python version in the conda env; this might break tests when they are run via conda. You can either remove the Python version from the version capture or pin it in the environment yml file.

Contributor Author

I will fix it the same way as in the scanpy/pca module:
- conda-forge::python=3.12.11

Just react with an emoji and I will push these changes :)

Collaborator

@nictru nictru left a comment

Really good job overall!

@tlebchan
Contributor Author

@nictru I ran into a small problem while gathering test snapshots. I was running nf-test with the command nf-test test tests/main_pipeline_reference_mapping.nf.test --profile docker --update-snapshot --verbose, but for some reason singler does not run at all during these tests (no outputs and no versions in the snapshot). However, when I add test to --profile, singler runs and everything is fine. I still don't quite understand the underlying reason.

Am I right that the GitHub tests run with the test profile, and that we should do the same when testing locally?

@fasterius
Contributor

> @nictru I ran into a small problem while gathering test snapshots. I was running nf-test with the command nf-test test tests/main_pipeline_reference_mapping.nf.test --profile docker --update-snapshot --verbose, but for some reason singler does not run at all during these tests (no outputs and no versions in the snapshot).

When you run a specific test, use --profile +docker. Notice the plus sign: it adds the given profile (docker, here) to the default, which is the test profile. The idea is that the test profile should always be included when running the tests, and you add whatever environment profile you want on top of it. I'm not sure this explains the entire problem you described, but start there and see how it goes.
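The difference between the two forms can be modelled in a few lines. This is only an illustrative model of the behaviour described above, not nf-test's implementation; the function name effective_profiles is hypothetical, and the default profile is assumed to be test, as stated in the comment.

```python
from typing import Optional


def effective_profiles(config_profile: str, cli_profile: Optional[str]) -> str:
    """Model how a --profile value combines with the default profile."""
    if not cli_profile:
        # No --profile given: only the default profile is used.
        return config_profile
    if cli_profile.startswith("+"):
        # Leading plus: append to the default instead of replacing it.
        return f"{config_profile},{cli_profile[1:]}"
    # Plain value: the default profile is replaced entirely.
    return cli_profile


print(effective_profiles("test", "docker"))   # the test profile is dropped
print(effective_profiles("test", "+docker"))  # docker is added on top of test
```

In this model, --profile docker leaves the run without the test profile, which would be consistent with singler not being configured to run in the snapshot described above.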

@tlebchan tlebchan force-pushed the nfch2026-ru-team-cell-cycle branch from 0483e27 to 4372914 Compare March 14, 2026 16:24
@tlebchan tlebchan linked an issue Mar 14, 2026 that may be closed by this pull request
@tlebchan tlebchan self-assigned this Mar 14, 2026
@nictru nictru merged commit 9be28d1 into nf-core:dev Mar 14, 2026
57 checks passed

Development

Successfully merging this pull request may close these issues.

Implement cell cycle analysis
