
Reduce container overhead #267

@kim-fehl

Description of the bug

Currently the pipeline uses more than 20 container images, totaling ~60 GB of disk usage. I remember that for nf-core modules the recommended principle is "one tool – one container", but here we mostly have local modules, and I think some of the redundancy (caused by incremental development) can be reduced.

| IMAGE | IMAGE ID | DISK USAGE | CONTENT SIZE |
| --- | --- | --- | --- |
| SEQERA: anndata2ri_bioconductor-singlecellexperiment_anndata_r-seurat:5fae42aabf7a1c5f | 291a48658716 | 3.38GB | 846MB |
| SEQERA: anndata:0.10.9--1eab54e300e1e584 | 3471efcc8b48 | 936MB | 231MB |
| SEQERA: anndata_pyyaml:82c6914e861435f7 | 7946ce8a97db | 1.06GB | 261MB |
| SEQERA: anndata_upsetplot:784e0f450da10178 | 766c2dff4b54 | 1.18GB | 306MB |
| SEQERA: bbknn_pyyaml_scanpy:4cf2984722da607f | 453b5e1f8972 | 1.6GB | 382MB |
| SEQERA: bioconductor-celldex_bioconductor-hdf5array_bioconductor-singlecellexperiment_r-yaml:13bf33457e3e7490 | fdb9aa052292 | 2.33GB | 634MB |
| SEQERA: celltypist_scanpy:44b604b24dd4cf33 | bfe009b0a96c | 1.78GB | 431MB |
| SEQERA: harmonypy_pyyaml_scanpy:f6cc57196369fb1e | 0c62b23a31d6 | 1.63GB | 392MB |
| SEQERA: leidenalg_python-igraph_pyyaml_scanpy:4936fa196b5f4340 | 8644a451da2a | 1.66GB | 401MB |
| SEQERA: liana_pyyaml:776fdd7103df146d | 131e6bd9dccb | 2.24GB | 507MB |
| SEQERA: multiqc:1.33--ee7739d47738383b | abd5751768f8 | 2.01GB | 432MB |
| SEQERA: pandas:2.2.3--9b034ee33172d809 | 50da2ef5f060 | 765MB | 190MB |
| SEQERA: python-igraph_pyyaml_scanpy:cc0304f4731f72f9 | 8f65ff8a2191 | 1.66GB | 401MB |
| SEQERA: python_pyyaml_scanpy:b5509a698e9aae25 | e0dac9eda4d7 | 1.85GB | 461MB |
| SEQERA: python_pyyaml_scanpy_scikit-image:750e7b74b6d036e4 | e2816307a73f | 2.04GB | 509MB |
| SEQERA: pyyaml_scanpy:3c9e9f631f45553d | 7ed2839670f9 | 1.63GB | 392MB |
| SEQERA: pyyaml_scanpy:a3a797e09552fddc | 228c2994c5f4 | 1.86GB | 466MB |
| SEQERA: scanpy_upsetplot:1ce883f3ff369ca8 | a91e0a660553 | 1.67GB | 414MB |
| SEQERA: scvi-tools:1.3.3--df115aabdccb7d6b | 551e3b44c383 | 4.66GB | 1.08GB |
| SEQERA: scvi-tools:1.4.1--47f5b0e6b70fd131 | 0ac460cb48b1 | 3.47GB | 797MB |
| nicotru/celda:1d48a68e9d534b2b | 3a4f38d26238 | 2.95GB | 759MB |
| nicotru/scds:7788dbeb87bc7eec | e6aac618e327 | 2.48GB | 651MB |
| nicotru/seurat:b3b12d17271014d9 | 22f891364efc | 3.35GB | 853MB |
| nicotru/soupx:f6297681695fbfcf | 222d79287a15 | 2.82GB | 700MB |
| saditya88/singler:0.0.1 | cb267ab7d826 | 9.13GB | 2.64GB |
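A quick sanity check of the total: summing the DISK USAGE column of the table above confirms the ~60 GB figure (the list of sizes is copied verbatim from the table):

```python
# Sum the DISK USAGE column (in GB) from the table above.
sizes_gb = [
    3.38, 0.936, 1.06, 1.18, 1.6, 2.33, 1.78, 1.63, 1.66, 2.24,
    2.01, 0.765, 1.66, 1.85, 2.04, 1.63, 1.86, 1.67, 4.66, 3.47,
    2.95, 2.48, 3.35, 2.82, 9.13,
]
total = sum(sizes_gb)
print(f"{len(sizes_gb)} images, {total:.1f} GB total")  # 25 images, 60.1 GB total
```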

(This issue has been brought to my attention as I rent the server and also pay for disk space 😃)

I asked Codex to analyze the repo structure and find ways to optimize container usage, without touching nf-core/modules and accounting for the Python version pinning you mentioned. Here's the output:

Implementation Plan

  • Use nf-core module containers as the canonical baseline for overlapping local tool families.
    • Align local SCVITOOLS_SCVI and SCVITOOLS_SCANVI to the same scvi-tools=1.3.3 container/env family already used by vendored SCVITOOLS_SOLO and SCVITOOLS_SCAR.
  • Collapse the local generic scanpy 1.11.5 / 1.11.2 split onto one pinned local baseline that is compatible with the nf-core scrublet stack.
    • Standardize local scanpy modules that only need core scanpy functionality on python=3.12.11, scanpy=1.11.2, pyyaml=6.0.2.
    • Apply the same base version to additive local scanpy envs (neighbors, paga, leiden, harmony, bbknn) while keeping their extra packages.
  • Collapse the local upsetplot fork.
    • Change ADATA_UPSETGENES to use anndata directly instead of scanpy for reading .h5ad.
    • Move ADATA_UPSETGENES and DOUBLET_REMOVAL onto one shared pinned local env: python=3.12.11, anndata=0.12.7, upsetplot=0.9.0.
  • Replace docker.io/saditya88/singler:0.0.1 with a local Dockerfile built from a minimal R/Wave-compatible base containing the actual R dependencies used by singleR.R, including bioconductor-hdf5array and anndataR.
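The shared scanpy baseline from the plan above could live in a single conda env reused by all the local scanpy modules, with the additive modules (leiden, harmony, bbknn, etc.) appending their extra packages on top. A sketch only — the channel layout is an assumption, the pins come straight from the plan:

```yaml
# Hypothetical shared environment.yml baseline for local scanpy modules.
channels:
  - conda-forge
  - bioconda
dependencies:
  - python=3.12.11
  - scanpy=1.11.2
  - pyyaml=6.0.2
```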

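For the singleR replacement, a minimal Dockerfile could look like the sketch below. The package names are assumptions to be verified against what singleR.R actually loads, and whether anndataR is installable from a conda channel (versus needing a GitHub install) should be checked:

```dockerfile
# Hypothetical minimal replacement for docker.io/saditya88/singler:0.0.1.
# Derive the real package list from the imports in singleR.R.
FROM condaforge/miniforge3:latest
RUN conda install -y -c conda-forge -c bioconda \
        bioconductor-singler \
        bioconductor-hdf5array \
        r-yaml \
        r-remotes \
    && conda clean -afy
# anndataR may not be packaged on bioconda; if so, install it from GitHub:
# RUN R -e 'remotes::install_github("scverse/anndataR")'
```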
Test Plan

  • Run nf-tests for affected local modules and subworkflows:
    • integrate
    • quality_control
    • doublet_detection
    • celltype_assignment
    • affected local modules under scanpy, scvitools, adata/upsetgenes, doublet_detection/doublet_removal, and celltypes/singler
  • Run pipeline tests with -profile test,docker and -profile test_full,docker.

Does this plan make sense?
You also mentioned that the private Docker Hub images can be replaced with Seqera ones, but Codex thought that was too much for one conservative pass :)

