
feat(template): add axis-decomposed process_cpus_* / process_mem_* labels #4265

Open
pinin4fjords wants to merge 8 commits into nf-core:dev from pinin4fjords:pinin4fjords/process-axis-labels

Conversation

@pinin4fjords
Member

@pinin4fjords pinin4fjords commented May 6, 2026

Summary

Template change for nf-core/proposals#139 (approved Stage 2, 10/10) and the corresponding Stage 3 RFC at nf-core/proposals#140. Companion docs PR: nf-core/website#4212.

Introduces axis-decomposed resource labels alongside the existing bundled ones:

  • process_cpus_{single,low,medium,high} - cpus only
  • process_mem_{low,medium,high} - memory only
  • process_time_{short,medium,long} - time only

A process stacks one label from each axis to express its resource shape independently. For example, a cpu-bound, memory-light, fast tool uses process_cpus_high + process_mem_low + process_time_short.
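
A minimal sketch of what stacking could look like in a module, assuming the three labels from this PR are defined in base.config. The process name, the input/output names and the `some_tool` command are hypothetical placeholders, not taken from any real module:

```groovy
// Hypothetical module, for illustration only.
process FAST_STREAMING_TOOL {
    label 'process_cpus_high'   // sets cpus only
    label 'process_mem_low'     // sets memory only
    label 'process_time_short'  // sets time only

    input:
    path reads

    output:
    path 'out.txt'

    script:
    """
    some_tool --threads ${task.cpus} ${reads} > out.txt
    """
}
```

Each label matches its own withLabel selector and each selector sets a single directive, so the three settings do not override one another.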

The bundled labels (process_single, process_low, process_medium, process_high, process_long, process_high_memory, process_low_memory) stay in place for backwards compatibility. Per the RFC, deprecation follows in a later PR once nf-core/modules and the institutional configs have migrated.

This supersedes the short-term process_low_memory workaround merged in #4264.

Motivation

Bundled labels couple cpus, memory and time at fixed ratios (e.g. process_high pins all three at 12 cpus / 72 GB / 16 h). That is the wrong shape for cpu-bound, memory-light tools (e.g. Rust streaming binaries: trim_galore 2.x peaks at ~100 MB on 30M paired-end reads in ~1.5 min) and for memory-hungry single-threaded ones. Splitting the axes lets module authors pick each independently and stops over-allocating on one axis just to get headroom on another. Full rationale and the agreed migration path live in the RFC.

Values

withLabel:process_cpus_single  { cpus   = { 1                    } }
withLabel:process_cpus_low     { cpus   = { 2     * task.attempt } }
withLabel:process_cpus_medium  { cpus   = { 6     * task.attempt } }
withLabel:process_cpus_high    { cpus   = { 12    * task.attempt } }
withLabel:process_mem_low      { memory = { 1.GB  * task.attempt } }
withLabel:process_mem_medium   { memory = { 12.GB * task.attempt } }
withLabel:process_mem_high     { memory = { 72.GB * task.attempt } }
withLabel:process_time_short   { time   = { 1.h   * task.attempt } }
withLabel:process_time_medium  { time   = { 8.h   * task.attempt } }
withLabel:process_time_long    { time   = { 20.h  * task.attempt } }

  • CPU values are 1:1 with the existing bundled-label cpus tiers.
  • Memory covers a wide range, with mem_low set below the process_single memory request (1 GB rather than 6 GB) so streaming tools have a meaningful low-memory bucket.
  • Time uses 1.h / 8.h / 20.h. short = 1.h is a meaningful step down from the 4 h template default; medium = 8.h and long = 20.h match the existing process_medium and process_long values exactly so migrating tools don't change their time budget.
  • process_high_memory (200 GB) is not folded in by this PR. It stays as a separate niche label until we decide whether a mem_xl tier is warranted (open question in the RFC).
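
As a rough worked example of the retry scaling: if a task carrying process_cpus_high, process_mem_low and process_time_short is retried once (task.attempt is 2), its requests double to 24 cpus, 2 GB and 2 h, subject to any resourceLimits in effect. Because each axis has its own selector, a site or pipeline config could also retune a single axis without touching the other two. The sketch below assumes such an override; the 2.GB figure is purely illustrative and is not a value proposed by this PR:

```groovy
// Hypothetical custom config: raise only the low-memory tier.
// cpus and time for the same tasks still come from their own axis labels.
process {
    withLabel:process_mem_low {
        // attempt 1 -> 2 GB, attempt 2 -> 4 GB
        memory = { 2.GB * task.attempt }
    }
}
```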

Changes

  • nf_core/pipeline-template/conf/base.config - new axis labels added above the bundled block, with a header comment marking them preferred and the bundled block marked as pending deprecation.
  • nf_core/modules/lint/main_nf.py - extend correct_process_labels to accept both new and bundled labels.
  • nf_core/components/create.py - extend the nf-core modules create autocomplete with the new labels.
  • CHANGELOG.md - entries under Linting and Template.

Out of scope (tracked in the RFC)

  • Deprecation lint warning and eventual removal of bundled labels.
  • Three-question dialogue in nf-core modules create (currently still single-label autocomplete).
  • Bulk migration in nf-core/modules and nf-core/configs.

…bels

Introduces a parallel set of resource labels that decompose along the
cpu and memory axes, so a process can express its shape independently:

  process_cpus_{single,low,medium,high}  - cpus only
  process_mem_{low,medium,high}          - memory only

Stack one of each (and optionally a time-only label like process_long)
on a process. The combined labels (process_single, process_low,
process_medium, process_high, process_high_memory) stay in place for
backwards compatibility while the ecosystem migrates; deprecation will
follow in a separate PR once enough downstream pipelines / configs have
adopted the new scheme.

Why: the existing combined labels couple cpus and memory at fixed
ratios (e.g. process_high = 12 cpus + 72 GB), which is wrong for tools
that are cpu-bound but memory-light (Rust streaming binaries, e.g.
trim_galore 2.x at ~100 MB peak_rss) or vice versa. Splitting the axes
lets module authors pick each independently and stops over-allocating
on one axis to get headroom on the other.
@codecov

codecov Bot commented May 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 78.32%. Comparing base (4c78254) to head (44476ca).


Completes the axis-decomposed scheme: alongside cpus_* and mem_*, a
process can now also pin its time budget independently. Values:

  process_time_short  = 1.h  * task.attempt
  process_time_medium = 8.h  * task.attempt
  process_time_long   = 20.h * task.attempt

short = 1.h is a meaningful step down from the 4.h template default
for fast tools (streaming utilities, QC, samtools view), and lets
schedulers route them to short-queue priority pools. medium / long
match the existing process_medium and process_long values exactly,
so module authors migrating from those don't change their effective
time budget.
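
As an illustration of the routing point, a site config could key a queue off the time label. This is only a sketch: the queue name `short` is hypothetical and depends entirely on the local scheduler setup.

```groovy
// Hypothetical institutional config: send short-lived tasks to a fast queue.
process {
    withLabel:process_time_short {
        queue = 'short'   // assumed queue name, site-specific
    }
}
```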

The existing process_long label stays in the legacy block; it has
the same value as process_time_long but removing it would break
pipelines, so it deprecates on the same slow-deprecation timeline
as the other combined labels.