split outputs #11054

Draft
suzannejin wants to merge 8 commits into master from try-new-output-structure
@suzannejin
Contributor

PR checklist

Closes #XXX

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the module conventions in the contribution docs?
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Broadcast software version numbers to topic: versions - See version_topics
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

Copilot AI review requested due to automatic review settings March 26, 2026 11:49
@suzannejin suzannejin marked this pull request as draft March 26, 2026 11:49
Contributor

Copilot AI left a comment

Pull request overview

This PR updates the DIFFERENTIAL_FUNCTIONAL_ENRICHMENT subworkflow to split/reshape enrichment outputs into more granular emitted channels (rather than large “artifacts” mixes), while also exposing some generic outputs.

Changes:

  • Refactors emitted outputs for g:Profiler2 and GSEA into multiple, more specific channels.
  • Adds “generic” emitted channels for rds and session_info.
  • Refactors version collection wiring to align with the new output structure.

.join(GSEA_GSEA.out.gene_set_html, remainder: true)
.join(GSEA_GSEA.out.gene_set_heatmap, remainder: true)
.join(GSEA_GSEA.out.gene_set_enplot, remainder: true)
.join(GSEA_GSEA.out.gene_set_dist, remainder: true)

Copilot AI Mar 26, 2026

gsea_gene_set_info joins multiple optional outputs with remainder: true and is emitted as-is. This will produce tuples containing null entries for missing optional files (tsv/html/png/enplot/dist), which is error-prone for consumers and awkward to document. Consider mapping to a list of present files (dropping nulls) or splitting these into separate emitted outputs.

Suggested change
- .join(GSEA_GSEA.out.gene_set_dist, remainder: true)
+ .join(GSEA_GSEA.out.gene_set_dist, remainder: true)
+ .map { row ->
+     /*
+      * row is a tuple of the form:
+      * [meta, gene_set_sizes, gene_set_tsv?, gene_set_html?,
+      *  gene_set_heatmap?, gene_set_enplot?, gene_set_dist?]
+      * where optional entries may be null due to `remainder: true`.
+      * Keep meta as-is and collect only non-null file outputs.
+      */
+     // Groovy multiple assignment has no rest/spread form, so split the row explicitly.
+     def meta = row.head()
+     def files = row.tail()
+     def present_files = files.findAll { it != null }
+     tuple(meta, present_files)
+ }

Copilot uses AI. Check for mistakes.
Comment on lines 163 to +205
  // gprofiler2-specific outputs
- gprofiler2_plot_html = GPROFILER2_GOST.out.plot_html
- gprofiler2_all_enrich = GPROFILER2_GOST.out.all_enrich
- gprofiler2_sub_enrich = GPROFILER2_GOST.out.sub_enrich
- gprofiler2_artifacts = GPROFILER2_GOST.out.plot_png
-     .mix(GPROFILER2_GOST.out.sub_plot)
-     .mix(GPROFILER2_GOST.out.rds)
-     .mix(GPROFILER2_GOST.out.filtered_gmt)
-     .mix(GPROFILER2_GOST.out.session_info)
+ gprofiler2_html = GPROFILER2_GOST.out.plot_html
+ gprofiler2_enrich_tsv = GPROFILER2_GOST.out.all_enrich
+     .join(GPROFILER2_GOST.out.sub_enrich, remainder: true)
+ gprofiler2_enrich_png = GPROFILER2_GOST.out.plot_png
+     .join(GPROFILER2_GOST.out.sub_plot, remainder: true)
+ gprofiler2_filtered_gmt = GPROFILER2_GOST.out.filtered_gmt

  // gsea-specific outputs
- gsea_report = GSEA_GSEA.out.report_tsvs_ref.join(GSEA_GSEA.out.report_tsvs_target)
- gsea_artifacts = GSEA_GSEA.out.rpt
-     .mix(GSEA_GSEA.out.index_html)
-     .mix(GSEA_GSEA.out.heat_map_corr_plot)
-     .mix(GSEA_GSEA.out.report_tsvs_ref)
-     .mix(GSEA_GSEA.out.report_htmls_ref)
-     .mix(GSEA_GSEA.out.report_tsvs_target)
-     .mix(GSEA_GSEA.out.report_htmls_target)
-     .mix(GSEA_GSEA.out.ranked_gene_list)
-     .mix(GSEA_GSEA.out.gene_set_sizes)
-     .mix(GSEA_GSEA.out.histogram)
-     .mix(GSEA_GSEA.out.heatmap)
-     .mix(GSEA_GSEA.out.pvalues_vs_nes_plot)
-     .mix(GSEA_GSEA.out.ranked_list_corr)
-     .mix(GSEA_GSEA.out.butterfly_plot)
-     .mix(GSEA_GSEA.out.gene_set_tsv)
-     .mix(GSEA_GSEA.out.gene_set_html)
-     .mix(GSEA_GSEA.out.gene_set_heatmap)
-     .mix(GSEA_GSEA.out.snapshot)
-     .mix(GSEA_GSEA.out.gene_set_enplot)
-     .mix(GSEA_GSEA.out.gene_set_dist)
-     .mix(GSEA_GSEA.out.archive)
+ gsea_report_tsv = GSEA_GSEA.out.report_tsvs_ref
+     .join(GSEA_GSEA.out.report_tsvs_target)
+ gsea_html = GSEA_GSEA.out.report_htmls_ref
+     .join(GSEA_GSEA.out.report_htmls_target)
+     .join(GSEA_GSEA.out.index_html)
+     .join(GSEA_GSEA.out.heat_map_corr_plot)
+     .join(GSEA_GSEA.out.snapshot, remainder: true)
+ gsea_plots = GSEA_GSEA.out.histogram
+     .join(GSEA_GSEA.out.heatmap)
+     .join(GSEA_GSEA.out.pvalues_vs_nes_plot)
+     .join(GSEA_GSEA.out.ranked_list_corr)
+     .join(GSEA_GSEA.out.butterfly_plot, remainder: true)
+ gsea_ranked_gene_list = GSEA_GSEA.out.ranked_gene_list
+ gsea_gene_set_info = GSEA_GSEA.out.gene_set_sizes
+     .join(GSEA_GSEA.out.gene_set_tsv, remainder: true)
+     .join(GSEA_GSEA.out.gene_set_html, remainder: true)
+     .join(GSEA_GSEA.out.gene_set_heatmap, remainder: true)
+     .join(GSEA_GSEA.out.gene_set_enplot, remainder: true)
+     .join(GSEA_GSEA.out.gene_set_dist, remainder: true)
+ gsea_archive = GSEA_GSEA.out.archive
+ gsea_rpt = GSEA_GSEA.out.rpt

  // decoupler-specific outputs
  decoupler_dc_estimate = DECOUPLER_DECOUPLER.out.dc_estimate
  decoupler_dc_pvals = DECOUPLER_DECOUPLER.out.dc_pvals
  decoupler_png = DECOUPLER_DECOUPLER.out.png

  // grea-specific outputs
  grea_results = PROPR_GREA.out.results

- // tool versions
+ // generic outputs
+ rds = ch_rds
+ session_info = ch_session_info
  versions = ch_versions

Copilot AI Mar 26, 2026

The emit: outputs were renamed/restructured here (e.g., gprofiler2_plot_html -> gprofiler2_html, gsea_report/gsea_artifacts -> gsea_report_tsv/gsea_html/gsea_plots/etc., and new generic rds/session_info). However, subworkflows/nf-core/differential_functional_enrichment/meta.yml and the nf-test snapshots still describe/assert the old output names and shapes, so linting/tests will fail until they are updated to match the new emit: interface.

Comment on lines +165 to +168
gprofiler2_enrich_tsv = GPROFILER2_GOST.out.all_enrich
.join(GPROFILER2_GOST.out.sub_enrich, remainder: true)
gprofiler2_enrich_png = GPROFILER2_GOST.out.plot_png
.join(GPROFILER2_GOST.out.sub_plot, remainder: true)

Copilot AI Mar 26, 2026

These join(..., remainder: true) results are emitted directly. When the optional RHS outputs are absent (e.g. sub_enrich / sub_plot), Nextflow join will emit tuples containing null placeholders, which can break downstream consumers that assume only file paths. Consider mapping the joined tuples to remove nulls (e.g., emitting a list of existing files) or keep optional outputs as separate emitted channels.
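
As a rough illustration of the "remove nulls" option, the sketch below uses plain Groovy lists to stand in for the tuples that a `join(..., remainder: true)` would emit. The helper name `dropMissing`, the sample ids, and the file names are all hypothetical and not part of this PR. It assumes the join key (`meta`) is always the first element of the row.

```groovy
// Hypothetical helper (not part of the PR): collapse a joined row into
// [meta, files], dropping the null placeholders left by `remainder: true`.
def dropMissing = { List row ->
    def meta  = row.head()                         // first element is the join key
    def files = row.tail().findAll { it != null }  // keep only entries that are present
    [meta, files]
}

// Simulated rows shaped like all_enrich.join(sub_enrich, remainder: true).
// The ids and file names are made up for illustration.
def withBoth   = [[id: 'contrast_a'], 'a.all_enrich.tsv', 'a.sub_enrich.tsv']
def withoutSub = [[id: 'contrast_b'], 'b.all_enrich.tsv', null]

assert dropMissing(withBoth)   == [[id: 'contrast_a'], ['a.all_enrich.tsv', 'a.sub_enrich.tsv']]
assert dropMissing(withoutSub) == [[id: 'contrast_b'], ['b.all_enrich.tsv']]
```

In the subworkflow, a closure like this could presumably be applied with `.map { row -> dropMissing(row) }` after the last `join`, so that every emitted item has the same `[meta, files]` shape whether or not the optional files were produced.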

Comment on lines +178 to +183
.join(GSEA_GSEA.out.snapshot, remainder: true)
gsea_plots = GSEA_GSEA.out.histogram
.join(GSEA_GSEA.out.heatmap)
.join(GSEA_GSEA.out.pvalues_vs_nes_plot)
.join(GSEA_GSEA.out.ranked_list_corr)
.join(GSEA_GSEA.out.butterfly_plot, remainder: true)

Copilot AI Mar 26, 2026

gsea_html / gsea_plots are built by chaining join() calls and then emitted. Because snapshot and butterfly_plot are optional, the remainder: true joins will introduce null placeholders in the emitted tuples when those files are not produced, which can cause downstream .map { meta, file -> ... } / file() usage to fail. Consider normalizing these outputs (e.g. emit [meta, List<Path>] with nulls removed, or emit optional files on their own channels).

Suggested change
-     .join(GSEA_GSEA.out.snapshot, remainder: true)
- gsea_plots = GSEA_GSEA.out.histogram
-     .join(GSEA_GSEA.out.heatmap)
-     .join(GSEA_GSEA.out.pvalues_vs_nes_plot)
-     .join(GSEA_GSEA.out.ranked_list_corr)
-     .join(GSEA_GSEA.out.butterfly_plot, remainder: true)
+     .join(GSEA_GSEA.out.snapshot, remainder: true)
+     .map { meta, report_ref, report_target, index_html, heatmap_corr, snapshot_opt ->
+         def files = [report_ref, report_target, index_html, heatmap_corr, snapshot_opt].findAll { it != null }
+         [meta, files]
+     }
+ gsea_plots = GSEA_GSEA.out.histogram
+     .join(GSEA_GSEA.out.heatmap)
+     .join(GSEA_GSEA.out.pvalues_vs_nes_plot)
+     .join(GSEA_GSEA.out.ranked_list_corr)
+     .join(GSEA_GSEA.out.butterfly_plot, remainder: true)
+     .map { meta, histogram, heatmap, pvalues_vs_nes_plot, ranked_list_corr, butterfly_opt ->
+         def files = [histogram, heatmap, pvalues_vs_nes_plot, ranked_list_corr, butterfly_opt].findAll { it != null }
+         [meta, files]
+     }

Base automatically changed from update-for-workflow-outputs to master March 26, 2026 16:49