
Conversation

@ypriverol (Member) commented Dec 20, 2025

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the bigbio/quantms branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core pipelines lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Summary by CodeRabbit

  • New Features

    • MS2Features rescoring framework with multi-model support and a new fine‑tuning process.
    • Onsite phosphorylation localization replaces previous scorer.
    • Native DIA (.dia) passthrough, Thermo RAW conversion module, and a new MultiQC module with tests.
    • Added default TMT18plex correction matrix asset.
  • Updates

    • Nextflow requirement bumped to 25.04.0; CI workflows and action runtimes updated.
    • Docs: supported mass‑spec file formats, examples, and usage clarified.
    • Input schema: added required "assay name" field.
  • Other

    • Devcontainer setup script added; editorconfig removed.


daichengxin and others added 30 commits August 14, 2025 18:12
fixing of bug in msstats_tmt.R (Issue: Error in the msstats_tmt.R function parse_contrasts #577)
Pass correct flag to ProteinQuantifier
fixed rescoring range bugs
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
modules.json (1)

22-26: Fix invalid utils_nextflow_pipeline SHA.

The SHA 05954dab2ff481bcb999f24455da29a5828af08d does not exist in the nf-core/modules repository. Update it to the latest master commit: cff071cc0e1d21a0b7e06808599d8489fbeab024.

🧹 Nitpick comments (1)
modules/bigbio/thermorawfileparser/main.nf (1)

12-28: Verify necessity of complex stageInMode retry logic.

The stageInMode configuration implements a sophisticated three-tier retry strategy with executor-specific logic. While the approach is technically sound (progressively falling back from fast symlinks to reliable copies), this level of complexity is unusual for nf-core modules.

Please confirm:

  1. Are there known issues with RAW file staging that necessitate this retry strategy?
  2. Has this logic been tested across different executors (local, awsbatch, slurm, etc.)?
  3. Could this be simplified or documented in the meta.yml to explain why it's needed?

The error_retry label (line 5) suggests retries are expected, but documenting the specific file staging challenges would help maintainers understand the rationale.
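
For reference, a rough sketch of the kind of per-attempt staging fallback described above; this is not the module's actual code, the selector name and retry counts are illustrative, and it assumes stageInMode accepts a dynamic closure here:

```groovy
// Hypothetical nextflow.config sketch: progressively fall back from fast symlinks
// to hard links, then to a full copy of the RAW file on later retry attempts.
process {
    withName: 'THERMORAWFILEPARSER' {
        errorStrategy = 'retry'
        maxRetries    = 2
        stageInMode   = { task.attempt == 1 ? 'symlink'
                        : task.attempt == 2 ? 'link'
                        : 'copy' }
    }
}
```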

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d9d253 and a7bd119.

⛔ Files ignored due to path filters (1)
  • modules/bigbio/thermorawfileparser/tests/main.nf.test.snap is excluded by !**/*.snap
📒 Files selected for processing (6)
  • modules.json
  • modules/bigbio/thermorawfileparser/environment.yml
  • modules/bigbio/thermorawfileparser/main.nf
  • modules/bigbio/thermorawfileparser/meta.yml
  • modules/bigbio/thermorawfileparser/tests/main.nf.test
  • modules/bigbio/thermorawfileparser/tests/nextflow.config
✅ Files skipped from review due to trivial changes (1)
  • modules/bigbio/thermorawfileparser/environment.yml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-14T09:53:36.777Z
Learnt from: enryH
Repo: bigbio/quantms PR: 574
File: .github/actions/get-shards/action.yml:31-41
Timestamp: 2025-08-14T09:53:36.777Z
Learning: The .github/actions/get-shards/ and .github/actions/nf-test/ actions in quantms are part of the nf-core template (introduced in template version 3.3.2) for nf-test framework integration, not custom pipeline code.

Applied to files:

  • modules.json
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Docker Tests (latest-everything, test_localize)
  • GitHub Check: Docker Tests (25.04.0, test_tmt_corr)
  • GitHub Check: Docker Tests (25.04.0, test_dia)
  • GitHub Check: Docker Tests (latest-everything, test_tmt_corr)
  • GitHub Check: Docker Tests (latest-everything, test_lfq)
  • GitHub Check: Docker Tests (25.04.0, test_localize)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_alphapeptdeep)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (latest-everything, test_tmt)
  • GitHub Check: Docker Tests (25.04.0, test_lfq)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (25.04.0, test_tmt)
  • GitHub Check: Docker Tests (latest-everything, test_dia)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_alphapeptdeep)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_ms2pip, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_lfq, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_alphapeptdeep, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_tmt, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_localize, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dia, docker)
🔇 Additional comments (4)
modules/bigbio/thermorawfileparser/tests/nextflow.config (1)

1-3: LGTM!

The test configuration follows standard nf-core conventions for organizing test outputs by process name.

modules.json (1)

8-17: Module SHAs verified as valid.

Both custom/dumpsoftwareversions and multiqc git SHAs resolve successfully in the nf-core/modules repository and represent recent, legitimate commits (November 2025). No compatibility issues identified with Nextflow 25.04.0.

modules/bigbio/thermorawfileparser/main.nf (1)

37-49: Code looks good—ThermoRawFileParser.sh --version is documented and supported.

The --version flag is officially documented in the CompOmics GitHub repository and will output a version string suitable for the versions.yml file. The script structure is correct.

modules/bigbio/thermorawfileparser/tests/main.nf.test (1)

10-29: No changes needed—code behavior is correct and intentional.

The test correctly demonstrates the process's dual-mode operation:

  1. Script mode (test 1): ThermoRawFileParser.sh determines the output filename based on the RAW file content, independently of meta.mzml_id (which serves as metadata for tracking).
  2. Stub mode (test 2): Uses meta.mzml_id as the output filename prefix for synthetic testing without executing the actual tool.

The checkIfExists: false is appropriate for nf-test's file staging mechanism.

Comment on lines +51 to +66
stub:
def prefix = task.ext.prefix ?: "${meta.mzml_id}"
def args = task.ext.args ?: ''
// Determine output format from args, default to mzML
// Format 0 = MGF, formats 1-2 = mzML, format 3 = Parquet, format 4 = None
def outputExt = (args =~ /-f=0\b/).find() ? 'mgf' : 'mzML'

"""
touch '${prefix}.${outputExt}'
touch '${prefix}_conversion.log'
cat <<-END_VERSIONS > versions.yml
"${task.process}":
ThermoRawFileParser: \$(ThermoRawFileParser.sh --version)
END_VERSIONS
"""

⚠️ Potential issue | 🟡 Minor

Stub mode doesn't handle all format options.

The stub block only distinguishes between MGF (-f=0) and mzML formats (lines 54-56), defaulting everything else to mzML. This means Parquet format (-f=3) will incorrectly produce a .mzML file in stub mode.

While stub mode is primarily for testing workflow structure rather than actual data processing, this inconsistency could cause confusion during testing.

🔎 Proposed enhancement to handle all format options
     def prefix = task.ext.prefix ?: "${meta.mzml_id}"
     def args = task.ext.args ?: ''
-    // Determine output format from args, default to mzML
-    // Format 0 = MGF, formats 1-2 = mzML, format 3 = Parquet, format 4 = None
-    def outputExt = (args =~ /-f=0\b/).find() ? 'mgf' : 'mzML'
+    // Determine output format from args, default to indexed mzML
+    // Format 0 = MGF, formats 1-2 = mzML, format 3 = Parquet, format 4 = None
+    def outputExt = 'mzML'  // default for -f=1, -f=2, or not specified
+    if ((args =~ /-f=0\b/).find()) {
+        outputExt = 'mgf'
+    } else if ((args =~ /-f=3\b/).find()) {
+        outputExt = 'parquet'
+    }

     """
     touch '${prefix}.${outputExt}'
     touch '${prefix}_conversion.log'
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

Before:
    stub:
    def prefix = task.ext.prefix ?: "${meta.mzml_id}"
    def args = task.ext.args ?: ''
    // Determine output format from args, default to mzML
    // Format 0 = MGF, formats 1-2 = mzML, format 3 = Parquet, format 4 = None
    def outputExt = (args =~ /-f=0\b/).find() ? 'mgf' : 'mzML'

    """
    touch '${prefix}.${outputExt}'
    touch '${prefix}_conversion.log'
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        ThermoRawFileParser: \$(ThermoRawFileParser.sh --version)
    END_VERSIONS
    """

After:
    stub:
    def prefix = task.ext.prefix ?: "${meta.mzml_id}"
    def args = task.ext.args ?: ''
    // Determine output format from args, default to indexed mzML
    // Format 0 = MGF, formats 1-2 = mzML, format 3 = Parquet, format 4 = None
    def outputExt = 'mzML' // default for -f=1, -f=2, or not specified
    if ((args =~ /-f=0\b/).find()) {
        outputExt = 'mgf'
    } else if ((args =~ /-f=3\b/).find()) {
        outputExt = 'parquet'
    }

    """
    touch '${prefix}.${outputExt}'
    touch '${prefix}_conversion.log'
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        ThermoRawFileParser: \$(ThermoRawFileParser.sh --version)
    END_VERSIONS
    """
🤖 Prompt for AI Agents
In modules/bigbio/thermorawfileparser/main.nf around lines 51 to 66, the stub
currently maps only -f=0 to 'mgf' and everything else to 'mzML', which
misreports formats like -f=3 (Parquet) and -f=4 (None); update the stub to parse
the -f= value (defaulting to mzML when absent), map 0 => 'mgf', 1 or 2 =>
'mzML', 3 => 'parquet', and 4 => no output (skip creating a data file), then
adjust the touch commands to create the correct "<prefix>.<ext>" only when an
extension is set while still always creating the conversion log and
versions.yml.

Comment on lines +12 to +14
Use `task.ext.args` to pass additional arguments, e.g.:
- `-f=0` for MGF output, `-f=1` for mzML, `-f=2` for indexed mzML (default), `-f=3` for Parquet, `-f=4` for None
- `-L` or `--msLevel=VALUE` to select MS levels (e.g., `-L=1,2` or `--msLevel=1-3`)

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🌐 Web query:

ThermoRawFileParser -f=4 format option None documentation

💡 Result:

According to the official ThermoRawFileParser docs/README, the -f/--format options are:

  • 0 = MGF
  • 1 = mzML
  • 2 = indexed mzML
  • 3 = Parquet
  • 4 = None (no spectra output). [1][2]

Note: some third‑party wrappers/tools have reported different enumerations (e.g., an ursgal wrapper listing 4 as "MGF with profile data excluded"), so check the specific wrapper/version you use if behavior differs. [3]

Sources: ThermoRawFileParser README (GitHub) and project docs; ursgal wrapper docs. [1][2][3]


Expand format option documentation for clarity.

The documentation lists -f=4 for "None" output format, which is unclear. According to ThermoRawFileParser's official documentation, this option means "no spectra output." Consider expanding this in the meta.yml to -f=4 for None (no spectra output) to make the purpose and behavior explicit to users.

🤖 Prompt for AI Agents
In modules/bigbio/thermorawfileparser/meta.yml around lines 12-14, the
documentation for the `-f=4` format option is ambiguous; update the text to
explicitly state that `-f=4` means "None (no spectra output)" so users
understand this suppresses spectra export, and keep the examples for other
formats and msLevel flags unchanged.

Comment on lines +31 to +52
test("Should run stub mode") {

options "-stub"

when {
process {
"""
input[0] = [
[ id: 'test', mzml_id: 'test_sample' ],
file(params.test_data['proteomics']['msspectra']['ups1_50amol_r3'], checkIfExists: false)
]
"""
}
}

then {
assert process.success
assert snapshot(process.out.versions).match("versions_stub")
assert new File(process.out.convert_files[0][1]).name == 'test_sample.mzML'
assert process.out.log.size() == 1
}
}

⚠️ Potential issue | 🟡 Minor

Inconsistent filename behavior between stub and normal modes.

In stub mode (line 49), the output filename is test_sample.mzML, matching the mzml_id from the input metadata. However, in the normal conversion test (line 26), the output filename is derived from the RAW file's basename, not the mzml_id. This behavioral difference between stub and normal modes could lead to confusion or unexpected results in workflows that depend on consistent output naming.

Consider one of the following (a sketch of option 1 follows the list):

  1. Aligning the stub mode to use the RAW file basename (if feasible in stub mode)
  2. Modifying the normal mode to use mzml_id for output naming
  3. Clearly documenting this behavioral difference in the meta.yml
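
A minimal sketch of option 1, deriving the stub prefix from the staged RAW file's basename so stub and normal modes agree; raw is assumed to be the module's input path variable (the actual name may differ) and the versions.yml heredoc is omitted for brevity:

```groovy
stub:
// Hypothetical alignment of stub naming with the converter's real behaviour:
// fall back to the RAW file's basename instead of meta.mzml_id.
def prefix = task.ext.prefix ?: raw.baseName
"""
touch '${prefix}.mzML'
touch '${prefix}_conversion.log'
"""
```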

@jpfeuffer (Collaborator)

Please check the coderabbit comments. Almost all of them are valid.

main:
ch_version = Channel.empty()
if (params.search_engines.split(",").size() != 1){
ID_SCORE_SWITCHER(ch_id_files.combine(Channel.value("\"Posterior Error Probability_score\"")))

Since onsite should easily be able to choose which score to use, I would strongly suggest removing this step and just loading another score in onsite.

@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
subworkflows/local/file_preparation/main.nf (1)

115-123: Issue: .dia files are lost when peak picking is enabled.

When openms_peakpicking is enabled, ch_results is overwritten with only the peak-picked mzMLs (line 122), losing the .dia files that were mixed in earlier (line 106). The .dia files should be preserved and merged back into the final ch_results.

🔎 Proposed fix to preserve .dia files during peak picking

First, separate .dia files before statistics (as shown in the previous comment), then:

 if (params.openms_peakpicking) {
     // If the peak picker is enabled, it will over-write not bypass the .d files
     OPENMS_PEAK_PICKER (
         indexed_mzml_bundle
     )

     ch_versions = ch_versions.mix(OPENMS_PEAK_PICKER.out.versions)
-    ch_results = OPENMS_PEAK_PICKER.out.mzmls_picked
+    ch_results = OPENMS_PEAK_PICKER.out.mzmls_picked.mix(ch_dia_files)
 }

This ensures .dia files are included in the final output regardless of whether peak picking is enabled.

♻️ Duplicate comments (1)
subworkflows/local/file_preparation/main.nf (1)

104-109: Critical: .dia files are incompatible with MZML_STATISTICS.

The .dia files mixed into ch_results (line 106) are passed to MZML_STATISTICS (line 109), which does not support them. The underlying quantmsutils mzmlstats tool only supports .mzML or Bruker .d formats, not .dia files.

Filter .dia files before passing ch_results to MZML_STATISTICS, or create a separate statistics collection path for them.

🔎 Proposed fix to exclude .dia files from MZML_STATISTICS
 // Pass through .dia files without conversion (DIA-NN handles them natively)
 // Note: .dia files bypass peak picking (when enabled) as they are only used with DIA-NN
 ch_results = ch_results.mix(ch_branched_input.dia)

+// Separate .dia files from other formats for statistics collection
+ch_results
+    .branch {
+        dia: hasExtension(it[1], '.dia')
+        other: true
+    }.set { ch_results_branched }
+
+ch_results_for_stats = ch_results_branched.other
+ch_dia_files = ch_results_branched.dia

-MZML_STATISTICS(ch_results)
+MZML_STATISTICS(ch_results_for_stats)
 ch_statistics = ch_statistics.mix(MZML_STATISTICS.out.ms_statistics.collect())
 ch_ms2_statistics = ch_statistics.mix(MZML_STATISTICS.out.ms2_statistics)
 ch_feature_statistics = ch_statistics.mix(MZML_STATISTICS.out.feature_statistics.collect())
 ch_versions = ch_versions.mix(MZML_STATISTICS.out.versions)

Then, ensure the final ch_results includes both paths:

 if (params.openms_peakpicking) {
     // If the peak picker is enabled, it will over-write not bypass the .d files
     OPENMS_PEAK_PICKER (
         indexed_mzml_bundle
     )

     ch_versions = ch_versions.mix(OPENMS_PEAK_PICKER.out.versions)
-    ch_results = OPENMS_PEAK_PICKER.out.mzmls_picked
+    ch_results = OPENMS_PEAK_PICKER.out.mzmls_picked.mix(ch_dia_files)
+} else {
+    ch_results = ch_results_for_stats.mix(ch_dia_files)
 }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a7bd119 and af918cf.

📒 Files selected for processing (2)
  • modules/bigbio/thermorawfileparser/meta.nf
  • subworkflows/local/file_preparation/main.nf
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (latest-everything, test_tmt)
  • GitHub Check: Docker Tests (25.04.0, test_localize)
  • GitHub Check: Docker Tests (latest-everything, test_lfq)
  • GitHub Check: Docker Tests (latest-everything, test_tmt_corr)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (25.04.0, test_dia)
  • GitHub Check: Docker Tests (latest-everything, test_dia)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_alphapeptdeep)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_alphapeptdeep)
  • GitHub Check: Docker Tests (25.04.0, test_tmt_corr)
  • GitHub Check: Docker Tests (latest-everything, test_localize)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_ms2pip, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_tmt, docker)
  • GitHub Check: Docker Tests (25.04.0, test_tmt)
  • GitHub Check: Run pipeline with test data (25.04.0, test_lfq, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_alphapeptdeep, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dia, docker)
  • GitHub Check: Docker Tests (25.04.0, test_lfq)
  • GitHub Check: Run pipeline with test data (25.04.0, test_localize, docker)
🔇 Additional comments (2)
subworkflows/local/file_preparation/main.nf (2)

46-64: Well-implemented file format handling and warnings.

The branching logic for .dia files and the warning mechanism for unsupported formats are well-structured and user-friendly. The formatted warning message clearly communicates which files are being skipped and what formats are supported.
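
For illustration, a condensed sketch of this branch-and-warn pattern; channel names and the extension checks are assumptions, not the subworkflow's exact identifiers:

```groovy
// Split inputs by extension; the catch-all 'unsupported' branch comes last.
ch_input
    .branch {
        dia:         it[1].name.toLowerCase().endsWith('.dia')
        mzml:        it[1].name.toLowerCase().endsWith('.mzml')
        raw:         it[1].name.toLowerCase().endsWith('.raw')
        unsupported: true
    }
    .set { ch_branched_input }

// Emit a single readable warning listing every skipped file.
ch_branched_input.unsupported
    .map { it[1].name }
    .collect()
    .subscribe { names ->
        if (names) log.warn "Skipping unsupported input files:\n  - ${names.join('\n  - ')}"
    }
```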


5-5: The import path change to the BigBio module is valid. Both the new (modules/bigbio/thermorawfileparser/main.nf) and old (modules/local/thermorawfileparser/main.nf) modules exist with identical process interfaces. The new module adds a conda environment definition as an improvement while maintaining full interface compatibility with the previous local implementation.

@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
subworkflows/local/dda_id/main.nf (1)

133-133: Typo: params.rescore_range should be params.ms2features_range.

Lines 87 and 91 correctly use params.ms2features_range, but line 133 still references the old parameter name params.rescore_range. This branch will never execute as intended.

🔎 Proposed fix
-        } else if (params.rescore_range == "by_project"){
+        } else if (params.ms2features_range == "by_project"){
♻️ Duplicate comments (7)
nextflow_schema.json (2)

1182-1188: Type mismatch for pg_level default remains unfixed.

The pg_level enum contains integers [0, 1, 2] (line 1186) but the default is the float 2.0 (line 1187). For consistency and to avoid validation issues, the default should be the integer 2.

🔎 Proposed fix
                 "pg_level": {
                     "type": "number",
                     "description": "Controls the protein inference mode",
                     "fa_icon": "fas fa-list-ol",
                     "enum": [0, 1, 2],
-                    "default": 2.0
+                    "default": 2
                 },

889-894: Type mismatch for reference_channel remains unfixed.

The schema defines reference_channel as type string (line 890) but the default is numeric 126 (line 893). This mismatch can cause JSON Schema validation failures. The default should be the string "126" to match the type.

🔎 Proposed fix
                 "reference_channel": {
                     "type": "string",
                     "description": "The reference channel, e.g. for calculating ratios.",
                     "fa_icon": "fas fa-list-ol",
-                    "default": 126
+                    "default": "126"
                 },
.github/workflows/ci.yml (1)

37-47: Nextflow 25.04.0 upgrade and expanded test matrix.

The Nextflow version has been upgraded to 25.04.0 and the test matrix now includes three new DDA ID test profiles (alphapeptdeep, ms2pip, fine_tuning) while removing the generic test_dda_id. As noted in previous reviews, Nextflow 25.04.0 introduces breaking changes including Java 17+ requirement and workflow syntax changes. The workflow already configures Java 17 (lines 56-59), which is correct.

.github/workflows/extended_ci.yml (2)

31-43: Nextflow 25.04.0 upgrade and expanded Docker test matrix.

The Docker test matrix has been expanded to include all new DDA ID test profiles. The Nextflow 25.04.0 upgrade is consistent with other workflows and the Java 17 setup is correct (lines 54-58).


123-126: Test coverage inconsistency between Docker and Singularity.

The Singularity test matrix (line 126) is missing test_dda_id_ms2pip and test_dda_id_fine_tuning profiles that are present in the Docker matrix (lines 39-42). This creates inconsistent test coverage between execution environments.

🔎 Suggested fix
         test_profile:
-          ["test_lfq", "test_dia", "test_localize", "test_tmt", "test_dda_id_alphapeptdeep", "test_tmt_corr"]
+          ["test_lfq", "test_dia", "test_localize", "test_tmt", "test_dda_id_alphapeptdeep", "test_dda_id_fine_tuning", "test_dda_id_ms2pip", "test_tmt_corr"]
modules/local/utils/msrescore_features/main.nf (1)

45-57: Tolerance unit handling mixes meta and params sources - issue persists.

Line 49 checks meta['fragmentmasstoleranceunit'] but line 51 checks params.ms2features_tolerance_unit. This inconsistency means if the meta unit is 'ppm', the warning on line 52 won't trigger (it checks params instead). Both checks should reference the same source.

🔎 Suggested fix
     if (using_ms2pip) {
         // ms2pip only supports Da unit
         ms2_tolerance_unit = 'Da'
         ms2_tolerance = params.ms2features_tolerance
         if (meta['fragmentmasstoleranceunit'].toLowerCase().endsWith('da')) {
             ms2_tolerance = meta['fragmentmasstolerance']
-        } else if (params.ms2features_tolerance_unit == 'ppm') {
+        } else if (meta['fragmentmasstoleranceunit'].toLowerCase() == 'ppm') {
             log.info "Warning: MS2pip only supports Da unit. Using default from config!"
             ms2_tolerance = 0.05
         } else {
             log.info "Warning: MS2pip only supports Da unit. Using default from config!"
         }
     }
subworkflows/local/dda_id/main.nf (1)

174-176: ifEmpty(null) removal may cause issues when single search engine is used.

When only one search engine is used, CONSENSUSID is not invoked (line 174 condition), so CONSENSUSID.out.versions will be undefined. The removed fallback protected against this.

🔎 Suggested fix - conditionally mix versions
         if (params.search_engines.tokenize(",").unique().size() > 1) {
             CONSENSUSID(ch_consensus_input.groupTuple(size: params.search_engines.tokenize(",").unique().size()))
-            ch_software_versions = ch_software_versions.mix(CONSENSUSID.out.versions)
+            ch_software_versions = ch_software_versions.mix(CONSENSUSID.out.versions.ifEmpty(null))
             ch_psmfdrcontrol = CONSENSUSID.out.consensusids

Alternatively, move the mix call inside the conditional block where CONSENSUSID is invoked.

🧹 Nitpick comments (4)
modules/local/utils/msrescore_fine_tuning/main.nf (2)

9-19: Potential filename collision with hardcoded output.

The output model weight file is hardcoded as "retained_ms2.pth" (line 13). If this process runs multiple times in the same work directory or if outputs are published to the same location, this could cause file collisions. Consider using a prefix based on metadata to ensure unique output filenames.

🔎 Suggested improvement
     output:
-    path "retained_ms2.pth" , emit: model_weight
+    path "${prefix}_retained_ms2.pth" , emit: model_weight
     path "versions.yml"     , emit: versions
     path "*.log"            , emit: log

And update the script command accordingly.


35-37: Unclear warning message for MS2pip fine-tuning.

The warning at line 36 states "Fine tuning only supports AlphaPeptdeep. Using default from config!" but it's unclear what "default from config" means in this context. If MS2pip is not supported for fine-tuning, consider either:

  1. Throwing an error to prevent execution
  2. Automatically switching to AlphaPeptdeep
  3. Clarifying what fallback behavior occurs
🔎 Suggested improvement
     if (using_ms2pip) {
-        log.info "Warning: Fine tuning only supports AlphaPeptdeep. Using default from config!"
+        log.warn "Fine tuning only supports AlphaPeptdeep, but MS2pip was specified in ms2features_generators. AlphaPeptdeep will be used for fine-tuning."
     }
subworkflows/local/dda_id/main.nf (2)

55-59: Model directory defaults to ./ when path is empty or boolean true.

The fallback to file("./") when ms2features_model_dir is not set makes the channel valid but may not be meaningful. Ensure downstream processes handle this gracefully.


186-186: Consistent handling needed for ifEmpty(null) across version channels.

Line 186 removes ifEmpty(null) from PSM_FDR_CONTROL.out.versions, but line 190 retains it for PHOSPHO_SCORING.out.versions. Apply consistent handling based on whether outputs are guaranteed.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af918cf and b5f05b3.

📒 Files selected for processing (13)
  • .github/workflows/ci.yml
  • .github/workflows/extended_ci.yml
  • conf/tests/test_dda_id_fine_tuning.config
  • modules/local/utils/msrescore_features/main.nf
  • modules/local/utils/msrescore_fine_tuning/main.nf
  • modules/local/utils/msrescore_fine_tuning/meta.yml
  • modules/local/utils/psm_clean/main.nf
  • modules/local/utils/spectrum_features/main.nf
  • nextflow.config
  • nextflow_schema.json
  • subworkflows/local/dda_id/main.nf
  • subworkflows/local/file_preparation/main.nf
  • subworkflows/local/psm_rescoring/main.nf
✅ Files skipped from review due to trivial changes (1)
  • modules/local/utils/msrescore_fine_tuning/meta.yml
🚧 Files skipped from review as they are similar to previous changes (1)
  • subworkflows/local/file_preparation/main.nf
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-14T09:53:36.777Z
Learnt from: enryH
Repo: bigbio/quantms PR: 574
File: .github/actions/get-shards/action.yml:31-41
Timestamp: 2025-08-14T09:53:36.777Z
Learning: The .github/actions/get-shards/ and .github/actions/nf-test/ actions in quantms are part of the nf-core template (introduced in template version 3.3.2) for nf-test framework integration, not custom pipeline code.

Applied to files:

  • .github/workflows/extended_ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Docker Tests (latest-everything, test_lfq)
  • GitHub Check: Docker Tests (latest-everything, test_tmt_corr)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_alphapeptdeep)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (latest-everything, test_localize)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_alphapeptdeep)
  • GitHub Check: Docker Tests (latest-everything, test_tmt)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (25.04.0, test_lfq)
  • GitHub Check: Docker Tests (25.04.0, test_tmt)
  • GitHub Check: Docker Tests (25.04.0, test_localize)
  • GitHub Check: Docker Tests (latest-everything, test_dia)
  • GitHub Check: Docker Tests (25.04.0, test_dia)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_fine_tuning, docker)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_fine_tuning)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_alphapeptdeep, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_lfq, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dia, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_localize, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_tmt, docker)
🔇 Additional comments (33)
modules/local/utils/psm_clean/main.nf (1)

5-7: LGTM! Container image update is consistent.

The container references have been updated to version 0.0.13, matching the pattern in other modules. This ensures consistency across the pipeline.

nextflow_schema.json (8)

25-31: LGTM! Directory path format validation added.

Adding "format": "directory-path" to the outdir parameter improves schema validation and provides better error messages for invalid paths.


69-75: Default value change for export_decoy_psm.

The default for export_decoy_psm has been explicitly set to true (line 74). This may change behavior for users who previously relied on an implicit false default. Ensure this is documented in the CHANGELOG.


84-116: LGTM! SDRF validation defaults clarified.

Explicit defaults have been set for validation parameters. Note that validate_ontologies defaults to true while use_ols_cache_only also defaults to true, which means validation will use cached ontologies rather than fetching from the internet.


389-412: LGTM! Float defaults for m/z values.

Converting m/z parameter defaults from integers to floats (400.0, 2400.0, 100.0, 1800.0) is correct, as m/z values are inherently floating-point measurements.


440-532: Major migration from Luciphor to Onsite parameters.

This section replaces the luciphor_* parameter namespace with onsite_* parameters for PTM localization. The onsite_algorithm enum includes "lucxor" as an option, suggesting backward compatibility. All parameters are marked as hidden, indicating they are advanced settings. Ensure this migration is thoroughly documented in the CHANGELOG and migration guide.


569-696: Major MS2 feature namespace refactor with breaking changes.

This section introduces a comprehensive MS2 features refactor:

  • Renames: ms2rescore → ms2features_enable, rescore_range → ms2features_range, etc.
  • Adds fine-tuning support with new parameters: ms2features_fine_tuning, epoch_to_train_ms2, transfer_learning_test_ratio
  • Expands model support to include timsTOF2023 and timsTOF2024
  • Adds tolerance unit parameter for better flexibility

This is a breaking change for existing users. Ensure:

  1. CHANGELOG documents all renamed and new parameters
  2. Migration guide provides clear mapping from old to new parameter names
  3. Consider adding deprecation warnings if old parameter names are still supported

1384-1400: LGTM! Quality control defaults and new parameters.

The quality control section has been enhanced with explicit defaults:

  • enable_pmultiqc defaults to true for report generation
  • pmultiqc_idxml_skip defaults to true to skip search engine scores
  • New contaminant_string parameter with default "CONT" for pmultiqc contaminant identification

These changes improve the default behavior and configuration clarity.


1552-1563: LGTM! New help parameters improve UX.

The addition of help, help_full, and show_hidden parameters provides better control over help message display. Note that help accepts both boolean and string types; this flexibility is useful, but ensure it is handled correctly in the pipeline code.

conf/tests/test_dda_id_fine_tuning.config (2)

13-19: Resource limits for fine-tuning test.

The test is configured with 4 CPUs, 6GB memory, and 48-hour time limit. The 6GB memory allocation may be tight for deep learning fine-tuning workloads. If the test fails with OOM errors, consider increasing the memory limit.


21-47: LGTM! Comprehensive fine-tuning test configuration.

The test configuration properly enables the fine-tuning pathway with:

  • AlphaPeptDeep as the feature generator
  • Generic model (appropriate for diverse datasets)
  • Fine-tuning enabled with debug logging
  • SNR features included

The combination of ms2features_force=true and ms2features_best=false ensures the specified generic model is used directly without model search, which is efficient for testing.

.github/workflows/extended_ci.yml (1)

51-52: Minor: Step name updated.

The step name has been changed from "Check out pipeline code" to "Checkout repository". This is a minor cosmetic improvement for consistency.

modules/local/utils/msrescore_fine_tuning/main.nf (2)

1-8: LGTM! Process definition and container configuration.

The process definition follows Nextflow best practices with appropriate resource labeling (process_high) for fine-tuning workloads. Container version 0.0.13 is consistent with other MS2-related modules.


81-88: LGTM! Comprehensive version reporting.

Version reporting captures versions for all relevant tools (quantms-rescoring, ms2pip, deeplc, MS2Rescore) using consistent version extraction patterns.

subworkflows/local/psm_rescoring/main.nf (6)

12-12: LGTM! MSRESCORE_FINE_TUNING import added.

The import is correctly added to support the new fine-tuning functionality.


25-47: LGTM! MS2 features gating and fine-tuning logic.

The refactored logic properly handles:

  • Parameter migration from ms2rescore to ms2features_enable
  • Model directory with fallback when parameter is boolean true (empty value)
  • Fine-tuning path with random sampling using fixed seed 2025 for reproducibility
  • Consistent outputs between fine-tuning and direct paths

The random seed 2025 ensures reproducible test results across runs.


57-61: Parameter rename to ms2features namespace.

The parameter has been renamed from add_snr_feature_percolator to ms2features_snr, consistent with the broader MS2 features namespace migration.


64-68: LGTM! Independent run branch updated.

The parameter has been renamed to ms2features_range while maintaining the same logic for the independent_run branch.


107-137: LGTM! by_project branch updated consistently.

The by_project branch has been updated with the same pattern as by_sample, including the consensus input mapping change from 3 to 2 elements (line 134). The parameter rename to ms2features_range is consistent.


68-106: Consensus input mapping is correct.

The by_sample branch correctly emits 2-element tuples [it[1], it[2]] on line 103, which matches the CONSENSUSID input signature tuple val(meta), path(id_file). The mapping provides metadata and the ID file path in the expected format. No downstream updates are needed.

modules/local/utils/spectrum_features/main.nf (1)

5-7: Container update is consistent across spectrum_features and related utility modules, but verify wave.config version consistency.

The container version has been consistently updated to 0.0.13 across spectrum_features, msrescore_features, psm_clean, and msrescore_fine_tuning modules using the correct ORAS protocol for Singularity and GHCR for Docker. However, wave.config still references an older version (0.0.7) from community.wave.seqera.io—confirm whether this version pinning is intentional for caching purposes or should be aligned with the updated 0.0.13 version.

modules/local/utils/msrescore_features/main.nf (2)

25-32: Clarify the fine-tuning model directory logic.

When ms2features_fine_tuning is true, ms2_model_dir points to ./ (current directory). This assumes the fine-tuned model will be available in the working directory at runtime. Verify this aligns with how MSRESCORE_FINE_TUNING.out.model_weight is passed in the workflow.


83-87: Debug logging support looks good.

Clean implementation of conditional debug log level. Consider using log.warn instead of log.info for the warning messages in the tolerance handling above for consistency with severity.

subworkflows/local/dda_id/main.nf (3)

39-40: Verify ifEmpty(null) removal is safe.

Removing ifEmpty(null) assumes PEPTIDE_DATABASE_SEARCH.out.versions always emits. If the subworkflow can short-circuit or emit nothing, this could cause downstream issues.


62-63: Verify randomSample behavior with small datasets.

randomSample(params.fine_tuning_sample_run, 2025) uses a fixed seed. If the channel has fewer items than requested, randomSample returns all items. Ensure fine_tuning_sample_run default (1) is appropriate and the downstream collect() handles variable sizes.


191-191: The PHOSPHO_SCORING subworkflow correctly emits id_onsite.

The output channel id_onsite is confirmed in the emit block of subworkflows/local/phospho_scoring/main.nf (line 28). The reference at line 191 is correct. However, there is no evidence of a previous id_luciphor output in the codebase—this rename claim cannot be substantiated.

Likely an incorrect or invalid review comment.

nextflow.config (7)

112-130: MS2Features configuration block is well-structured.

The new ms2features_* parameters provide a comprehensive configuration surface. The inline comments clearly document units and defaults.


152-164: Onsite options block replaces Luciphor with expanded configurability.

New onsite_algorithm parameter (defaulting to 'lucxor') supports 'ascore', 'phosphors', and 'lucxor'. Ensure documentation reflects available options and migration path from previous Luciphor-specific configurations.


269-269: hook_url now reads from environment variable.

Using System.getenv('HOOK_URL') as default is good practice for sensitive configuration, but will return null if unset. Verify downstream code handles null gracefully.
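
A small sketch of the pattern; the guard is shown only as a comment because it belongs at the point of use rather than in the config, and its exact location is an assumption:

```groovy
// nextflow.config sketch: hook_url is null whenever HOOK_URL is not exported.
params {
    hook_url = System.getenv('HOOK_URL')
}
// At the point of use, a null check avoids treating an unset variable as a URL, e.g.:
//   if (params.hook_url) { /* post the workflow notification */ }
```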


312-322: New arm64 profile requires Wave service.

The arm64 profile mandates wave.enabled = true, which requires network access to Seqera's Wave service. This is documented in the comment but users on air-gapped systems won't be able to use arm64. Consider adding this limitation to documentation.
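
For context, an illustrative profile of this shape; the option values are assumptions rather than the pipeline's exact settings:

```groovy
profiles {
    arm64 {
        wave.enabled      = true                      // requires network access to Seqera's Wave service
        wave.strategy     = ['conda', 'container']    // let Wave resolve arm64-compatible images
        docker.enabled    = true
        docker.runOptions = '--platform linux/arm64'  // force arm64 images on mixed hosts
    }
}
```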


399-401: No action needed—CI/CD workflows and configuration already reference the new profile names (test_dda_id_alphapeptdeep, test_dda_id_ms2pip, test_dda_id_fine_tuning). No legacy references to test_dda_id exist in the codebase.

Likely an incorrect or invalid review comment.


531-531: nf-schema plugin updated to 2.5.1 (bug fix release).

This release fixes an issue with non-local samplesheets validation and conversion.


524-525: Nextflow 25.04.0 version bump is compatible with this project.

Verified that the primary breaking changes in Nextflow 25.04.0 do not affect this codebase:

  • Java 17 requirement is already satisfied (configured in CI/CD workflows)
  • HyperQueue executor is not used
  • Traditional publishDir is used throughout, not the new workflow outputs feature

No migration steps required.

Comment on lines +6 to +7
'oras://ghcr.io/bigbio/quantms-rescoring-sif:0.0.13' :
'ghcr.io/bigbio/quantms-rescoring:0.0.13' }"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Find and examine the file
find . -path "*/modules/local/utils/msrescore_features/main.nf" -type f

Repository: bigbio/quantms

Length of output: 106


🏁 Script executed:

cat -n ./modules/local/utils/msrescore_features/main.nf

Repository: bigbio/quantms

Length of output: 4869


Address tolerance unit logic inconsistency before approval.

Container versions are correctly updated to 0.0.13 for both Singularity and Docker. However, line 51 checks params.ms2features_tolerance_unit instead of the meta-derived ms2_tolerance_unit that was loaded on line 44, creating an inconsistency in the tolerance handling logic for ms2pip fallback. This should check the already-loaded meta value to maintain consistency with the rest of the tolerance logic.

🤖 Prompt for AI Agents
In modules/local/utils/msrescore_features/main.nf around lines 6 to 7, the code
that decides ms2pip fallback is checking params.ms2features_tolerance_unit
instead of the meta-derived ms2_tolerance_unit loaded on line 44; change the
conditional to reference ms2_tolerance_unit so it uses the already-loaded meta
value consistently with the rest of the tolerance logic, ensuring the variable
is used where present and no new params lookup is introduced.

Comment on lines 1116 to 1122
"lfq_intensity_threshold": {
"type": "number",
"description": "The minimum intensity for a feature to be considered for quantification. (default: '10000')",
"default": 1000,
"default": 1000.0,
"fa_icon": "fas fa-filter",
"help_text": "The minimum intensity for a feature to be considered for quantification. (default: '10000')"
},

⚠️ Potential issue | 🟡 Minor

Significant threshold change for LFQ intensity.

The default for lfq_intensity_threshold has been changed from 10000 to 1000.0 (line 1119), which is a 10-fold reduction. This will include more features in quantification but may also increase noise. The help text (line 1121) still references the old default of 10000. Update the help text to match the new default or verify this change is intentional and document it in the CHANGELOG.

🔎 Proposed fix for help text
                 "lfq_intensity_threshold": {
                     "type": "number",
-                    "description": "The minimum intensity for a feature to be considered for quantification. (default: '10000')",
+                    "description": "The minimum intensity for a feature to be considered for quantification. (default: '1000')",
                     "default": 1000.0,
                     "fa_icon": "fas fa-filter",
-                    "help_text": "The minimum intensity for a feature to be considered for quantification. (default: '10000')"
+                    "help_text": "The minimum intensity for a feature to be considered for quantification. (default: '1000')"
                 },
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"lfq_intensity_threshold": {
"type": "number",
"description": "The minimum intensity for a feature to be considered for quantification. (default: '10000')",
"default": 1000,
"default": 1000.0,
"fa_icon": "fas fa-filter",
"help_text": "The minimum intensity for a feature to be considered for quantification. (default: '10000')"
},
"lfq_intensity_threshold": {
"type": "number",
"description": "The minimum intensity for a feature to be considered for quantification. (default: '1000')",
"default": 1000.0,
"fa_icon": "fas fa-filter",
"help_text": "The minimum intensity for a feature to be considered for quantification. (default: '1000')"
},
🤖 Prompt for AI Agents
In nextflow_schema.json around lines 1116 to 1122, the numeric default for
lfq_intensity_threshold was changed to 1000.0 but the description/help_text
still mention 10000; update the help_text (and the description default text if
it embeds the old value) to reflect the new default (e.g., '1000' or '1000.0')
so all user-facing strings match the numeric default, or if the 10x reduction
was intentional, leave text updated and add a short entry to the CHANGELOG
noting the default change and its rationale.

  random_preanalysis_seed = 42
  empirical_assembly_ms_n = 200
- enable_diann_mztab = true
+ enable_diann_mztab = false

⚠️ Potential issue | 🟡 Minor

Breaking change: enable_diann_mztab default changed from true to false.

This could affect existing workflows that rely on mzTab output from DIA-NN. Consider documenting this in the changelog as a breaking change.

🤖 Prompt for AI Agents
In nextflow.config around line 224, the boolean flag enable_diann_mztab was
changed from true to false which is a breaking change for workflows depending on
DIA‑NN mzTab output; update the project changelog/release notes to explicitly
call this out as a breaking change (include old default, new default, rationale
and migration steps), and either restore the default to true if backward
compatibility must be preserved or add an explicit comment above this config
option and update any docs/workflow configs that assume mzTab generation so
users are not silently broken.

@coderabbitai coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

♻️ Duplicate comments (8)
modules/bigbio/thermorawfileparser/meta.yml (1)

13-13: Clarify the -f=4 format option documentation.

The documentation states -f=4 for "None", which is ambiguous. According to ThermoRawFileParser's official documentation, this option means "no spectra output." Expand this to `-f=4` for None (no spectra output) to make the behavior explicit.

This issue was previously flagged but remains unresolved.

🔎 Proposed fix
-        - `-f=0` for MGF output, `-f=1` for mzML, `-f=2` for indexed mzML (default), `-f=3` for Parquet, `-f=4` for None
+        - `-f=0` for MGF output, `-f=1` for mzML, `-f=2` for indexed mzML (default), `-f=3` for Parquet, `-f=4` for None (no spectra output)
modules/local/utils/msrescore_fine_tuning/main.nf (1)

45-49: Variable naming issue resolved.

The previous review comment about the variable name mismatch has been addressed. The variable is now consistently named force_transfer_learning in both the definition (line 46) and usage (line 75).

Also applies to: 75-75

subworkflows/local/dda_id/main.nf (1)

175-187: Potential undefined channel when using single search engine.

When only one search engine is configured, CONSENSUSID is not invoked (line 175 condition), making CONSENSUSID.out.versions undefined. However, line 177 directly mixes this channel without a fallback, which could cause a workflow error.

🔎 Proposed fix to restore fallback
         if (params.search_engines.tokenize(",").unique().size() > 1) {
             CONSENSUSID(ch_consensus_input.groupTuple(size: params.search_engines.tokenize(",").unique().size()))
-            ch_software_versions = ch_software_versions.mix(CONSENSUSID.out.versions)
+            ch_software_versions = ch_software_versions.mix(CONSENSUSID.out.versions.ifEmpty(null))
             ch_psmfdrcontrol = CONSENSUSID.out.consensusids
             ch_psmfdrcontrol
                 .map { it -> it[1] }
                 .set { ch_pmultiqc_consensus }
         } else {
             ch_psmfdrcontrol = ch_consensus_input
         }

Or conditionally mix based on the branch:

         if (params.search_engines.tokenize(",").unique().size() > 1) {
             CONSENSUSID(ch_consensus_input.groupTuple(size: params.search_engines.tokenize(",").unique().size()))
             ch_psmfdrcontrol = CONSENSUSID.out.consensusids
+            ch_software_versions = ch_software_versions.mix(CONSENSUSID.out.versions)
             ch_psmfdrcontrol
                 .map { it -> it[1] }
                 .set { ch_pmultiqc_consensus }
         } else {
             ch_psmfdrcontrol = ch_consensus_input
         }
-        ch_software_versions = ch_software_versions.mix(CONSENSUSID.out.versions)
nextflow_schema.json (3)

889-894: Type consistency issue resolved.

The previous type mismatch for reference_channel has been fixed. The schema now correctly uses a string type with string default "126", matching the config file.


1182-1188: Type consistency issue resolved.

The previous type mismatch for pg_level has been fixed. The default is now the integer 2, matching the enum type [0, 1, 2].


1116-1122: Update help text to match new default value.

The lfq_intensity_threshold default was changed from 10000 to 1000.0 (line 1119), but the help text on line 1121 still references the old default of 10000. This will confuse users.

🔎 Proposed fix
                 "lfq_intensity_threshold": {
                     "type": "number",
                     "description": "The minimum intensity for a feature to be considered for quantification. (default: '1000')",
                     "default": 1000.0,
                     "fa_icon": "fas fa-filter",
-                    "help_text": "The minimum intensity for a feature to be considered for quantification. (default: '10000')"
+                    "help_text": "The minimum intensity for a feature to be considered for quantification. (default: '1000')"
                 },
.github/workflows/extended_ci.yml (2)

31-31: Verify Nextflow 25.04.0 compatibility and breaking changes.

The upgrade from 24.10.5 to 25.04.0 introduces several breaking changes including dynamic publish path syntax, Java 17+ requirement, and deprecated shell sections. Ensure the pipeline has been tested with this version and follows the migration guide.

Based on learnings, the NXF version upgrade requires careful validation of the entire pipeline against the breaking changes documented in the Nextflow migration guide.

Also applies to: 123-123


33-43: Test profile parity achieved between Docker and Singularity.

Both test matrices now include the same test profiles (test_dda_id_alphapeptdeep, test_dda_id_fine_tuning, test_dda_id_ms2pip, etc.), providing consistent coverage across execution environments.

Also applies to: 125-135

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5f05b3 and fd85625.

📒 Files selected for processing (9)
  • .github/workflows/extended_ci.yml
  • .nf-core.yml
  • CHANGELOG.md
  • conf/tests/test_dda_id_fine_tuning.config
  • modules/bigbio/thermorawfileparser/meta.yml
  • modules/local/utils/msrescore_fine_tuning/main.nf
  • nextflow_schema.json
  • subworkflows/local/dda_id/main.nf
  • subworkflows/local/file_preparation/main.nf
🚧 Files skipped from review as they are similar to previous changes (2)
  • .nf-core.yml
  • CHANGELOG.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-14T09:53:36.777Z
Learnt from: enryH
Repo: bigbio/quantms PR: 574
File: .github/actions/get-shards/action.yml:31-41
Timestamp: 2025-08-14T09:53:36.777Z
Learning: The .github/actions/get-shards/ and .github/actions/nf-test/ actions in quantms are part of the nf-core template (introduced in template version 3.3.2) for nf-test framework integration, not custom pipeline code.

Applied to files:

  • .github/workflows/extended_ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dia, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_localize, docker)
  • GitHub Check: Docker Tests (25.04.0, test_tmt)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_fine_tuning)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_alphapeptdeep)
  • GitHub Check: Docker Tests (25.04.0, test_lfq)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_alphapeptdeep, docker)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (latest-everything, test_tmt_corr)
  • GitHub Check: Docker Tests (25.04.0, test_localize)
  • GitHub Check: Docker Tests (latest-everything, test_dia)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_fine_tuning)
  • GitHub Check: Docker Tests (latest-everything, test_lfq)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_alphapeptdeep)
  • GitHub Check: Docker Tests (25.04.0, test_dia)
  • GitHub Check: Docker Tests (latest-everything, test_tmt)
  • GitHub Check: Run pipeline with test data (25.04.0, test_lfq, docker)
  • GitHub Check: Docker Tests (25.04.0, test_tmt_corr)
  • GitHub Check: Run pipeline with test data (25.04.0, test_tmt, docker)
  • GitHub Check: Docker Tests (latest-everything, test_localize)
🔇 Additional comments (7)
subworkflows/local/file_preparation/main.nf (4)

46-48: Good addition of .dia and unsupported format handling.

The new branches for .dia files and unsupported formats are well-structured. The catch-all unsupported branch is correctly placed last to capture any files that don't match the specific format checks.


50-64: Excellent user experience improvement for unsupported formats.

The warning block provides clear feedback when unsupported file formats are detected. The implementation correctly:

  • Uses collect() on metadata tuples (not file content), so memory impact is minimal
  • Lists each unsupported file individually for clarity
  • Provides guidance on supported formats and compressed variants

90-90: Output channel name change is consistent with module migration.

The reference to THERMORAWFILEPARSER.out.convert_files aligns with the module path change at line 5. This change is already covered by the verification script in the earlier comment.


110-112: Excellent fix for the past review issue with .dia files!

This change correctly addresses the previous review comment about .dia files being incompatible with MZML_STATISTICS. The .dia files are now added to ch_results (line 112) AFTER MZML_STATISTICS is called (line 104), preventing them from being processed by the incompatible module.

Additionally, the implementation correctly excludes .dia files from peak picking by:

  1. Creating indexed_mzml_bundle as a snapshot before .dia files are added (line 92)
  2. Using indexed_mzml_bundle for OPENMS_PEAK_PICKER (line 117), not the final ch_results

The inline comments accurately document this bypass behavior for DIA-NN.

subworkflows/local/dda_id/main.nf (2)

56-60: Good defensive handling of model directory parameter.

The code properly handles edge cases where ms2features_model_dir might be empty, null, boolean true, or whitespace, with a sensible fallback to the current directory.
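
The defensive pattern being described amounts to roughly this (a paraphrased sketch, using Channel.value as recommended later in this review; not the verbatim source):

    def dir_param = params.ms2features_model_dir
    if (dir_param && dir_param != true && dir_param.toString().trim()) {
        ms2_model_dir = Channel.value(file(dir_param, checkIfExists: true))
    } else {
        ms2_model_dir = Channel.value(file('./'))   // fallback: current working directory
    }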


62-64: Good use of fixed seed for reproducibility.

Using a fixed seed (2025) with randomSample ensures reproducible test dataset selection across pipeline runs, which is essential for debugging and validation.
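
randomSample takes the sample size and an optional seed as its second argument, so a reproducible subset looks like this (channel names are illustrative):

    // the same 200 items are drawn on every run because the seed is fixed
    ch_test_subset = ch_all_psms.randomSample(200, 2025)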

conf/tests/test_dda_id_fine_tuning.config (1)

21-46: Test configuration looks well-structured.

The fine-tuning test profile appropriately configures MS2 features with alphapeptdeep, enables debugging, and uses reasonable test parameters. The high FDR cutoffs (0.20/0.50) are appropriate for faster test execution.

Comment on lines +51 to 52
- name: Checkout repository
uses: actions/checkout@v4

⚠️ Potential issue | 🟡 Minor

Inconsistent checkout configuration between Docker and Singularity jobs.

The Docker job removes fetch-depth: 0 (lines 51-52) while the Singularity job retains it (lines 143-145). This inconsistency may lead to different Git history availability between test environments.

🔎 Consider aligning checkout steps

If shallow clone is intentional for Docker, apply to Singularity:

       - name: Check out pipeline code
         uses: actions/checkout@v4
-        with:
-          fetch-depth: 0

Or restore full history for Docker if needed:

       - name: Checkout repository
         uses: actions/checkout@v4
+        with:
+          fetch-depth: 0

Also applies to: 142-145

🤖 Prompt for AI Agents
In .github/workflows/extended_ci.yml around lines 51-52 and 142-145, the Docker
job checkout step differs from the Singularity job (Docker lacks fetch-depth: 0
while Singularity includes it), causing inconsistent Git history availability
between jobs; update the workflow so both jobs use the same actions/checkout
configuration—either add fetch-depth: 0 to the Docker job or remove fetch-depth
from the Singularity job—so both checkout steps consistently provide the same
commit history.

Comment on lines +36 to +37
Converted files in mzML or MGF format depending on the format parameter (-f).
Format options: 0 for MGF, 1 for mzML, 2 for indexed mzML (default), 3 for Parquet, 4 for None.

⚠️ Potential issue | 🟡 Minor

Update description to include Parquet format.

The description mentions "mzML or MGF format" but format option 3 produces Parquet output, which is also included in the pattern on line 38. Update the description to reflect all supported output formats.

🔎 Proposed fix
       description: |
-        Converted files in mzML or MGF format depending on the format parameter (-f).
+        Converted files in mzML, MGF, or Parquet format depending on the format parameter (-f).
         Format options: 0 for MGF, 1 for mzML, 2 for indexed mzML (default), 3 for Parquet, 4 for None.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        Converted files in mzML or MGF format depending on the format parameter (-f).
+        Converted files in mzML, MGF, or Parquet format depending on the format parameter (-f).
         Format options: 0 for MGF, 1 for mzML, 2 for indexed mzML (default), 3 for Parquet, 4 for None.
🤖 Prompt for AI Agents
In modules/bigbio/thermorawfileparser/meta.yml around lines 36 to 37, the
description only mentions mzML or MGF output while the format parameter also
supports indexed mzML, Parquet and None (options 2, 3, 4); update the
description sentence to list all supported output formats (MGF, mzML, indexed
mzML, Parquet, or None) and ensure it matches the format options documented on
line 38.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
subworkflows/local/file_preparation/main.nf (1)

110-122: Critical: .dia files are lost when peak picking is enabled.

The .dia files mixed into ch_results at line 112 are lost when params.openms_peakpicking is enabled. At line 121, ch_results is completely overwritten with OPENMS_PEAK_PICKER.out.mzmls_picked, which doesn't include the .dia files since they were mixed in after indexed_mzml_bundle was created (line 92).

This means users who enable OpenMS peak picking will silently lose their .dia files, breaking DIA-NN analysis.

🔎 Proposed fix: Move .dia passthrough after peak picking block
     MZML_STATISTICS(ch_results)
     ch_statistics = ch_statistics.mix(MZML_STATISTICS.out.ms_statistics.collect())
     ch_ms2_statistics = ch_statistics.mix(MZML_STATISTICS.out.ms2_statistics)
     ch_feature_statistics = ch_statistics.mix(MZML_STATISTICS.out.feature_statistics.collect())
     ch_versions = ch_versions.mix(MZML_STATISTICS.out.versions)
 
-    // Pass through .dia files without conversion (DIA-NN handles them natively)
-    // Note: .dia files bypass peak picking and mzML statistics (when enabled) as they are only used with DIA-NN
-    ch_results = ch_results.mix(ch_branched_input.dia)
-
     if (params.openms_peakpicking) {
         // If the peak picker is enabled, it will over-write not bypass the .d files
         OPENMS_PEAK_PICKER (
             indexed_mzml_bundle
         )
 
         ch_versions = ch_versions.mix(OPENMS_PEAK_PICKER.out.versions)
         ch_results = OPENMS_PEAK_PICKER.out.mzmls_picked
     }
+
+    // Pass through .dia files without conversion (DIA-NN handles them natively)
+    // Note: .dia files bypass peak picking and mzML statistics as they are only used with DIA-NN
+    ch_results = ch_results.mix(ch_branched_input.dia)
 
     emit:
♻️ Duplicate comments (2)
nextflow.config (1)

224-224: Breaking change: enable_diann_mztab default changed from true to false.

This is a breaking change for existing workflows that rely on mzTab output from DIA-NN. Ensure this is documented in the CHANGELOG.md as a breaking change with migration guidance.

subworkflows/local/dda_id/main.nf (1)

179-181: Same ifEmpty(null) removal issue for CONSENSUSID.

When only one search engine is used, the if block (lines 179-186) is skipped, so CONSENSUSID.out.versions is never defined. Line 181 will then fail when mixing into ch_software_versions. Consider conditionally mixing or using a fallback.
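
One way to avoid referencing an output that is never defined is to mix the versions channel only on the multi-engine path, for example (channel and output names are hypothetical):

    if (search_engines.size() > 1) {
        CONSENSUSID ( ch_engine_results )
        ch_id_for_fdr        = CONSENSUSID.out.consensusids
        ch_software_versions = ch_software_versions.mix(CONSENSUSID.out.versions)
    } else {
        ch_id_for_fdr = ch_engine_results    // single engine: consensus step skipped entirely
    }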

🧹 Nitpick comments (3)
subworkflows/local/file_preparation/main.nf (1)

99-99: Remove commented-out code.

The commented-out code should be removed to improve readability and maintainability.

🔎 Proposed cleanup
     ch_results = indexed_mzml_bundle.mix(TDF2MZML.out.mzmls_converted)
-    // indexed_mzml_bundle = indexed_mzml_bundle.mix(TDF2MZML.out.mzmls_converted)
 } else {
modules/local/utils/msrescore_fine_tuning/main.nf (1)

21-22: Unused variable prefix.

The prefix variable is defined on line 22 but never used in the script. Consider removing it or using it for the log file naming.

🔎 Proposed fix
     def args = task.ext.args ?: ''
-    def prefix = task.ext.prefix ?: "${meta.mzml_id}_ms2rescore"

Or use it for the log file:

-        2>&1 | tee ${idxml.baseName}_fine_tuning.log
+        2>&1 | tee ${prefix}_fine_tuning.log
subworkflows/local/dda_id/main.nf (1)

56-60: Potential issue with Channel.from(file(...)) pattern.

Channel.from() is deprecated in DSL2. Consider using Channel.of() or Channel.value() instead for single values, or Channel.fromPath() for file paths.

🔎 Proposed fix
             if (params.ms2features_model_dir && params.ms2features_model_dir != true) {
-                ms2_model_dir = Channel.from(file(params.ms2features_model_dir, checkIfExists: true))
+                ms2_model_dir = Channel.value(file(params.ms2features_model_dir, checkIfExists: true))
             } else {
-                ms2_model_dir = Channel.from(file("./"))
+                ms2_model_dir = Channel.value(file("./"))
             }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 344d721 and 59861b3.

📒 Files selected for processing (7)
  • .nf-core.yml
  • modules/local/thermorawfileparser/main.nf
  • modules/local/thermorawfileparser/meta.yml
  • modules/local/utils/msrescore_fine_tuning/main.nf
  • nextflow.config
  • subworkflows/local/dda_id/main.nf
  • subworkflows/local/file_preparation/main.nf
💤 Files with no reviewable changes (2)
  • modules/local/thermorawfileparser/meta.yml
  • modules/local/thermorawfileparser/main.nf
🚧 Files skipped from review as they are similar to previous changes (1)
  • .nf-core.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
  • GitHub Check: Docker Tests (25.04.0, test_dia)
  • GitHub Check: Docker Tests (25.04.0, test_tmt)
  • GitHub Check: Docker Tests (latest-everything, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (25.04.0, test_dda_id_ms2pip)
  • GitHub Check: Docker Tests (latest-everything, test_tmt)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_ms2pip, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dia, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_localize, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_fine_tuning, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_lfq, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_alphapeptdeep, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_tmt, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_lfq, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_ms2pip, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_tmt, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_localize, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_fine_tuning, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dda_id_alphapeptdeep, docker)
  • GitHub Check: Run pipeline with test data (25.04.0, test_dia, docker)
🔇 Additional comments (13)
subworkflows/local/file_preparation/main.nf (2)

46-64: LGTM: Clear handling of unsupported formats.

The addition of .dia branching and the warning block for unsupported file formats provides excellent user feedback. The logic correctly catches files that don't match any supported extension and clearly communicates supported formats to users.


84-84: Past review comment addressed.

The comment has been correctly updated to reflect the actual output channel name convert_files, as requested in the previous review.

nextflow.config (5)

112-130: New MS2 features configuration looks well-structured.

The ms2features_* namespace is a clean replacement for the previous msrescore_* parameters. Defaults are sensible (e.g., ms2features_enable = false, ms2features_calibration = 0.15).

Minor observation: ms2features_tolerance has a comment indicating "Unit Da (fallback when unit conversion needed)" at line 119, but there's also a separate ms2features_tolerance_unit parameter. Ensure the documentation clearly explains when the fallback applies versus when the unit parameter is used.


152-164: Onsite PTM-localization parameters are complete.

The new onsite_* parameters replacing Luciphor are comprehensive. The algorithm options (lucxor, ascore, phosphors) and fragment method configuration align with common PTM localization workflows.


269-269: Environment variable fallback for hook_url is a good practice.

Using System.getenv('HOOK_URL') allows CI/CD systems to inject webhook URLs without hardcoding them.
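
In config terms this is presumably something along the lines of (illustrative only):

    params {
        // injected by CI via the HOOK_URL environment variable; null when unset
        hook_url = System.getenv('HOOK_URL')
    }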


312-325: New arm64 and emulate_amd64 profiles are well configured.

The arm64 profile correctly enables Wave with OCI auto-pull for both Apptainer and Singularity. The emulate_amd64 profile provides a clean fallback for running x86_64 containers on ARM hardware.


524-526: Nextflow version constraint syntax in nextflow.config.

The config specifies nextflowVersion = '!>=25.04.0'. The leading ! is not a negation operator: in the manifest scope it marks the requirement as strict, so Nextflow aborts with an error instead of only printing a warning when the running version does not satisfy >=25.04.0. The constraint therefore already enforces a minimum of 25.04.0; changing it to '>=25.04.0' would merely relax the check to a warning.
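
For reference, the two behaviours can be expressed in the manifest as follows (standard Nextflow semantics):

    manifest {
        // '>=25.04.0'  : prints a warning and continues on an older Nextflow
        // '!>=25.04.0' : strict check, the run aborts with an error instead
        nextflowVersion = '!>=25.04.0'
    }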

modules/local/utils/msrescore_fine_tuning/main.nf (2)

56-72: Transfer learning command structure looks correct.

The force_transfer_learning variable naming issue from the previous review appears to be resolved. The command now correctly references ${force_transfer_learning} which is defined with the same name on lines 38-42.


74-80: Version extraction is robust.

The version extraction uses grep -Eo with a semantic version pattern, which handles various output formats gracefully.

subworkflows/local/dda_id/main.nf (4)

13-14: New module imports are correctly added.

The imports for MSRESCORE_FINE_TUNING and PHOSPHO_SCORING align with the new fine-tuning and PTM localization workflows.


190-191: Verify PSM_FDR_CONTROL.out.versions is always defined.

The .ifEmpty(null) fallback was removed from PSM_FDR_CONTROL.out.versions. Unlike CONSENSUSID, PSM_FDR_CONTROL is always invoked, so this should be safe. However, ensure the process always emits a versions file.
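
For context, a process that follows the nf-core convention always writes a versions.yml and exposes it under an explicit emit name; a hypothetical minimal example (not the actual quantms process):

    process EXAMPLE_TOOL {
        output:
        path "out.idXML",    emit: id_filtered
        path "versions.yml", emit: versions    // emitted on every successful task

        script:
        """
        touch out.idXML

        cat <<-END_VERSIONS > versions.yml
        "${task.process}":
            example_tool: 1.0.0
        END_VERSIONS
        """
    }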


193-196: PHOSPHO_SCORING integration looks correct.

The conditional phospho scoring path correctly gates on params.enable_mod_localization and assigns results to ch_id_results from either PHOSPHO_SCORING.out.id_onsite or PSM_FDR_CONTROL.out.id_filtered.
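
In outline the gating reads roughly as follows (names taken from this review; the second PHOSPHO_SCORING input is a guess):

    if (params.enable_mod_localization) {
        PHOSPHO_SCORING ( PSM_FDR_CONTROL.out.id_filtered, ch_mzmls )
        ch_id_results        = PHOSPHO_SCORING.out.id_onsite
        ch_software_versions = ch_software_versions.mix(PHOSPHO_SCORING.out.versions)
    } else {
        ch_id_results = PSM_FDR_CONTROL.out.id_filtered
    }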



Comment on lines +63 to +64
if (params.ms2features_generators.toLowerCase().contains('ms2pip')) {
exit(1, 'Error: Fine tuning only supports AlphaPeptdeep!')

⚠️ Potential issue | 🟠 Major

Use error() instead of exit() in Nextflow workflows.

The exit(1, 'message') syntax is not standard Nextflow. Use error('message') to properly terminate the workflow with an error.

🔎 Proposed fix
             if (params.ms2features_fine_tuning == true) {
                 if (params.ms2features_generators.toLowerCase().contains('ms2pip')) {
-                    exit(1, 'Error: Fine tuning only supports AlphaPeptdeep!')
+                    error('Fine tuning only supports AlphaPeptdeep. Please set --ms2features_generators to include "alphapeptdeep" instead of "ms2pip".')
                 } else {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 if (params.ms2features_generators.toLowerCase().contains('ms2pip')) {
-    exit(1, 'Error: Fine tuning only supports AlphaPeptdeep!')
+    error('Fine tuning only supports AlphaPeptdeep. Please set --ms2features_generators to include "alphapeptdeep" instead of "ms2pip".')
🤖 Prompt for AI Agents
In subworkflows/local/dda_id/main.nf around lines 63-64, the code calls exit(1,
'Error: Fine tuning only supports AlphaPeptdeep!') inside the conditional
checking params.ms2features_generators; replace that non-standard Nextflow
exit(...) call with the Nextflow error('Error: Fine tuning only supports
AlphaPeptdeep!') function so the workflow properly aborts with the given message
(keep the existing condition and message, just swap exit(...) for error(...)).

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
nextflow.config (1)

224-224: Confirm breaking change: DIA-NN mzTab default changed.

The default value of enable_diann_mztab has been changed from true to false. This is a breaking change that could affect existing workflows relying on mzTab output from DIA-NN.

Please confirm this breaking change is intentional and ensure it's documented in CHANGELOG.md with migration guidance for users.

🧹 Nitpick comments (2)
modules/local/utils/msrescore_fine_tuning/main.nf (2)

24-30: Remove unnecessary null initialization and clarify misleading comment.

Lines 25-26 initialize tolerance variables to null only to immediately overwrite them on lines 29-30. The comment on line 28 suggests conditional logic for ms2pip vs alphapeptdeep handling, but no such logic exists in this code.

🔎 Proposed refactor
-    // Initialize tolerance variables
-    def ms2_tolerance = null
-    def ms2_tolerance_unit = null
-
-    // ms2pip only supports Da unit, but alphapeptdeep supports both Da and ppm
-    ms2_tolerance = meta['fragmentmasstolerance']
-    ms2_tolerance_unit = meta['fragmentmasstoleranceunit']
+    // Extract fragment mass tolerance from metadata
+    def ms2_tolerance = meta['fragmentmasstolerance']
+    def ms2_tolerance_unit = meta['fragmentmasstoleranceunit']

32-42: Add def keyword to ensure local variable scope.

The variables force_transfer_learning and consider_modloss are assigned without the def keyword, making them global bindings in Groovy. This can lead to unexpected behavior if these names are reused elsewhere.

🔎 Proposed fix
+    def force_transfer_learning
     if (params.force_transfer_learning) {
         force_transfer_learning = "--force_transfer_learning"
     } else {
         force_transfer_learning = ""
     }

+    def consider_modloss
     if (params.ms2features_modloss) {
         consider_modloss = "--consider_modloss"
     } else {
         consider_modloss = ""
     }
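
A more compact alternative that keeps both variables local is a ternary assignment (equivalent behaviour, sketch only):

    def force_transfer_learning = params.force_transfer_learning ? '--force_transfer_learning' : ''
    def consider_modloss        = params.ms2features_modloss     ? '--consider_modloss'        : ''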
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59861b3 and 929ca6a.

📒 Files selected for processing (2)
  • modules/local/utils/msrescore_fine_tuning/main.nf
  • nextflow.config
🔇 Additional comments (9)
modules/local/utils/msrescore_fine_tuning/main.nf (2)

1-18: LGTM!

The process definition, container configuration, and input/output declarations follow Nextflow best practices.


44-68: LGTM!

The rescoring transfer_learning command is properly constructed with all required parameters, and the version capture follows standard Nextflow patterns. The previous issue with variable naming (force_transfer_learning vs force_model) has been resolved.

nextflow.config (7)

48-48: LGTM!

The replacement of luciphor_debug with onsite_debug aligns with the PR's migration from Luciphor to the ONSITE PTM-localization module.


112-130: LGTM!

The comprehensive MS2Features configuration block introduces well-structured parameters for the new MS2-based feature generation and fine-tuning workflow. The defaults are sensible (disabled by default with ms2features_enable = false), and the parameter names follow a consistent ms2features_* convention.


152-164: LGTM!

The ONSITE configuration block properly replaces the Luciphor setup with comprehensive PTM localization parameters. The default algorithm 'lucxor' provides continuity while enabling the new ONSITE module's multi-algorithm support.


252-252: LGTM!

The addition of contaminant_string for pmultiqc options and the new help flags (help_full, show_hidden) are useful enhancements. Using System.getenv('HOOK_URL') for the hook URL allows proper environment-based configuration.

Also applies to: 269-272


312-325: LGTM!

The new arm64 and emulate_amd64 profiles properly support multi-architecture environments. The inline comment explaining the wave requirement for arm64 maintainability is helpful.


399-401: LGTM!

The new test profiles (test_dda_id_alphapeptdeep, test_dda_id_ms2pip, test_dda_id_fine_tuning) properly support testing the new MS2 features and fine-tuning workflows.


531-531: LGTM!

The nf-schema plugin update to version 2.5.1 is appropriate for the pipeline's validation requirements.

Comment on lines +524 to +525
nextflowVersion = '!>=25.04.0'
version = '1.7.0'

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check Nextflow releases for version 25.04.0
# First, let's try a direct web query approach
echo "Searching for Nextflow 25.04.0 release information..."

# Try to find Nextflow release information via curl if gh is not available
curl -s "https://api.github.com/repos/nextflow-io/nextflow/releases" | jq -r '.[] | select(.tag_name | contains("25.04")) | {tag_name, published_at, prerelease, draft}' | head -50

Repository: bigbio/quantms

Length of output: 186


🌐 Web query:

Nextflow version 25.04.0 release date and status

💡 Result:

Nextflow 25.04.0 — released May 8, 2025; stable release. [1][2]

Sources:
[1] Nextflow migration notes: "Nextflow 25.04 was released on May 8, 2025."
[2] Nextflow updating docs (release cadence / stable vs edge).


🏁 Script executed:

#!/bin/bash
# Check the codebase for README and CHANGELOG files
echo "=== Searching for README files ==="
find . -type f -iname "README*" | head -20

echo -e "\n=== Searching for CHANGELOG files ==="
find . -type f -iname "CHANGE*" | head -20

echo -e "\n=== Checking nextflow.config for version references ==="
cat -n nextflow.config | sed -n '524,530p'

Repository: bigbio/quantms

Length of output: 427


🏁 Script executed:

#!/bin/bash
# Search for any documentation of this version requirement
echo "=== Searching for references to 25.04.0 in the codebase ==="
rg "25\.04\.0" --type-list | head -5
rg "25\.04\.0" --max-count 20

Repository: bigbio/quantms

Length of output: 13857


🏁 Script executed:

#!/bin/bash
# Check CHANGELOG for documentation about the 25.04.0 version requirement
echo "=== Checking CHANGELOG.md for 25.04.0 documentation ==="
head -100 CHANGELOG.md

echo -e "\n=== Searching for 25.04.0 in CHANGELOG ==="
rg "25\.04\.0|nextflowVersion" CHANGELOG.md -A 2 -B 2 | head -30

Repository: bigbio/quantms

Length of output: 4905


Document Nextflow 25.04.0 requirement in CHANGELOG.

The nextflowVersion requirement of '!>=25.04.0' is valid: Nextflow 25.04.0 is a stable release (May 8, 2025). The requirement is already documented in the README badge showing "version ≥25.04.0". However, the CHANGELOG.md does not mention when or why this Nextflow version requirement was established for version 1.7.0. Consider adding an entry to document this dependency update for clarity.

🤖 Prompt for AI Agents
In nextflow.config around lines 524-525, the Nextflow engine requirement is set
to '!>=25.04.0' for version 1.7.0 but the CHANGELOG.md lacks an entry
documenting this dependency bump; add a short CHANGELOG entry under version
1.7.0 noting the Nextflow compatibility requirement (Nextflow ≥ 25.04.0, date
May 8, 2025) and a brief rationale (e.g., needed for new feature/fix or
compatibility), include the exact version string and date, and commit the
updated CHANGELOG.md.
