fix: workbench pipeline execution on Google Batch #1

Draft
samhornstein wants to merge 7 commits into master from fix/workbench-review-fixes
@samhornstein

Owner

Summary

  • Fix pipeline execution on Google Batch via Verily Workbench, resolving container image, resource allocation, and compatibility issues
  • Add pre-built GRCh38 bowtie2 host index to skip the ~50 min index build step
  • Add Workbench quick-start documentation

Changes

  • Resource config: Add bowtie2_index and bwa index process overrides (n2-highmem-8, 64 GB) to prevent OOM kills on full genome builds
  • Container image: Rename to flores-workbench, add fastp, bowtie2, make, bundle bin/ scripts, pin nextflow=24
  • MultiQC: Fix output naming for newer MultiQC versions (--outdir/--filename flags)
  • Bracken: Add targeted errorStrategy (ignore exit code 1 only) for empty taxonomic levels
  • Kraken: Add Domain (D) taxonomic level to kraken2_long_to_wide_update.py; use container-baked script paths consistently
  • Host index: Fix GCS glob handling by removing Paths.get() in fastq_host_removal.nf; add pre-built index to params_google_batch.config
  • Config hygiene: Gitignore workspace-specific env files (wb.env, gcp.env); add setup instructions to templates
  • Nextflow idioms: Replace ${threads} with ${task.cpus} throughout
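
For reviewers unfamiliar with the Kraken change, here is a minimal sketch of what including the Domain (D) rank in a long-to-wide conversion could look like. It is not the code in kraken2_long_to_wide_update.py, whose contents are not shown in this PR. It assumes the standard six-column, tab-separated Kraken2 report (percent, clade reads, direct reads, rank code, taxid, name). The names `RANK_CODES`, `parse_report`, and `long_to_wide` are hypothetical, and every rank in the set other than D is an assumption.

```python
import csv
import io
from collections import defaultdict

# Rank codes from column 4 of a Kraken2 report that are kept in the wide table.
# "D" (Domain) is the level this PR adds; the other members are assumed.
RANK_CODES = {"D", "K", "P", "C", "O", "F", "G", "S"}


def parse_report(text):
    """Return {(rank, name): clade_reads} for one Kraken2 report."""
    counts = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 6:
            continue  # skip blank or malformed lines
        rank = row[3].strip()
        name = row[5].strip()  # names are indented by taxonomic depth
        if rank in RANK_CODES:
            counts[(rank, name)] = int(row[1])
    return counts


def long_to_wide(reports):
    """Pivot {sample: report_text} into (samples, {taxon: [count per sample]})."""
    samples = sorted(reports)
    wide = defaultdict(lambda: [0] * len(samples))
    for i, sample in enumerate(samples):
        for taxon, n in parse_report(reports[sample]).items():
            wide[taxon][i] = n
    return samples, dict(wide)


# Tiny made-up reports, for illustration only.
report_a = (
    " 60.00\t60\t0\tD\t2\t  Bacteria\n"
    " 40.00\t40\t40\tU\t0\tunclassified\n"
    " 50.00\t50\t50\tS\t562\t        Escherichia coli\n"
)
report_b = "100.00\t10\t0\tD\t2157\t  Archaea\n"

samples, wide = long_to_wide({"A": report_a, "B": report_b})
```

In this sketch a taxon missing from a sample gets a zero count, so a Domain row that appears in only one sample still produces a full-width row.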

Test plan

  • Full end-to-end pipeline run on Google Batch (no -resume), 22 min, all 18 process types passed
  • Verified pre-built host index skips bowtie2_index step entirely
  • Container image must be rebuilt to include Domain D fix in kraken2_long_to_wide_update.py

🤖 Generated with Claude Code

samhornstein and others added 7 commits May 13, 2026 13:00
Fixes several issues preventing FloRes from running on Verily Workbench:
- wb/run.sh now passes -profile and -c flags to nextflow
- Adds params_google_batch.config with all gs:// paths for cloud execution
- Restores params.config to local-only defaults (no hardcoded google-batch executor)
- Moves process resource declarations into config/google_batch.config
- Parameterizes hardcoded bucket names with GCS_REF_BUCKET env var
…ecution

Apply learnings from AMR workbench conversion: add fastp/bowtie2/make to
container, pin nextflow=24, COPY bin/ to /opt/amrplusplus/bin, replace all
$baseDir/bin/ refs with container paths, and use ${task.cpus} instead of
${threads} in bwa/trimmomatic modules.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rkbench

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add resource overrides for bowtie2_index and bwa index processes
  (n2-highmem-8, 64GB) to prevent OOM kills on full genome builds
- Fix multiqc output naming for newer multiqc versions by adding
  --outdir and --filename flags
- Add errorStrategy 'ignore' to runbracken for empty taxonomic levels
- Add Domain ('D') taxonomic level to kraken2_long_to_wide_update.py
- Use Nextflow-uploaded bin/ scripts instead of container-baked paths
  in krakenresults process
- Fix GCS glob handling in host index loading by removing Paths.get()
- Add pre-built GRCh38 bowtie2 host index to skip 50-min build step

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use consistent container path for krakenresults script
  (/opt/amrplusplus/bin/ instead of $HOME/.nextflow-bin/)
- Gitignore wb.env and gcp.env (workspace-specific config)
- Add setup instructions to wb.env.template
- Add helpful error message when env file is missing
- Make runbracken errorStrategy targeted to exit code 1 only

Note: container image must be rebuilt to include the Domain 'D'
fix in kraken2_long_to_wide_update.py for krakenresults to work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@samhornstein marked this pull request as draft May 18, 2026 19:27