Audit: Module directory names that don't match the tool invoked in script block #11073

@FriederikeHanssen

Summary

An automated scan of all 1767 nf-core modules found 39 modules where the directory name (modules/nf-core/<tool>/<subtool>/) does not match the actual command invoked in the script: block of main.nf. Per nf-core convention, modules should follow tool/subtool naming that matches the CLI tool being called.

This issue was prompted by the discussion in #10959 (comment) about mcstaging/* modules being grouped by purpose rather than by tool name.

Each case below needs human review to determine the appropriate action (rename, restructure, or document as acceptable exception).


Mismatches by Category

🔬 Imaging / mcmicro modules (grouped by purpose instead of tool)

These modules are grouped under umbrella directories by function rather than by tool name:

| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
|  | mcstaging/imc2mc | mcstaging | python /imc2mc/scripts/imc2mc.py | Tool is imc2mc, grouped under mcstaging |
|  | mcstaging/macsima2mc | mcstaging | python /staging/macsima2mc/macsima2mc.py | Tool is macsima2mc |
|  | mcstaging/phenoimager2mc | mcstaging | python /phenoimager2mc/scripts/phenoimager2mc.py | Tool is phenoimager2mc |
|  | vizgenpostprocessing/compiletilesegmentation | vizgenpostprocessing | vpt | Tool is vpt (Vizgen Postprocessing Tool) |
|  | vizgenpostprocessing/preparesegmentation | vizgenpostprocessing | vpt | Tool is vpt |
|  | vizgenpostprocessing/runsegmentationontile | vizgenpostprocessing | vpt | Tool is vpt |
|  | coreograph | coreograph | python /app/UNetCoreograph.py | Script is UNetCoreograph.py |
|  | basicpy | basicpy | /opt/main.py | Container-embedded script, no basicpy CLI binary (borderline) |

🧬 Tool suite vs. individual tool binary

The directory uses a suite/package name but the script calls a differently-named binary from that suite:

| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
|  | fibertoolsrs/addnucleosomes | fibertoolsrs | ft | fibertools-rs installs as ft |
|  | fibertoolsrs/extract | fibertoolsrs | ft | Same |
|  | fibertoolsrs/predictm6a | fibertoolsrs | ft | Same |
|  | gstama/collapse | gstama | tama_collapse.py | Conda pkg is gs-tama, tool is tama |
|  | gstama/merge | gstama | tama_merge.py | Same |
|  | gstama/polyacleanup | gstama | tama_flnc_polya_cleanup.py | Same |
|  | cmseq/polymut | cmseq | polymut.py | cmseq is the suite, polymut is the specific tool |
|  | vrhyme/extractunbinned | vrhyme | extract_unbinned_sequences.py | Helper script in vRhyme, not vRhyme itself |
|  | vrhyme/linkbins | vrhyme | link_bin_sequences.py | Same |
|  | fcs/fcsadaptor | fcs | av_screen_x | Internal NCBI FCS-adaptor binary name |

📦 Conda package name vs. CLI binary name

The directory uses the Conda package name, but the installed binary has a different name:

| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
|  | dragmap/align | dragmap | dragen-os | DragMap package installs dragen-os binary |
|  | dragmap/hashtable | dragmap | dragen-os | Same |
|  | pbccs | pbccs | ccs | PacBio Conda pkg pbccs, binary is ccs |
|  | pbjasmine | pbjasmine | jasmine | Conda pkg pbjasmine, binary is jasmine |
|  | biohansel | biohansel | hansel | Conda pkg bio_hansel, binary is hansel |
|  | kofamscan | kofamscan | exec_annotation | kofamscan installs exec_annotation as its CLI |
|  | variantbam | variantbam | variant | VariantBam binary is just variant |

🔧 Wrapper/parallel tools with different names

| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
|  | ltrfinder | ltrfinder | LTR_FINDER_parallel | Parallel wrapper, not ltr_finder itself |
|  | ltrharvest | ltrharvest | LTR_HARVEST_parallel | Parallel wrapper (note: gt/ltrharvest correctly calls gt ltrharvest) |

🏷️ Functional/descriptive names instead of tool names

| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
|  | unzip | unzip | 7za | Calls p7zip, not unzip |
|  | unzipfiles | unzipfiles | 7za | Same |
|  | zip | zip | 7z | Calls p7zip, not zip |
|  | shasum | shasum | sha256sum | macOS shasum vs GNU sha256sum |
|  | amps | amps | postprocessing.AMPS.r | Actually calls HOPS package script |
|  | multiqcsav | multiqcsav | multiqc | Calls multiqc with SAV plugin, not a standalone tool |

🔀 Subcommand under wrong parent

| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
|  | bcftools/rohviz | bcftools | roh-viz | Standalone tool, not a bcftools subcommand |
|  | genotyphi/parse | genotyphi | parse_typhi_mykrobe.py | Script name doesn't match directory |
|  | gens/preparecovandbaf | gens | generate_cov_and_baf | CLI entry point doesn't match directory |
|  | macsyfinder/download | macsyfinder | msf_data install | Separate database management tool |
|  | checkm2/databasedownload | checkm2 | aria2c | Just a download; checkm2 never called |

🌐 Platform/runtime tools

| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
|  | deeptmhmm | deeptmhmm | biolib run DTU/DeepTMHMM | Runs via BioLib platform CLI |

Methodology

  • Automated scan of all 1767 module main.nf files
  • Compared directory tool name against the primary command in the script: block
  • Excluded shell builtins (cat, echo, mkdir, etc.) and version-reporting blocks
  • Minor variations (case, hyphens, version suffixes such as gatk4 vs. gatk) were not flagged
  • Each finding needs human review — some may be acceptable exceptions
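
The scan script itself is not part of this issue. As a rough illustration of the comparison described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the helper names (`normalize`, `primary_command`, `is_mismatch`), the builtin list, the regular expression used to find the `script:` block, and the invented example module body, which is not the real fibertoolsrs/extract module.

```python
import re

# Assumption: commands to skip; the issue only names "cat, echo, mkdir, etc."
SHELL_BUILTINS = {"cat", "echo", "mkdir", "cd", "mv", "cp", "rm", "touch", "export", "set"}

# Assumption: the script body is the first triple-double-quoted string after "script:".
SCRIPT_BLOCK = re.compile(r'script:.*?"""(.*?)"""', re.DOTALL)


def normalize(name: str) -> str:
    """Ignore case, hyphens/underscores, and trailing version digits (gatk4 vs. gatk)."""
    cleaned = re.sub(r"[^a-z0-9]", "", name.lower())
    return cleaned.rstrip("0123456789")


def primary_command(main_nf_text: str):
    """Return the first token of the first non-builtin line in the script block."""
    match = SCRIPT_BLOCK.search(main_nf_text)
    if match is None:
        return None
    for raw_line in match.group(1).splitlines():
        line = raw_line.strip()
        if "versions.yml" in line:  # stop at the version-reporting block
            break
        if not line or line.startswith("#"):
            continue
        token = line.split()[0]
        if token in SHELL_BUILTINS:
            continue
        return token
    return None


def is_mismatch(module_path: str, main_nf_text: str) -> bool:
    """Compare the top-level directory name with the primary command."""
    tool = module_path.split("/")[0]
    command = primary_command(main_nf_text)
    if command is None:
        return False
    return normalize(tool) != normalize(command)


# Invented module body for illustration only.
EXAMPLE_MAIN_NF = '''
process FIBERTOOLSRS_EXTRACT {
    script:
    """
    mkdir -p out
    ft extract $args input.bam
    cat <<-END_VERSIONS > versions.yml
    END_VERSIONS
    """
}
'''

if __name__ == "__main__":
    print(primary_command(EXAMPLE_MAIN_NF))
    print(is_mismatch("fibertoolsrs/extract", EXAMPLE_MAIN_NF))
```

A first-token heuristic like this reports `python` for interpreter-launched scripts and cannot tell a suite name from a binary name, which is one reason every finding still needs human review.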

Suggested Actions

For each module, the community should decide:

  1. Rename the module directory to match the actual tool (preferred for most cases)
  2. Accept as exception with documentation (e.g., functional modules like unzip/zip)
  3. Move to pipeline-local if the module is too pipeline-specific for the shared repository

Context

  • PR discussion: mcstaging/imc2mc topic update #10959
  • Convention: modules should be tool/subtool matching the CLI call
  • Related: imaging community modules grouped by purpose rather than tool name
