Description
Summary
An automated scan of all 1767 nf-core modules found 39 modules where the directory name (`modules/nf-core/<tool>/<subtool>/`) does not match the command actually invoked in the `script:` block of `main.nf`. Per nf-core convention, modules should follow `tool/subtool` naming that matches the CLI tool being called.
This issue was prompted by the discussion in a comment on #10959 about `mcstaging/*` modules being grouped by purpose rather than by tool name.
Each case below needs human review to determine the appropriate action (rename, restructure, or document as acceptable exception).
Mismatches by Category
🔬 Imaging / mcmicro modules (grouped by purpose instead of tool)
These modules are grouped under umbrella directories by function rather than by tool name:
| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
| | `mcstaging/imc2mc` | `mcstaging` | `python /imc2mc/scripts/imc2mc.py` | Tool is imc2mc, grouped under mcstaging |
| | `mcstaging/macsima2mc` | `mcstaging` | `python /staging/macsima2mc/macsima2mc.py` | Tool is macsima2mc |
| | `mcstaging/phenoimager2mc` | `mcstaging` | `python /phenoimager2mc/scripts/phenoimager2mc.py` | Tool is phenoimager2mc |
| | `vizgenpostprocessing/compiletilesegmentation` | `vizgenpostprocessing` | `vpt` | Tool is vpt (Vizgen Postprocessing Tool) |
| | `vizgenpostprocessing/preparesegmentation` | `vizgenpostprocessing` | `vpt` | Tool is vpt |
| | `vizgenpostprocessing/runsegmentationontile` | `vizgenpostprocessing` | `vpt` | Tool is vpt |
| | `coreograph` | `coreograph` | `python /app/UNetCoreograph.py` | Script is UNetCoreograph.py |
| | `basicpy` | `basicpy` | `/opt/main.py` | Container-embedded script, no basicpy CLI binary (borderline) |
🧬 Tool suite vs. individual tool binary
The directory uses a suite/package name but the script calls a differently-named binary from that suite:
| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
| | `fibertoolsrs/addnucleosomes` | `fibertoolsrs` | `ft` | fibertools-rs installs as ft |
| | `fibertoolsrs/extract` | `fibertoolsrs` | `ft` | Same |
| | `fibertoolsrs/predictm6a` | `fibertoolsrs` | `ft` | Same |
| | `gstama/collapse` | `gstama` | `tama_collapse.py` | Conda pkg is gs-tama, tool is tama |
| | `gstama/merge` | `gstama` | `tama_merge.py` | Same |
| | `gstama/polyacleanup` | `gstama` | `tama_flnc_polya_cleanup.py` | Same |
| | `cmseq/polymut` | `cmseq` | `polymut.py` | cmseq is the suite, polymut is the specific tool |
| | `vrhyme/extractunbinned` | `vrhyme` | `extract_unbinned_sequences.py` | Helper script in vRhyme, not vRhyme itself |
| | `vrhyme/linkbins` | `vrhyme` | `link_bin_sequences.py` | Same |
| | `fcs/fcsadaptor` | `fcs` | `av_screen_x` | Internal NCBI FCS-adaptor binary name |
📦 Conda package name vs. CLI binary name
The directory uses the Conda package name, but the installed binary has a different name:
| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
| | `dragmap/align` | `dragmap` | `dragen-os` | DragMap package installs dragen-os binary |
| | `dragmap/hashtable` | `dragmap` | `dragen-os` | Same |
| | `pbccs` | `pbccs` | `ccs` | PacBio Conda pkg pbccs, binary is ccs |
| | `pbjasmine` | `pbjasmine` | `jasmine` | Conda pkg pbjasmine, binary is jasmine |
| | `biohansel` | `biohansel` | `hansel` | Conda pkg bio_hansel, binary is hansel |
| | `kofamscan` | `kofamscan` | `exec_annotation` | kofamscan installs exec_annotation as its CLI |
| | `variantbam` | `variantbam` | `variant` | VariantBam binary is just variant |
🔧 Wrapper/parallel tools with different names
| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
| | `ltrfinder` | `ltrfinder` | `LTR_FINDER_parallel` | Parallel wrapper, not ltr_finder itself |
| | `ltrharvest` | `ltrharvest` | `LTR_HARVEST_parallel` | Parallel wrapper (note: `gt/ltrharvest` correctly calls `gt ltrharvest`) |
🏷️ Functional/descriptive names instead of tool names
| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
| | `unzip` | `unzip` | `7za` | Calls p7zip, not unzip |
| | `unzipfiles` | `unzipfiles` | `7za` | Same |
| | `zip` | `zip` | `7z` | Calls p7zip, not zip |
| | `shasum` | `shasum` | `sha256sum` | macOS shasum vs GNU sha256sum |
| | `amps` | `amps` | `postprocessing.AMPS.r` | Actually calls HOPS package script |
| | `multiqcsav` | `multiqcsav` | `multiqc` | Calls multiqc with SAV plugin, not a standalone tool |
🔀 Subcommand under wrong parent
| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
| | `bcftools/rohviz` | `bcftools` | `roh-viz` | Standalone tool, not a bcftools subcommand |
| | `genotyphi/parse` | `genotyphi` | `parse_typhi_mykrobe.py` | Script name doesn't match directory |
| | `gens/preparecovandbaf` | `gens` | `generate_cov_and_baf` | CLI entry point doesn't match directory |
| | `macsyfinder/download` | `macsyfinder` | `msf_data install` | Separate database management tool |
| | `checkm2/databasedownload` | `checkm2` | `aria2c` | Just a download; checkm2 never called |
🌐 Platform/runtime tools
| Reviewed? | Module path | Dir tool name | Actual command | Notes |
|---|---|---|---|---|
| | `deeptmhmm` | `deeptmhmm` | `biolib run DTU/DeepTMHMM` | Runs via BioLib platform CLI |
Methodology
- Automated scan of all 1767 module `main.nf` files
- Compared the directory tool name against the primary command in the `script:` block
- Excluded shell builtins and common utilities (`cat`, `echo`, `mkdir`, etc.) and version-reporting blocks
- Minor variations (case, hyphens, version suffixes like `gatk4` → `gatk`) were not flagged
- Each finding needs human review; some may be acceptable exceptions
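For anyone wanting to reproduce or extend the scan, the steps above can be sketched roughly as follows. This is not the script that produced the tables; it is a minimal approximation. The `IGNORED` list, the regex used to find the `script:` block, and all function names are assumptions made for illustration. It only looks at the first command token, so multi-word commands such as `msf_data install` would be reported as `msf_data`.

```python
import re
from pathlib import Path
from typing import Optional

# Commands skipped when looking for the primary tool.
# Illustrative list only; the original scan's exclusions are not published here.
IGNORED = {"cat", "echo", "mkdir", "cd", "touch", "mv", "cp", "rm", "ln", "export"}


def normalize(name: str) -> str:
    """Lowercase, drop non-alphanumerics, strip a trailing version number (gatk4 -> gatk)."""
    name = re.sub(r"[^a-z0-9]", "", name.lower())
    return re.sub(r"\d+$", "", name) or name


def script_block(main_nf: str) -> str:
    """Return the body of the first triple-quoted string after 'script:' (empty if none)."""
    match = re.search(r'script:.*?"""(.*?)"""', main_nf, flags=re.DOTALL)
    return match.group(1) if match else ""


def primary_command(script: str) -> Optional[str]:
    """Return the first command token that is not ignored, or None."""
    for raw in script.splitlines():
        line = raw.strip()
        if "END_VERSIONS" in line:  # stop at the version-reporting heredoc
            break
        if not line or line.startswith("#"):
            continue
        word = line.split()[0]
        if "=" in word or word.startswith("$"):  # variable assignment or interpolation
            continue
        name = word.rsplit("/", 1)[-1]  # keep only the basename of path-like commands
        if name in IGNORED:
            continue
        return name
    return None


def find_mismatches(modules_root: Path) -> list:
    """List (module path, directory tool name, command) where the names differ."""
    findings = []
    for main_nf in sorted(modules_root.rglob("main.nf")):
        module = main_nf.parent.relative_to(modules_root)
        if not module.parts:
            continue
        tool = module.parts[0]
        command = primary_command(script_block(main_nf.read_text()))
        if command and normalize(command) != normalize(tool):
            findings.append((module.as_posix(), tool, command))
    return findings
```

Run against a checkout with `find_mismatches(Path("modules/nf-core"))`. A heuristic like this will produce false positives (for example interpreter calls such as `python script.py`), which is why each finding still needs human review.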
Suggested Actions
For each module, the community should decide:
- Rename the module directory to match the actual tool (preferred for most cases)
- Accept as exception with documentation (e.g., functional modules like `unzip`/`zip`)
- Move to pipeline-local if the module is too pipeline-specific for the shared repository
Context
- PR discussion: mcstaging/imc2mc topic update #10959
- Convention: modules should be `tool/subtool` matching the CLI call
- Related: imaging community modules grouped by purpose rather than tool name