68 changes: 68 additions & 0 deletions modules/nf-core/exomiser/analyse/main.nf
@@ -0,0 +1,68 @@
process EXOMISER_ANALYSE {
tag "${meta.id}"
label 'process_medium'

container "nf-core/exomiser-cli:15.0.0-bash"

input:
tuple val(meta), path(vcf), path(ped), val(assembly), path(phenopacket), path(analysis_script)
tuple val(meta2), path(reference_cache, stageAs: 'exomiser_data/*'), val(reference_version)
tuple val(meta3), path(phenotype_cache, stageAs: 'exomiser_data/*'), val(phenotype_version)
Comment on lines +8 to +10
Member

This looks reasonable to me; I don't really have a better idea.

Comment on lines +9 to +10
Contributor

Would it make sense to have a separate module that takes care of properly loading this data? That is what we did for PCGR. It would be exomiser/getreference: it would download the data and create the required exomiser.data-directory=/data/exomiser-data, which could then simply be an input to this module.

Contributor Author

IMO the data is too big to be loaded on the fly. It comes to about 50 GB of reference data in total.

Contributor

Yes, that's why I would handle it separately; at least, that is how we do it with the VEP cache etc. Either we add the data to the VEP cache tooling @maxulysse built, or we create a module that a pipeline can use to load it (see PCGR in the variantprioritization pipeline). For testing, we then subsample the cache to chr22 etc. (I did that for PCGR as well).

Contributor

see #9295
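
For readers following this thread: both cache inputs use `stageAs: 'exomiser_data/*'`, so they end up side by side in one directory that the script then exports as `EXOMISER_DATA_DIRECTORY`. The sketch below imitates that layout in plain shell, assuming Nextflow's default symlink staging. `stage_caches` and the cache directory names are hypothetical and only serve as illustration.

```shell
#!/usr/bin/env bash
# Sketch only: stage_caches is a hypothetical helper, not part of the module.
# It imitates what `stageAs: 'exomiser_data/*'` does for each cache input:
# every cache keeps its own name and is linked under <workdir>/exomiser_data/.
stage_caches() {
    local workdir="$1"
    shift
    mkdir -p "$workdir/exomiser_data"
    local cache
    for cache in "$@"; do
        # Link by absolute path so the link resolves from any directory.
        ln -s "$(cd "$cache" && pwd)" "$workdir/exomiser_data/$(basename "$cache")"
    done
}
```

Calling `stage_caches work ref_cache pheno_cache` (placeholder names) would leave `work/exomiser_data/ref_cache` and `work/exomiser_data/pheno_cache` as symlinks, which is the single-directory shape that a dedicated reference module could also hand over as one input.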


output:
tuple val(meta), path("*.tsv"), emit: tsv
tuple val(meta), path("*.json"), emit: json
tuple val(meta), path("*.html"), emit: html
tuple val(meta), path("*.parquet"), emit: parquet
tuple val(meta), path("*.vcf"), emit: vcf
Contributor

I would say we need to bgzip this VCF before we output it.

Contributor Author

I agree. The thing is, I'm only just testing this tool and haven't had the opportunity to look into it further.
Chances are it's already compressed and I just missed it in the docs.

Contributor @famosab, Apr 2, 2026

The VCF file is tabix-indexed and exomiser ranked alleles can be extracted using grep

Yes, it's bgzipped (otherwise it cannot be indexed, afaik), so perfect.
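
One quick way to double-check the compression question raised in this thread is to look at the first two bytes of the output file. This is a minimal sketch, not something the module does: `is_gzip` is a hypothetical helper. It only tests for the gzip magic bytes `1f 8b`. bgzip output is a valid gzip stream, so it passes too, but the check cannot tell plain gzip from bgzip.

```shell
#!/usr/bin/env bash
# Sketch only: is_gzip is a hypothetical helper, not part of the module.
# Succeeds (exit 0) when the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    local magic
    # Read the first two bytes and print them as lowercase hex without spaces.
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}
```

For example, `is_gzip "$file" && echo compressed`, where `$file` stands for whatever VCF path Exomiser actually writes. If that file does turn out to be compressed, the `*.vcf` output glob and the stub's `touch ${prefix}.vcf` would presumably need to match the real file name.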

tuple val("${task.process}"), val('exomiser'), eval("exomiser --version"), topic: versions, emit: versions_exomiser

when:
task.ext.when == null || task.ext.when

script:
// Exit if running this module with -profile conda / -profile mamba
if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
error("EXOMISER_ANALYSE module does not support Conda. Please use Docker / Singularity / Podman instead.")
}
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"
def ped_cmd = ped ? "--ped=${ped}" : ""
def phenopacket_cmd = phenopacket ? "--sample=${phenopacket}" : ""
def assembly_cmd = assembly ? "--assembly=${assembly}" : ""
def analysis_cmd = analysis_script ? "--analysis ${analysis_script}" : ""
def vcf_cmd = vcf ? "--vcf=${vcf}" : ""

"""
export EXOMISER_DATA_DIRECTORY=./exomiser_data
export EXOMISER_${assembly}_DATA_VERSION=${reference_version}
export EXOMISER_PHENOTYPE_DATA_VERSION=${phenotype_version}

exomiser analyse \\
${ped_cmd} \\
${phenopacket_cmd} \\
${assembly_cmd} \\
${vcf_cmd} \\
${analysis_cmd} \\
${args} \\
--output-directory=\$PWD \\
--output-filename=${prefix}
"""

stub:
// Exit if running this module with -profile conda / -profile mamba
if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
error("EXOMISER_ANALYSE module does not support Conda. Please use Docker / Singularity / Podman instead.")
}
def args = task.ext.args ?: ''
def prefix = task.ext.prefix ?: "${meta.id}"

"""
echo ${args}
touch ${prefix}.tsv
touch ${prefix}.json
touch ${prefix}.html
touch ${prefix}.parquet
touch ${prefix}.vcf
"""
}
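
The `export` lines in the script block build one variable name from the `assembly` input. The sketch below reproduces that expansion in plain shell to show which names are actually set. It uses the placeholder values from the stub test (`GRCh38`, `1234`). In the module, Nextflow interpolates these values before bash runs, so the shell variables here are only stand-ins.

```shell
#!/usr/bin/env bash
# Sketch only: stand-ins for the values Nextflow would interpolate.
# "GRCh38" and "1234" are the placeholders used in the stub test.
assembly="GRCh38"
reference_version="1234"
phenotype_version="1234"

export EXOMISER_DATA_DIRECTORY=./exomiser_data
# The assembly string is embedded verbatim, so the name is mixed-case.
export "EXOMISER_${assembly}_DATA_VERSION=${reference_version}"
export EXOMISER_PHENOTYPE_DATA_VERSION="${phenotype_version}"

# List the variables that were exported, sorted by name.
env | grep '^EXOMISER_' | sort
```

With `assembly` set to `GRCh38`, the middle variable is named `EXOMISER_GRCh38_DATA_VERSION`. The diff does not show whether Exomiser reads a variable with that exact name, so it may be worth confirming against the Exomiser configuration documentation before relying on it.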
153 changes: 153 additions & 0 deletions modules/nf-core/exomiser/analyse/meta.yml
@@ -0,0 +1,153 @@
name: "exomiser_analyse"
description: Phenotype-driven variant prioritisation for rare Mendelian
disorders.
keywords:
- exomiser
- variant prioritisation
- rare disease
- Mendelian disorders
tools:
- "exomiser":
description: "A Tool to Annotate and Prioritize Exome Variants"
homepage: "https://exomiser.readthedocs.io/en/stable/"
documentation: "https://exomiser.readthedocs.io/en/stable/"
tool_dev_url: "https://github.com/exomiser/Exomiser"
doi: "10.1038/s41525-024-00456-2"
licence:
- "AGPL-3.0"
identifier: biotools:exomiser
input:
- - meta:
type: map
description: Groovy Map containing sample information. e.g. `[
id:'sample1' ]`
- vcf:
type: file
description: "VCF file containing variants to be analysed."
pattern: "*.vcf.gz"
ontologies:
- edam: http://edamontology.org/format_3989
- ped:
type: file
description: "PED file containing family information."
pattern: "*.ped"
ontologies: []
- assembly:
type: string
description: "Genome assembly to use. e.g. GRCh37, GRCh38"
- phenopacket:
type: file
description: "Phenopacket file containing phenotype information."
pattern: "*.{yml,yaml,json}"
ontologies:
- edam: http://edamontology.org/format_3750
- edam: http://edamontology.org/format_3464
- analysis_script:
type: file
description: "Custom analysis script for Exomiser analysis"
pattern: "*.{yml,yaml,json}"
ontologies:
- edam: http://edamontology.org/format_3750
- edam: http://edamontology.org/format_3464
- - meta2:
type: map
description: Groovy Map containing reference cache information. e.g. `[
id:'sample1' ]`
- reference_cache:
type: file
description: "Reference cache for Exomiser analysis"
pattern: "exomiser_data/*"
ontologies: []
- reference_version:
type: string
description: "Reference version for Exomiser analysis"
- - meta3:
type: map
description: Groovy Map containing phenotype cache information. e.g. `[
id:'sample1' ]`
- phenotype_cache:
type: file
description: "Phenotype cache for Exomiser analysis"
pattern: "exomiser_data/*"
ontologies: []
- phenotype_version:
type: string
description: "Phenotype version for Exomiser analysis"
output:
tsv:
- - meta:
type: map
description: Groovy Map containing sample information. e.g. `[
id:'sample1' ]`
- "*.tsv":
type: file
description: "TSV file containing prioritized variants."
pattern: "*.tsv"
ontologies:
- edam: http://edamontology.org/format_3475
json:
- - meta:
type: map
description: Groovy Map containing sample information. e.g. `[
id:'sample1' ]`
- "*.json":
type: file
description: "JSON file containing prioritized variants."
pattern: "*.json"
ontologies:
- edam: http://edamontology.org/format_3464
html:
- - meta:
type: map
description: Groovy Map containing sample information. e.g. `[
id:'sample1' ]`
- "*.html":
type: file
description: "HTML file containing prioritized variants."
pattern: "*.html"
ontologies: []
parquet:
- - meta:
type: map
description: Groovy Map containing sample information. e.g. `[
id:'sample1' ]`
- "*.parquet":
type: file
description: "Parquet file containing prioritized variants."
pattern: "*.parquet"
ontologies: []
vcf:
- - meta:
type: map
description: Groovy Map containing sample information. e.g. `[
id:'sample1' ]`
- "*.vcf":
type: file
description: "VCF file containing prioritized variants."
pattern: "*.vcf"
ontologies: []
versions_exomiser:
- - ${task.process}:
type: string
description: The name of the process
- exomiser:
type: string
description: The name of the tool
- exomiser --version:
type: eval
description: The expression to obtain the version of the tool
topics:
versions:
- - ${task.process}:
type: string
description: The name of the process
- exomiser:
type: string
description: The name of the tool
- exomiser --version:
type: eval
description: The expression to obtain the version of the tool
authors:
- "@matthdsm"
maintainers:
- "@matthdsm"
48 changes: 48 additions & 0 deletions modules/nf-core/exomiser/analyse/tests/main.nf.test
@@ -0,0 +1,48 @@
nextflow_process {

name "Test Process EXOMISER_ANALYSE"
script "../main.nf"
process "EXOMISER_ANALYSE"

tag "modules"
tag "modules_nfcore"
tag "exomiser"
tag "exomiser/analyse"

test("homo_sapiens - vcf - stub") {

options "-stub"

when {
process {
"""
input[0] = [
[ id:'test' ],
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/vcf/test.vcf.gz'),
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/ped/test.ped'),
"GRCh38",
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/phenopacket/test.yml'),
[]
]
input[1] = [
[ id:'test' ],
file("s3://nf-core-reference-data/exomiser/GRCh38/1234/reference_cache"),
"1234"
]
input[2] = [
[ id:'test' ],
file("s3://nf-core-reference-data/exomiser/GRCh38/1234/phenotype_cache"),
"1234"
]
"""
}
}

then {
assertAll (
{ assert process.success },
{ assert snapshot(process.out.findAll { key, val -> key.startsWith("versions") }).match() }
)
}
}
}
16 changes: 16 additions & 0 deletions modules/nf-core/exomiser/analyse/tests/main.nf.test.snap
@@ -0,0 +1,16 @@
{
"homo_sapiens - vcf - stub": {
"content": [
{
"versions_exomiser": [

]
}
],
"meta": {
"nf-test": "0.9.3",
"nextflow": "25.10.4"
},
"timestamp": "2026-03-23T19:09:41.158068"
}
}