Releases: exomiser/Exomiser

Bigger on the inside

28 Feb 00:23

This release is big. You just won't believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it's
a long way down the road to the chemist's, but that's just peanuts to this.

-- Douglas Adams

Apologies for misquoting Douglas Adams, but this is a big release. For starters, we've fittingly managed to
coincide with Rare Disease Day this year, which is a first.

This release has touched every part of the Exomiser CLI and core libraries. The changes should be apparent: the CLI has subtly changed and improved, the output files have either been replaced or enhanced, and a new Parquet format has been added. The documentation has been reworked to improve the user installation experience. The analysis scripts and CLI presets have been updated to improve their performance in various scenarios. The biggest change of all is probably the logistic regression model, which has been updated to take into account the automated ACMG assignments and re-trained using the solved cases from the UK's 100,000 Genomes Project.

Given all this change, we urge you to review the changelog below and the documentation, and to open an issue if you have any questions or problems. Existing pipelines will need minor changes to use this release, but the effort should be worth the gains.

CLI Changes

  • Minimum Java version is now Java 21

  • The CLI is now handled by picocli and has new analyse and batch commands.

    • The analyse command works with the same options as before, but will fail before loading resources if no samples have been provided in the command input.
    • The batch command replaces the --batch option. It now has a --dry-run option to check the input commands and samples before running, and will write out an error file.

    Run exomiser --help for details or see the docs about how to migrate your scripts. However, the snippet below should be enough to get you started:

    # Running the `analyse` command:
    ## Exomiser < 15.0.0
    java -jar exomiser-cli-14.1.0.jar --analysis examples/exome-analysis.yml --output-directory exomiser-results/exome-analysis --output-format HTML
    ## Exomiser 15.0.0
    java -jar exomiser-cli-15.0.0.jar analyse --analysis examples/exome-analysis.yml --output-directory exomiser-results/exome-analysis --output-format HTML
    
    # Running the `batch` command:
    ## Exomiser < 15.0.0
    java -jar exomiser-cli-14.1.0.jar --batch examples/test-analysis-batch-commands.txt
    ## Exomiser 15.0.0
    java -jar exomiser-cli-15.0.0.jar batch examples/test-analysis-batch-commands.txt
  • Updated the logistic regression model to take the ACMG assignment data into account, which improves the accuracy of the results. !!! WARNING - THIS SIGNIFICANTLY CHANGES THE EXOMISER COMBINED SCORES, SO IF YOU USE ANY CUTOFFS TO FILTER YOUR RESULTS IN YOUR PIPELINE, YOU WILL NEED TO RE-CALIBRATE THEM !!!

  • New alleleBalanceFilter: {} analysis step to filter variants based on allele balance (see docs for details).

  • Updated examples/preset-exome-analysis.yml and examples/preset-genome-analysis.yml to use new defaults. UPDATE YOUR SCRIPTS TO USE THESE FOR IMPROVED ACCURACY.

  • Added examples/preset-exome-analysis-human-only.yml

  • Added examples/preset-exome-analysis-with-introns.yml

  • Added examples/preset-phenotype-only-analysis.yml

  • New PARQUET output file format. This is a much more efficient format for storing results. It is an amalgamation of the TSV_VARIANT and TSV_GENE data with added fields and should be considered as a replacement for the JSON output.

  • JSON output has been replaced with JSONL output which is a line-delimited JSON format (https://jsonlines.org/). Note that the file suffix is now .jsonl rather than .json.

  • New HTML output format. This is a much more compact and readable format for displaying results.

  • Fix for issue #621 in VCF output where ACMG categories were being concatenated with a comma (,), which broke parsers. The categories are now joined with & instead.

  • Removed use of the BS4 category in ACMG assignments as this was being applied too stringently, leading to lost diagnoses in the DDD cohort.

  • Fixed PM4 assignment to include disruptive_inframe_deletion/insertion variants

  • Updated Exomiser CLI startup configuration to not write the results directory to the installation directory by default.
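
Since each line of a .jsonl file is a complete JSON document, the new JSONL output can be streamed one record at a time rather than loaded whole. The following is a minimal Python sketch that assumes only the JSON Lines convention. The `iter_jsonl` helper and the `geneSymbol` key in the demo data are hypothetical; check the documentation for the actual field names.

```python
import io
import json
from typing import Any, Iterable, Iterator


def iter_jsonl(lines: Iterable[str]) -> Iterator[Any]:
    """Yield one parsed JSON value per non-blank line."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, e.g. a trailing newline
        try:
            yield json.loads(line)
        except json.JSONDecodeError as err:
            raise ValueError(f"invalid JSON on line {line_number}: {err}") from err


if __name__ == "__main__":
    # Hypothetical records standing in for a real results file.
    demo = io.StringIO('{"geneSymbol": "GENE1"}\n{"geneSymbol": "GENE2"}\n')
    for record in iter_jsonl(demo):
        print(record["geneSymbol"])
```

For a real file, pass an open file handle (for example `open(path, encoding="utf-8")`) instead of the in-memory demo.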

Under the hood changes

New Java record classes have been added to the core module to represent the immutable data structures used in the analysis.
These have led to a much less 'getty' API as the traditional Java bean conventions have been replaced with a terser API.

Data Release

This update also includes a new 2512 data release. See the data-release discussions for links.

Full Changelog: 14.1.0...15.0.0

14.1.0 Nice and Splicy

10 Dec 16:09

The main changes in this release are further updates to the ACMG assignment categories, including the addition of the PS1, PM1, PM5, BS1 and BS2 categories. This release also implements the ClinGen recommendations for splicing variants.

Under the hood changes

  • Deprecate out of date Acmg2015Classifier and Acgs2020Classifier
  • Update JannovarSmallVariantAnnotator to remove MNV annotations from effects as these were overriding more damaging
    functional effects such as STOP_LOSS, STOP_GAIN, SPLICE_DONOR, SPLICE_ACCEPTOR which prevented potential assignment of
    PVS1.
  • Update Acmg2015EvidenceAssigner to include BS1, BS2 assignments.
  • Refactor Acmg2015EvidenceAssigner missense assignment methods into new AcmgMissenseInFrameIndelEvidenceAssigner class.
  • Add PP2/BP1 assignments to AcmgMissenseInFrameIndelEvidenceAssigner using GeneStatistics
  • Update ClinVarDao with new getGeneStatistics() method.
  • Add new GeneStatistics class for handling aggregated ClinVar gene-level variant effect counts.
  • Add new AcmgEvidence.parseAcmgEvidence() method.
  • Changes to enable SpliceAI PP3 and other splicing-related ACMG assignments.
  • Add new AcmgPVS1EvidenceAssigner class to assign PVS1 to loss of function variants
  • Add new AcmgMissenseInFrameIndelEvidenceAssigner class to assign PS1, PM1, PM5, PP2, BP1, PP3, BP4 to missense and
    inframe indels
  • Add new AcmgSpliceEvidenceAssigner class to assign PS1, PP3, BP4, BP7 to splice region variants
  • Add new AcmgEvidence.Builder.containsWithEvidence method
  • Add @nullable to PathogenicityData.pathogenicityScore method

This update also includes a new 2410 data release. See the data-release discussions for links.

Full Changelog: 14.0.2...14.1.0

Fix for issue #571

20 Sep 14:23

Fix for issue #571. This is a bug-fix release to prevent erroneous assignment of PVS1 to recessive-compatible variants in LOF-tolerant genes.

We strongly recommend updating to this release if you rely on the ACMG assignments from Exomiser.

Fix for Issue #565

03 Sep 14:02

This is a patch release to prevent a possible ArrayIndexOutOfBoundsException being thrown when outputting the variants TSV file. There are no other changes.

New Java version, new database format, smaller data downloads, more ACMG categories, better reporting...

29 Feb 14:00

  • Minimum Java version is now Java 17
  • Updated database format. REQUIRES DATABASE VERSION 2406. These databases are significantly smaller than the previous versions (~50-60% of the previous size). See the GitHub discussions section for details.
  • Added new GeneBlacklistFilter #457
  • Add new ClinVar conflicting evidence counts in HTML output #535
  • Added PS1, PM1, PM5 categories to ACMG assignments
  • Altered reporting of InheritanceModeFilter to state that the number shown refers to variants rather than genes.
  • Updated gene constraints to use gnomAD v4.0 data.
  • TSV genes, TSV variants and VCF outputs are now each written to a single file in which the possible modes of inheritance are shown together rather than split across separate files.
  • Fix for issue #531 where the priorityScoreFilter and regulatoryFeatureFilter pass/fail counts were not displayed in the HTML.
  • Fix for issue #534 where variant frequency and/or pathogenicity annotations are missing in certain run configurations.
  • Fix for issue #541 where logging to /tmp/spring.log causes clashes in shared user environments.
  • TSV output column CLINVAR_ALLELE_ID has been changed to CLINVAR_VARIANT_ID to allow easier reference to ClinVar variants.
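
Scripts that read the variants TSV by column name will need to account for the renamed ClinVar column. The following is a minimal Python sketch that accepts files from either side of the change and labels which kind of identifier each row carries, since a ClinVar allele ID and a variant ID should not be treated as interchangeable. The `iter_clinvar_ids` helper and the `GENE` column in the test data are hypothetical; only the two ClinVar column names come from the note above.

```python
import csv
import io
from typing import Iterator, TextIO, Tuple

# Column names taken from the release note above.
COLUMN_KINDS = {
    "CLINVAR_VARIANT_ID": "variant",  # 14.0.0 and later
    "CLINVAR_ALLELE_ID": "allele",    # earlier releases
}


def iter_clinvar_ids(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (kind, id) for each row that has a non-empty ClinVar ID."""
    reader = csv.DictReader(handle, delimiter="\t")
    fieldnames = reader.fieldnames or []
    column = next((name for name in COLUMN_KINDS if name in fieldnames), None)
    if column is None:
        raise KeyError("no ClinVar ID column found in TSV header")
    for row in reader:
        value = (row[column] or "").strip()
        if value:
            yield COLUMN_KINDS[column], value


if __name__ == "__main__":
    new_style = io.StringIO("GENE\tCLINVAR_VARIANT_ID\nGENE1\t12345\nGENE2\t\n")
    print(list(iter_clinvar_ids(new_style)))
```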

Full Changelog: 13.3.0...14.0.0

MT codon tables and Bayesian ACMG

26 Oct 10:30

  • Updated Jannovar version to 0.41 to fix incorrect MT codon table usage #521
  • Downgraded PM2 to PM2_Supporting for variants lacking frequency information #502.
  • Updated Acgs2020Classifier and Acmg2015Classifier to allow for PVS1 and PM2_Supporting to be sufficient to trigger LIKELY_PATHOGENIC
  • Updated AcmgEvidence to fit a Bayesian points-based system #514
  • Removed ASJ, FIN, OTH ExAC and gnomAD populations from presets and examples #513.
  • Fix for regression causing <INV> variants to be incorrectly down-ranked
  • Fix for issue #486 where VCF output includes whitespace in INFO field.
  • Logs will now display elapsed time correctly if an analysis runs over an hour (!).
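
To illustrate what a Bayesian points-based system looks like, here is a minimal Python sketch using the point values and thresholds published by Tavtigian et al. (2020). This is an assumption about the general scheme, not a copy of Exomiser's implementation, which may differ in detail. Stand-alone rules such as BA1 are not modelled, and the function names are hypothetical. Under these thresholds PVS1 (very strong, 8 points) plus PM2_Supporting (1 point) gives 9 points, which falls in the likely pathogenic range.

```python
from typing import Iterable

# Points per evidence strength (Tavtigian et al. 2020); benign evidence
# counts negatively. Assumed here, not read from the Exomiser source.
POINTS = {"SUPPORTING": 1, "MODERATE": 2, "STRONG": 4, "VERY_STRONG": 8}


def total_points(pathogenic: Iterable[str], benign: Iterable[str]) -> int:
    """Sum pathogenic evidence strengths and subtract benign ones."""
    return sum(POINTS[s] for s in pathogenic) - sum(POINTS[s] for s in benign)


def classify(points: int) -> str:
    """Map a points total onto the five ACMG classification tiers."""
    if points >= 10:
        return "PATHOGENIC"
    if points >= 6:
        return "LIKELY_PATHOGENIC"
    if points >= 0:
        return "UNCERTAIN_SIGNIFICANCE"
    if points >= -6:
        return "LIKELY_BENIGN"
    return "BENIGN"


if __name__ == "__main__":
    # PVS1 (very strong) + PM2_Supporting (supporting)
    points = total_points(["VERY_STRONG", "SUPPORTING"], [])
    print(points, classify(points))
```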

Full Changelog: 13.2.1...13.3.0

SV `<INS>` bugfix

07 Jul 09:02

This is a bugfix release to address the blanket scoring of <INS> variants with a variant score of 1.0. The fix should increase the accuracy of SV call prioritisation.

  • Fix for bug where all <INS> structural variants were given a maximal variant score of 1.0 regardless of their position on a transcript.
  • Added partial implementation of SVanna scoring for coding and splice site symbolic variants.
  • Fix for issue #481 where TSV and VCF results files would contain no data when the analysis inheritanceModes was empty.

IMPORTANT! This will be the last major release to run on Java 11. Subsequent major releases (i.e. 14+) will require Java 17.

Sometimes it's the little things...

28 Feb 18:54

This release adds a couple of minor quality of life features to the CLI and fixes a few bugs.

  • New multi-architecture docker images with and without bash #471. These images can be found on https://hub.docker.com/repositories/exomiser
  • Deprecated the output-prefix CLI option (will be removed in the next major version) #469
  • Added output-directory and output-filename CLI options to replace output-prefix #469
  • Added output-format CLI option #471
  • Fixed excessive CPU usage and application hang after variant prioritisation with large number of results #479
  • Fixed issue #478 where gene.tsv output files are empty when running a phenotype only prioritisation.
  • Fixed broken links to OMIM in the phenotypic similarity section of the HTML output #465
  • Added gene symbol as HTML id tag in gene panel HTML results #422
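
Pipelines that build an output-prefix value can be moved to the two replacement options by splitting the prefix into its directory and file name parts. The following is a rough Python sketch, assuming the old prefix had the form directory/basename and that the options are spelled `--output-directory` and `--output-filename` on the command line. The `split_output_prefix` helper is hypothetical, and prefixes ending in a path separator are not handled; see the CLI help for the exact semantics.

```python
from pathlib import PurePosixPath
from typing import List


def split_output_prefix(prefix: str) -> List[str]:
    """Turn an old-style output prefix into the replacement CLI arguments."""
    path = PurePosixPath(prefix)
    args: List[str] = []
    # A bare file name has "." as its parent, so no directory is emitted.
    if str(path.parent) != ".":
        args += ["--output-directory", str(path.parent)]
    args += ["--output-filename", path.name]
    return args


if __name__ == "__main__":
    print(split_output_prefix("results/sample1"))
```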

Automated ACMG, p-values, simpler output, documentation!

29 Jul 13:45

The three new features in this release are the automated ACMG classification of small sequence variants, p-values for
the combined scores, and new, more interpretable TSV and VCF output files.

  • Added new automated ACMG annotations for top-scoring variants in known disease-causing genes.
  • Added new combined score p-value
  • Added new TSV_GENE, TSV_VARIANT and VCF output files containing ranked genes/variants for all the assessed modes of
    inheritance. Note that these new file formats will supersede the existing individual MOI-specific TSV/VCF files which
    will be removed in the next major release. See the online documentation for details.
  • New, updated online documentation! See https://exomiser.readthedocs.io/en/latest/
  • New Docker hub images for CLI and web on https://hub.docker.com/u/exomiser
  • Added checks to ensure user specifies genome assembly if user specifies VCF path outside of phenopacket/analysis
  • Added --output-prefix option to enable output prefix directly on the command line
  • Updated examples to use the latest recommended settings as per the preset derived from the 100,000 Genomes Project

For the latest data, please follow the discussions for announcements: #424

hg38 only configuration bugfix

23 Nov 14:25

Bug-fix release. No external changes.

CLI changes

  • Bug fix for issue #410 where application fails to start when only specifying hg38 data in application.properties