Releases: LucoDevro/CAGEcleaner

v1.5.0

22 Apr 08:21

Added features:

  • New dereplication mode: dereplicate by genomic neighbourhood
  • Support for non-cblaster inputs from TSVs via a session construction helper tool

Improved features:

  • Made hit recovery strategies synteny-aware by checking cluster layout instead of cluster contents
  • Improved logging, progress bars and error catching

Tool architecture:

  • Shifted to a multiple-inheritance class structure to define logic shared by specific workflows
  • Converted the bash helper scripts to Python and integrated them into the codebase
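
As a rough sketch of how multiple inheritance can share logic across workflows, the example below combines one class per input source with one class per dereplication mode. All class and method names are hypothetical and do not reflect CAGEcleaner's actual code.

```python
# Hypothetical class layout; none of these names come from CAGEcleaner itself.

class Run:
    """Base class holding state and logic common to every workflow."""

    def __init__(self, session_path):
        self.session_path = session_path

    def describe(self):
        # Relies on source() and mode() being supplied by the mixed-in classes.
        return f"{self.source()} run, dereplicating by {self.mode()}"


class LocalSource(Run):
    def source(self):
        return "local"


class RemoteSource(Run):
    def source(self):
        return "remote"


class GenomeMode(Run):
    def mode(self):
        return "genome"


class NeighbourhoodMode(Run):
    def mode(self):
        return "genomic neighbourhood"


# Concrete workflows pick one source and one mode.
class LocalNeighbourhoodRun(LocalSource, NeighbourhoodMode):
    pass


class RemoteGenomeRun(RemoteSource, GenomeMode):
    pass


print(LocalNeighbourhoodRun("session.json").describe())
print(RemoteGenomeRun("session.json").describe())
```

With this kind of layout, each source or mode is written once and reused by every workflow that combines it.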

Miscellaneous:

  • Updated examples
  • Moved docs from Wiki to ReadTheDocs

Full Changelog: v1.4.5...v1.5.0

v1.4.5

17 Nov 09:13

Improved internal handling of cblaster binary tables for greater robustness against messy NCBI strain tags

v1.4.4

13 Oct 09:25

Bugfix for the case where cblaster omits genomes without hits from its session file (issue #3)

v1.4.3

24 Sep 08:13

New features:

  • Added option to use skDER's low-memory mode for dereplication. This trades some representative-genome quality for much lower memory requirements and a slight speed increase.

v1.4.2

01 Sep 12:42

Bugfix in hit recovery

v1.4.1

20 Aug 14:10

Bugfix for wrongly packaged auxiliary scripts

v1.4.0 [yanked]

20 Aug 09:17
1be7654

Complete rewrite in an object-oriented style, with separate child classes for local and remote dereplication runs.

Altered features:

  • Merged the dereplication status table and the assembly-scaffold pairs table into one output table (the extended binary table)
  • Improved verbosity

v1.3.1

05 Aug 08:49

Minor bugfix for excluded duplicates

v1.3.0

05 Aug 08:48
4232219

New features:

  • Added support for cblaster local and HMM modes (for both GenBank and FASTA + GFF input genome databases)
  • Extended scaffold-to-assembly mapping to non-gzipped FASTA files
  • Added option to bypass dereplication for specific scaffolds or assemblies, so that they are always retained in the output

Altered features:

  • Added support for more precise ID exclusion and bypassing when IDs are duplicated, by using an assembly:scaffold ID format internally
  • Minimum ANI threshold of 82% is now enforced: CAGEcleaner exits with a warning if a user passes a lower value.
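
The two changes above can be sketched as follows. This is a minimal illustration only: the function names, the example accessions and the warning text are assumptions, and only the 82% minimum and the assembly:scaffold format come from the notes above.

```python
MIN_ANI = 82.0  # minimum ANI threshold (%) stated in the release notes


def check_ani(ani):
    """Return the ANI value, or exit with a warning if it is below the minimum."""
    if ani < MIN_ANI:
        raise SystemExit(
            f"Warning: ANI threshold {ani}% is below the minimum of {MIN_ANI}%."
        )
    return ani


def make_id(assembly, scaffold):
    """Build a composite ID so identical scaffold IDs in different assemblies stay distinct."""
    return f"{assembly}:{scaffold}"


def split_id(composite):
    """Split a composite ID back into (assembly, scaffold) at the first colon."""
    assembly, _, scaffold = composite.partition(":")
    return assembly, scaffold


# Two hypothetical assemblies that both contain a scaffold named "contig_1".
first = make_id("GCF_000001.1", "contig_1")
second = make_id("GCF_000002.1", "contig_1")
print(first, second, first != second)
print(split_id(first))
print(check_ani(99.0))
```

Keying on the composite ID lets an exclusion or bypass target one specific scaffold even when its bare ID is shared between assemblies.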

v1.2.3

16 Jul 11:17

New features:

  • Option to exclude scaffolds or assemblies from being processed
  • Optional verbosity

Compatibility maintenance:

  • Created pyproject.toml for pip compatibility