
Ensure that contig names can be used within filenames #57

@fedarko

There are a few commands that include contig names in filenames -- right now it's just phasing commands:

  • strainFlye smooth create (output reads for each contig are named [contig].fasta.gz)
  • strainFlye smooth assemble (output LJA assemblies for each contig are written to a folder named [contig])
  • strainFlye link nt (output pickle files are named [contig]_pos2nt2ct.pickle and [contig]_pospair2ntpair2ct.pickle)
  • strainFlye link graph (output graphs, regardless of format, include [contig] as a prefix)

In most cases, contig names should be restricted to [a-zA-Z0-9_.-], and should thus be fine as filenames. But I'm sure eventually we'll start seeing weird contig names with spaces or other characters that will break these filenames.

I'm not sure it's worth trying to anticipate and address these problems in advance (we could modify the FASTA-loading parts of the code to do some validation on contig names), but I'm making this issue just to catalog what parts of the code this problem touches at the moment.
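For reference, here is a minimal sketch of what that kind of validation could look like. It is not strainFlye's actual code: the function name `check_contig_name` and the `SAFE_CONTIG_NAME` pattern are hypothetical, and the allowed alphabet is just the character set mentioned above.

```python
import re

# Assumption: the "safe" alphabet is the one described in this issue
# (ASCII letters, digits, underscore, period, hyphen).
SAFE_CONTIG_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def check_contig_name(name):
    """Raise ValueError if `name` is not safe to embed in a filename.

    Hypothetical helper; it could be called once per contig when the
    FASTA file is loaded.
    """
    if not SAFE_CONTIG_NAME.fullmatch(name):
        raise ValueError(
            f"Contig name {name!r} contains characters outside "
            "[a-zA-Z0-9_.-] (or is empty), so it can't be used in a filename."
        )
    # "." and ".." pass the pattern above but are special directory names.
    if name in (".", ".."):
        raise ValueError(f"Contig name {name!r} is not a usable filename.")
    return name
```

Rejecting bad names up front (rather than silently rewriting them) keeps output filenames traceable back to the contig names in the input FASTA file, at the cost of forcing users to rename contigs themselves.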

Metadata

Assignees

No one assigned

    Labels

    backburner: Low-priority things that are still good to keep track of
    bug: Something isn't working

