Use ISA-L even in the Rust version #561

@marcelm

Commit 759398f added support for reading gzip files with ISA-L to the C++ code. This has not been ported to Rust yet.

We currently use the flate2 crate for decompression. We use their zlib-rs backend, which is supposed to be the fastest one.

Decompressing gzip-compressed FASTQ files accounts for only a small fraction of total runtime, so in many cases we don’t gain much. It becomes highly relevant, however, when we run strobealign on a machine with many cores. Before we had ISA-L in the C++ code, strobealign could only saturate about 32 CPU cores. When more cores are available, the worker threads sit idle part of the time because they have to wait for the (single) decompression thread to provide them with data.

I measured this on the Rust version and can see the same thing: When increasing the number of threads, CPU usage scales accordingly up to about 32 threads and then stays flat.

When I tried to switch our code to use the isal-rs crate, it initially looked straightforward: essentially just running cargo add isal-rs and switching from flate2::read::MultiGzDecoder to isal::read::GzipDecoder. However, the code does not actually compile because GzipDecoder does not implement the Send trait.

This is a Rust-specific thing: The Send trait marks types that are safe to send from one thread to another. strobealign needs this because the GzipDecoder is created in the main thread so that we can estimate the read length, and then it is moved into a new thread that reads the entire file.
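To make the constraint concrete, here is a minimal sketch of the "create on the main thread, peek, then move into a reader thread" pattern using only the standard library. The function name `peek_then_read_in_thread` is made up for illustration and is not strobealign code; any `Read` type stands in for the gzip decoder. The `Send` bound on `R` is exactly what `std::thread::spawn` demands of everything moved into the new thread, and it is the bound that a non-`Send` decoder fails to satisfy.

```rust
use std::io::{BufRead, BufReader, Read};
use std::thread;

/// Hypothetical illustration (not strobealign's actual code):
/// the reader is created by the caller, inspected on the calling thread,
/// and then moved into a freshly spawned thread.
///
/// Returns the length of the first line and the number of remaining lines.
pub fn peek_then_read_in_thread<R>(reader: R) -> (usize, usize)
where
    // `thread::spawn` requires everything captured by the closure to be
    // `Send + 'static`. Dropping `Send` here makes the spawn call below
    // fail to compile.
    R: Read + Send + 'static,
{
    let mut reader = BufReader::new(reader);

    // "Estimate" step on the calling thread: look at the first line only.
    let mut first = String::new();
    reader.read_line(&mut first).expect("reading first line failed");
    let first_len = first.trim_end().len();

    // Move the partially consumed reader into a new thread.
    let remaining = thread::spawn(move || reader.lines().count())
        .join()
        .expect("reader thread panicked");

    (first_len, remaining)
}
```

Calling this with a type that is not `Send` (for example one holding an `Rc` or a raw pointer) is rejected at compile time, which is the kind of error described above.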

Here are a couple of ways to solve this:

  • Explicitly mark isal::read::GzipDecoder as Send. This requires the use of the unsafe keyword and is very, very likely not the proper way to solve this. We should do this only if we understand much better what the implications are.
  • Try Cloudflare’s fork of zlib instead, which is also supposed to be a lot faster than regular zlib.
  • Restructure the code so that we don’t need to send the GzipDecoder to a different thread. That is, read the initial 500 reads for estimating read length also in the reader thread.
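The third option could look roughly like the sketch below. Everything in it is an assumption made for illustration: `spawn_reader` and its parameters are hypothetical names, each line is treated as one "read" (real FASTQ records span four lines), and a plain `Read` type stands in for the gzip decoder. The point is that only the `open` closure crosses the thread boundary; the decoder itself is constructed inside the reader thread and therefore does not need to be `Send`. The read-length estimate travels back to the main thread over a channel.

```rust
use std::io::{BufRead, BufReader, Read};
use std::sync::mpsc::{self, Receiver};
use std::thread::{self, JoinHandle};

/// Hypothetical sketch: spawn a reader thread that builds the decoder itself.
/// Returns a channel carrying the length estimate, a channel carrying the
/// records, and the thread handle.
pub fn spawn_reader<R, F>(
    open: F,
    sample_size: usize,
) -> (Receiver<usize>, Receiver<String>, JoinHandle<()>)
where
    R: Read + 'static,                 // note: no `Send` bound on the decoder
    F: FnOnce() -> R + Send + 'static, // only the factory must be `Send`
{
    let (estimate_tx, estimate_rx) = mpsc::channel();
    let (record_tx, record_rx) = mpsc::channel();

    let handle = thread::spawn(move || {
        // The decoder is created here, inside the reader thread.
        let reader = BufReader::new(open());
        let mut estimate_tx = Some(estimate_tx);
        let (mut total_len, mut count) = (0usize, 0usize);

        for line in reader.lines() {
            let Ok(line) = line else { break };
            // Accumulate lengths until the sample is complete, then report
            // the mean length back to whoever is waiting for it.
            if estimate_tx.is_some() {
                total_len += line.len();
                count += 1;
                if count >= sample_size {
                    if let Some(tx) = estimate_tx.take() {
                        let _ = tx.send(total_len / count);
                    }
                }
            }
            // Stop early if the consumer has gone away.
            if record_tx.send(line).is_err() {
                break;
            }
        }
        // Input had fewer records than the sample size: report what we have.
        if let Some(tx) = estimate_tx.take() {
            let _ = tx.send(if count > 0 { total_len / count } else { 0 });
        }
    });

    (estimate_rx, record_rx, handle)
}
```

The main thread would block on the estimate channel before setting up anything that depends on the read length, and the worker threads would consume from the record channel. A real implementation would likely use a bounded channel and batches of records rather than one `String` per line.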

I have tried the first option just to have something that I can perhaps use in benchmarks in order to see whether it actually makes a difference in core usage, but there were some issues compiling it.
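For reference, the general shape of the first option is a newtype with an `unsafe impl Send`, sketched below with standard-library types only. `AssertSend` and `read_all_in_thread` are hypothetical names, this is not the code I tried, and it says nothing about the compile issues mentioned above. Whether such an assertion would be sound for isal::read::GzipDecoder is precisely the open question, so treat this as a benchmarking hack at most.

```rust
use std::io::{self, Read};
use std::thread;

/// Hypothetical wrapper that claims its contents may be moved across threads.
pub struct AssertSend<R>(pub R);

// SAFETY: NOT verified. This is only sound if the wrapped value holds no
// thread-affine state (thread-local data, non-atomic shared reference
// counts, and so on) and is used by one thread at a time.
unsafe impl<R> Send for AssertSend<R> {}

impl<R: Read> Read for AssertSend<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.0.read(buf)
    }
}

/// Moves the wrapped reader into a new thread and reads it to the end there.
pub fn read_all_in_thread<R: Read + 'static>(
    mut reader: AssertSend<R>,
) -> io::Result<Vec<u8>> {
    thread::spawn(move || -> io::Result<Vec<u8>> {
        let mut out = Vec::new();
        // Call through the wrapper's own `Read` impl. Accessing `reader.0`
        // directly inside the closure would make the closure capture only
        // the inner field (Rust 2021 disjoint captures), which is not `Send`.
        reader.read_to_end(&mut out)?;
        Ok(out)
    })
    .join()
    .expect("reader thread panicked")
}
```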

I haven’t tried Cloudflare’s zlib, but from what I have heard, it is not as fast as ISA-L.

Restructuring the code is probably the cleanest way forward.

See also milesgranger/isal-rs#33
