
Conversation

@jakobnissen
Member

It was originally capped at 8 under the belief that reading 8 BAM files in parallel would saturate the disk, so there would be no benefit to going higher. However, my laptop can read at 4 GB/s and decompresses BAM files perhaps 40 times slower than that, so parsing is CPU bottlenecked even with 32 threads.

This change is significant because users have reported slow BAM file parsing. However, it will potentially quadruple the memory usage of the BAM parsing step. It will be benchmarked before merging.

The BLAS change is simply because I think 8 CPUs is too conservative.
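The arithmetic above can be sketched out explicitly. This is a back-of-the-envelope check, not a measurement; the 4 GB/s read speed and the ~40x decompression slowdown are the figures quoted above, and the per-thread throughput is assumed to scale linearly:

```python
# Estimate whether BAM parsing is disk- or CPU-bound at the new thread cap.
# Assumptions (from the comment above, not measured here):
#   - sequential disk read throughput of 4 GB/s
#   - single-threaded decompression is ~40x slower than the disk
disk_gb_per_s = 4.0
decompress_gb_per_s = disk_gb_per_s / 40  # ~0.1 GB/s per thread
threads = 32

# Aggregate decompression throughput with 32 threads: ~3.2 GB/s,
# still below what the disk can deliver, so the CPU remains the bottleneck.
aggregate = threads * decompress_gb_per_s
print(aggregate < disk_gb_per_s)  # True: CPU-bound even at 32 threads
```

Under these assumptions it would take roughly 40 threads of decompression to saturate the disk, which is why the old cap of 8 left most of the available bandwidth unused.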

@jakobnissen jakobnissen added the Needs benchmark Must benchmark before merging this label Feb 15, 2024
@jakobnissen
Member Author

This is ready to go, but we need to measure the memory usage of parsing 32 large (10M reference sequences) BAM files in parallel.
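One way to get that number is to run the parsing step as a child process and read the peak resident set size afterwards. This is a minimal sketch; the workload below is a placeholder, and the real invocation that parses the 32 BAM files would be substituted in. Note that `ru_maxrss` is reported in KiB on Linux but in bytes on macOS:

```python
# Measure peak memory (RSS) of a child process via getrusage.
# The command here is a stand-in workload that allocates ~50 MB;
# replace it with the actual parsing invocation being benchmarked.
import resource
import subprocess

command = ["python3", "-c", "x = bytes(50_000_000)"]  # placeholder workload
subprocess.run(command, check=True)

# Peak RSS across all waited-for children. KiB on Linux, bytes on macOS.
peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
print(f"peak child RSS: {peak_kb / 1024:.0f} MiB (assuming KiB units)")
```

Running the same harness before and after the thread-cap change would show directly whether memory usage really quadruples when going from 8 to 32 parallel BAM readers.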
