Refine mean/median sequence length output#205
Open
ewels wants to merge 1 commit into
Open
Conversation
Builds on the mean/median sequence length feature (s-andrews#203): - Report Mean Length to 1 decimal place rather than truncating to an integer, so e.g. a 150.7 bp mean is no longer shown as 150. - Calculate the median with standard tie-breaking: for an even number of reads, average the two central values and round up, rather than taking only the upper-middle value. - Update the FileContentsTest approved snapshots (data + HTML) for the new rows. HTML snapshots are edited in place so the embedded chart images are left untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
ewels
commented
May 29, 2026
4b9a339 to
c38a976
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Builds on #203 / 54b336e (mean & median sequence length). Two small refinements plus the test snapshots that commit didn't include.
Changes
1. Mean Length → 1 decimal place (
BasicStats.java)Currently the mean is integer division (
totalBases/actualCount), so a 150.7 bp mean prints as150. This reports it to 1 d.p. instead:(
Locale.ROOTkeeps the decimal separator as.regardless of system locale; the zero guard avoids an integer divide-by-zero on empty input.)2. Median tie-breaking (
SequenceLengthDistribution.medianLength())For an even number of reads, the current code returns the upper of the two central values. This uses the standard definition — the mean of the two central values, rounded up:
(lo + hi + 1) / 2.Also added a
total == 0guard (returns 0) so an empty distribution doesn't throw. Identical to current behaviour for uniform-length (short-read) data; differs only on even read counts with two distinct central lengths.3. Test snapshots (
FileContentsTest_{minimal,complex}_fastqc_*)Updated the approved data + HTML files for the new
Mean Length/Median Lengthrows. The HTML snapshots are edited in place so the embedded base64 chart images are left untouched (rendering differs per OS).Notes
Mean Length/Median Length) and filtered-sequence handling are left exactly as in 54b336e.fastqc_data.txtfiles still carry##FastQC 0.12.2.develwhile the build now emits0.13.0.devel— a pre-existing version-string mismatch in the snapshots, independent of this change. Left as-is to keep this PR scoped; happy to bump it if you'd prefer.🤖 Generated with Claude Code