Fix CLI decode truncation + brotli iterative optimal parse (1.48→1.39)#99

Merged
MagicalTux merged 7 commits into master from cli-fix-brotli-ratio
Jun 15, 2026
@MagicalTux

Member

Two follow-up items from the ratio/speed work, both independent and verified end-to-end.

1. Fix: CLI -d truncated highly-compressible large inputs

compcol -t bzip2 -d (and any other block-buffering decoder) truncated output at 64 KiB on large compressible inputs. The streaming decode loop stopped as soon as the compressed input was consumed (its consumed < n guard), so output still buffered inside the decoder, such as a whole bzip2 block larger than the 64 KiB output buffer, was never drained, and finish doesn't flush it. Added a drain loop that pulls the decoder's buffered output before finishing.

  • The library decoders were already correct (the bench/round-trip tests drive the drain); this was a CLI-only bug. Verified our output was always valid: system bzip2 -d decoded it fine.
  • Repro now fixed: a 5 MB compressible input round-trips fully through compcol -c | compcol -d for bzip2/gzip/xz/zstd/lzma/lz4/brotli (was 65536 of 5000000 for bzip2). Regression test added (tests/cli.rs).
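The shape of the fix can be sketched as follows. This is a minimal illustration, not the CLI's actual code: the `StreamDecoder` trait, `decode_stream`, and `ToyDecoder` are hypothetical names, and the 4 KiB input chunk size is an assumption. Only the 64 KiB output buffer and the "feed empty input until nothing more comes out" drain step come from the description above.

```rust
use std::io::{self, Read, Write};

/// Hypothetical streaming-decoder interface (an assumption for this sketch).
trait StreamDecoder {
    /// Consumes part of `input`, fills part of `output`;
    /// returns (bytes_consumed, bytes_written).
    fn decode(&mut self, input: &[u8], output: &mut [u8]) -> io::Result<(usize, usize)>;
}

fn decode_stream<D: StreamDecoder, R: Read, W: Write>(
    dec: &mut D,
    mut src: R,
    mut dst: W,
) -> io::Result<u64> {
    let mut inbuf = [0u8; 4096]; // input chunk size is an assumption
    let mut outbuf = vec![0u8; 64 * 1024]; // 64 KiB output buffer, as in the PR
    let mut total = 0u64;
    loop {
        let n = src.read(&mut inbuf)?;
        if n == 0 {
            break;
        }
        let mut consumed = 0;
        // This guard alone is what caused the truncation: it exits as soon
        // as the input is consumed, even if the decoder still holds output.
        while consumed < n {
            let (c, w) = dec.decode(&inbuf[consumed..n], &mut outbuf)?;
            if c == 0 && w == 0 {
                return Err(io::Error::new(io::ErrorKind::InvalidData, "decoder stalled"));
            }
            dst.write_all(&outbuf[..w])?;
            total += w as u64;
            consumed += c;
        }
    }
    // Drain phase: keep calling with empty input until no output is produced.
    loop {
        let (_, w) = dec.decode(&[], &mut outbuf)?;
        if w == 0 {
            break;
        }
        dst.write_all(&outbuf[..w])?;
        total += w as u64;
    }
    Ok(total)
}

/// Toy block-buffering decoder: every input byte expands to 1024 copies,
/// all input is swallowed at once, and output is handed out as space allows.
#[derive(Default)]
struct ToyDecoder {
    pending: Vec<u8>,
    pos: usize,
}

impl StreamDecoder for ToyDecoder {
    fn decode(&mut self, input: &[u8], output: &mut [u8]) -> io::Result<(usize, usize)> {
        for &b in input {
            self.pending.extend(std::iter::repeat(b).take(1024));
        }
        let n = (self.pending.len() - self.pos).min(output.len());
        output[..n].copy_from_slice(&self.pending[self.pos..self.pos + n]);
        self.pos += n;
        Ok((input.len(), n))
    }
}
```

With the toy decoder, 100 input bytes expand to 102400 output bytes. The read loop alone emits only the first 65536 (one full output buffer), which mirrors the truncation described above; the drain loop emits the rest.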

2. brotli encoder: iterative optimal parse — 1.48 → 1.39 vs brotli -q 11

The last remaining ratio gap. Added a zopfli-style iterative, statistics-driven optimal LZ77 parse (src/brotli/encoder_optimal.rs) at quality 9–11: a forward DP whose cost model is rebuilt from the previous pass's actual command/literal/distance histograms each round (2 passes at q9/q10, 3 at q11), with candidate matches cached across passes. This is the distance-reuse feedback loop that gives brotli q10/q11 its edge.

  • max-quality ratio on the 2.9 MB corpus: 707558 → 669632 bytes (1.473 → 1.394 vs brotli -q 11 = 480480; 1.253 vs -q 9). Encode ~4.7 s at q11; greedy retained as fallback and for q≤8.
  • Reference cross-decode confirmed: compcol -t brotli -c -l N | brotli -d | cmp byte-exact across inputs × q0/q4/q9/q10/q11.
  • Honest limit (documented): the remaining gap to -q 9 is structural — q9–q11 also do block splitting with separate command/distance context trees and NPOSTFIX distance coding, which this encoder doesn't yet do.
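To make the "cost model rebuilt from the previous pass" idea concrete, here is a heavily simplified sketch of an iterative, statistics-driven forward DP parse. It is not the code in src/brotli/encoder_optimal.rs. All names are hypothetical, the candidate search is brute force with one candidate per position, and the cost model (smoothed entropy over literal bytes and coarse length/distance buckets) is an assumption that ignores brotli's real command alphabet.

```rust
const MIN_LEN: usize = 3; // shortest copy considered (assumption for this sketch)

#[derive(Clone, Copy, Debug, PartialEq)]
struct Match { len: usize, dist: usize }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Step { Literal(u8), Copy(Match) }

/// Estimated bit costs, rebuilt from the histograms of the previous parse.
struct CostModel { lit: [f64; 256], len: [f64; 32], dist: [f64; 32] }

/// Number of bits needed to write `v`, used as a coarse length/distance bucket.
fn bucket(v: usize) -> usize { ((usize::BITS - v.leading_zeros()) as usize).min(31) }

/// Histogram -> cost in bits, with add-one smoothing so unseen symbols stay finite.
fn costs<const N: usize>(hist: &[u32; N]) -> [f64; N] {
    let total: f64 = hist.iter().map(|&h| h as f64 + 1.0).sum();
    let mut out = [0.0; N];
    for (o, &h) in out.iter_mut().zip(hist.iter()) {
        *o = -((h as f64 + 1.0) / total).log2();
    }
    out
}

impl CostModel {
    fn from_steps(steps: &[Step]) -> Self {
        let (mut lit, mut len, mut dist) = ([0u32; 256], [0u32; 32], [0u32; 32]);
        for s in steps {
            match *s {
                Step::Literal(b) => lit[b as usize] += 1,
                Step::Copy(m) => { len[bucket(m.len)] += 1; dist[bucket(m.dist)] += 1; }
            }
        }
        CostModel { lit: costs(&lit), len: costs(&len), dist: costs(&dist) }
    }
    /// Symbol costs plus roughly one extra bit per bucket level.
    fn copy_cost(&self, len: usize, dist: usize) -> f64 {
        self.len[bucket(len)] + self.dist[bucket(dist)] + (bucket(len) + bucket(dist)) as f64
    }
}

/// Longest (nearest on ties) earlier match per position; computed once, shared by all passes.
fn find_candidates(data: &[u8]) -> Vec<Option<Match>> {
    (0..data.len()).map(|i| {
        let mut best: Option<Match> = None;
        for start in 0..i {
            let len = data[start..].iter().zip(&data[i..]).take_while(|(a, b)| a == b).count();
            if len >= MIN_LEN && best.map_or(true, |m| len >= m.len) {
                best = Some(Match { len, dist: i - start });
            }
        }
        best
    }).collect()
}

/// One forward DP pass: cost[i] is the cheapest known encoding of data[..i].
fn parse(data: &[u8], cands: &[Option<Match>], model: &CostModel) -> Vec<Step> {
    let n = data.len();
    let mut cost = vec![f64::INFINITY; n + 1];
    let mut from: Vec<Option<Step>> = vec![None; n + 1];
    cost[0] = 0.0;
    for i in 0..n {
        let c = cost[i] + model.lit[data[i] as usize];
        if c < cost[i + 1] { cost[i + 1] = c; from[i + 1] = Some(Step::Literal(data[i])); }
        if let Some(m) = cands[i] {
            for len in MIN_LEN..=m.len {
                let c = cost[i] + model.copy_cost(len, m.dist);
                if c < cost[i + len] {
                    cost[i + len] = c;
                    from[i + len] = Some(Step::Copy(Match { len, dist: m.dist }));
                }
            }
        }
    }
    // Walk back from the end to recover the chosen steps.
    let (mut steps, mut i) = (Vec::new(), n);
    while i > 0 {
        let s = from[i].expect("every position is reachable via literals");
        i -= match s { Step::Literal(_) => 1, Step::Copy(m) => m.len };
        steps.push(s);
    }
    steps.reverse();
    steps
}

/// Pass 1 uses flat costs; each later pass re-costs with the previous parse's statistics.
fn optimal_parse(data: &[u8], passes: usize) -> Vec<Step> {
    let cands = find_candidates(data);
    let mut steps = parse(data, &cands, &CostModel::from_steps(&[]));
    for _ in 1..passes {
        let model = CostModel::from_steps(&steps);
        steps = parse(data, &cands, &model);
    }
    steps
}

/// Reconstructs the input from a parse (copies may overlap their own output).
fn replay(steps: &[Step]) -> Vec<u8> {
    let mut out = Vec::new();
    for s in steps {
        match *s {
            Step::Literal(b) => out.push(b),
            Step::Copy(m) => for _ in 0..m.len { let b = out[out.len() - m.dist]; out.push(b); },
        }
    }
    out
}
```

Calling `optimal_parse(data, 3)` would correspond to the three passes described for q11. Replaying the returned steps reproduces the input exactly, whatever the cost model says, because the DP only ever chooses among valid literals and matches.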

Checks

  • cargo test --all-features: 61 suites green, 0 failures (incl. the new CLI drain regression test).
  • Independent verification on the integrated branch: brotli 1.394 + cross-decode OK; CLI fix round-trips all codecs on a 5 MB input.
  • cargo fmt --check, cargo clippy --all-features --all-targets -D warnings, rustdoc -D warnings — clean.

🤖 Generated with Claude Code

MagicalTux and others added 7 commits June 15, 2026 20:15
The streaming decode loop stopped once the compressed input was consumed
(its `consumed < n` guard), leaving any output the decoder had buffered
internally (e.g. a whole bzip2 block larger than the 64 KiB output buffer)
undrained — `finish` does not flush it, so `compcol -t bzip2 -d` truncated
highly-compressible large inputs at 64 KiB. Add an empty-input drain loop
after the read loop, mirroring the streaming contract. Regression test added.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Add a forward dynamic-programming optimal parse (encoder_optimal.rs) that
rebuilds its cost model from the previous pass's command/literal/distance
histograms, run 2 passes at q10 and 3 at q11. Candidates (explicit chain
matches, repeat-distance matches, dictionary refs) are precomputed once and
shared across passes. Greedy parse retained as fallback and for q<=9.

corpus.dat q11: 707558 -> 669880 bytes (ratio vs brotli -q11 1.473 -> 1.394).
Reference cross-decode and our-decoder round-trip verified.
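
The ratios quoted here are our output size divided by the reference encoder's output size. A small check of the arithmetic, assuming the brotli -q 11 size of 480480 bytes given in the PR description (`size_ratio` is a name made up for illustration):

```rust
/// Our compressed size relative to the reference encoder's (1.0 would mean parity).
fn size_ratio(ours: u64, reference: u64) -> f64 {
    ours as f64 / reference as f64
}

/// Rounds to three decimals, the precision used in the PR text.
fn round3(x: f64) -> f64 {
    (x * 1000.0).round() / 1000.0
}
```

With these, `round3(size_ratio(707_558, 480_480))` gives 1.473 and `round3(size_ratio(669_880, 480_480))` gives 1.394.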

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

When an explicit chain match's distance coincides with a recent ring
distance, charge the cheaper of the full distance-symbol cost and the
short-code cost so the DP prefers ring-reusing matches.

corpus.dat q11: 669880 -> 669632 bytes.
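
The costing rule in this commit can be illustrated with a small sketch. It assumes a ring of the four most recent distances (as in the brotli format) and covers exact hits only; the function name and the cost tables are hypothetical, not the encoder's actual API.

```rust
/// Cost in bits of coding `dist`: the full distance-symbol cost, or the
/// short-code cost of a ring slot holding the same distance, whichever is lower.
fn distance_cost(dist: u32, ring: &[u32; 4], full_cost: f64, short_cost: &[f64; 4]) -> f64 {
    let mut best = full_cost;
    for (slot, &recent) in ring.iter().enumerate() {
        if recent == dist {
            best = best.min(short_cost[slot]);
        }
    }
    best
}
```

For example, with ring `[10, 20, 30, 40]` and short-code costs `[1.0, 2.0, 3.0, 4.0]`, a match at distance 20 with a full cost of 9.5 bits would be charged 2.0 bits, while a match at distance 25 would keep the full 9.5.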

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

The match cache makes the DP cheap enough to run at q9 within a few
seconds. corpus.dat q9: 709198 -> 680156 bytes; quality ordering stays
monotonic (q9 > q10 > q11). Reference cross-decode verified across random,
text, cyclic, all-zero, tiny, and corpus inputs at q9/q10/q11.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@MagicalTux MagicalTux force-pushed the cli-fix-brotli-ratio branch from 4f6ff24 to 4b7d843 Compare June 15, 2026 11:16
@MagicalTux MagicalTux merged commit db86560 into master Jun 15, 2026
41 checks passed