Fix CLI decode truncation + brotli iterative optimal parse (1.48→1.39) #99
Merged
The streaming decode loop stopped once the compressed input was consumed (its `consumed < n` guard), leaving any output the decoder had buffered internally (e.g. a whole bzip2 block larger than the 64 KiB output buffer) undrained — `finish` does not flush it, so `compcol -t bzip2 -d` truncated highly-compressible large inputs at 64 KiB. Add an empty-input drain loop after the read loop, mirroring the streaming contract. Regression test added. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
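The truncation and its fix can be reproduced with a toy model. The sketch below is not compcol's code: the `StreamDecoder` trait, `ToyDecoder`, and `decode_all` are hypothetical names, and the contract (calling `decode` with empty input only drains buffered output) is an assumption modeled on the description above.

```rust
/// Hypothetical streaming contract; compcol's real trait may differ.
trait StreamDecoder {
    /// Returns (input bytes consumed, output bytes produced).
    /// Calling with empty input only drains internally buffered output.
    fn decode(&mut self, input: &[u8], output: &mut [u8]) -> (usize, usize);
}

/// Toy block-buffering decoder: every input byte expands to 1000 copies,
/// all buffered internally before any output is handed out.
struct ToyDecoder {
    pending: Vec<u8>,
    pos: usize,
}

impl ToyDecoder {
    fn new() -> Self {
        ToyDecoder { pending: Vec::new(), pos: 0 }
    }
}

impl StreamDecoder for ToyDecoder {
    fn decode(&mut self, input: &[u8], output: &mut [u8]) -> (usize, usize) {
        for &b in input {
            self.pending.extend(std::iter::repeat(b).take(1000));
        }
        // Hand out as much buffered output as fits in the caller's buffer.
        let n = (self.pending.len() - self.pos).min(output.len());
        output[..n].copy_from_slice(&self.pending[self.pos..self.pos + n]);
        self.pos += n;
        (input.len(), n)
    }
}

/// Decodes `input` through a 64 KiB output buffer. With `drain == false`
/// this is the old loop, which stops as soon as the input is consumed.
fn decode_all<D: StreamDecoder>(dec: &mut D, input: &[u8], drain: bool) -> Vec<u8> {
    let mut out = Vec::new();
    let mut buf = vec![0u8; 64 * 1024];
    let mut consumed = 0;
    while consumed < input.len() {
        let (c, p) = dec.decode(&input[consumed..], &mut buf);
        consumed += c;
        out.extend_from_slice(&buf[..p]);
    }
    if drain {
        // Empty-input drain loop: keep pulling until nothing is produced.
        loop {
            let (_, p) = dec.decode(&[], &mut buf);
            if p == 0 {
                break;
            }
            out.extend_from_slice(&buf[..p]);
        }
    }
    out
}
```

With 100 input bytes the toy decoder buffers 100000 output bytes. The loop without the drain step returns only the first 65536, while the drained version returns all of them.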
Add a forward dynamic-programming optimal parse (encoder_optimal.rs) that rebuilds its cost model from the previous pass's command/literal/distance histograms, run 2 passes at q10 and 3 at q11. Candidates (explicit chain matches, repeat-distance matches, dictionary refs) are precomputed once and shared across passes. Greedy parse retained as fallback and for q<=9. corpus.dat q11: 707558 -> 669880 bytes (ratio vs brotli -q11 1.473 -> 1.394). Reference cross-decode and our-decoder round-trip verified. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
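To illustrate the shape of such a parse, here is a heavily simplified sketch, not the code in encoder_optimal.rs. It assumes a flat 24-bit price for every match, brute-force candidate search, and a cost model built only from the literal histogram. All names (`Cmd`, `candidates`, `parse_once`, `optimal_parse`, `reconstruct`) are hypothetical.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Cmd {
    Lit(u8),
    Match { len: usize, dist: usize },
}

const MIN_MATCH: usize = 3;
const MATCH_BITS: f64 = 24.0; // assumed flat price for any match

/// Longest earlier match (len, dist) at each position, by brute force.
/// Computed once and shared across passes.
fn candidates(data: &[u8]) -> Vec<Option<(usize, usize)>> {
    (0..data.len())
        .map(|i| {
            let mut best: Option<(usize, usize)> = None;
            for j in 0..i {
                let len = (0..data.len() - i)
                    .take_while(|&k| data[j + k] == data[i + k])
                    .count();
                if len >= MIN_MATCH && best.map_or(true, |(l, _)| len > l) {
                    best = Some((len, i - j));
                }
            }
            best
        })
        .collect()
}

/// Literal prices in bits, rebuilt from a histogram with +1 smoothing.
fn literal_costs(hist: &[u32; 256]) -> [f64; 256] {
    let total: f64 = hist.iter().map(|&c| (c + 1) as f64).sum();
    let mut costs = [0.0; 256];
    for (c, &h) in costs.iter_mut().zip(hist.iter()) {
        *c = (total / (h + 1) as f64).log2();
    }
    costs
}

/// One forward DP pass: cheapest command sequence under the given prices.
fn parse_once(data: &[u8], cands: &[Option<(usize, usize)>], lit: &[f64; 256]) -> Vec<Cmd> {
    let n = data.len();
    let mut cost = vec![f64::INFINITY; n + 1];
    let mut step = vec![Cmd::Lit(0); n + 1]; // command that reaches each position
    cost[0] = 0.0;
    for i in 0..n {
        let c = cost[i] + lit[data[i] as usize];
        if c < cost[i + 1] {
            cost[i + 1] = c;
            step[i + 1] = Cmd::Lit(data[i]);
        }
        if let Some((max_len, dist)) = cands[i] {
            for len in MIN_MATCH..=max_len {
                let c = cost[i] + MATCH_BITS;
                if c < cost[i + len] {
                    cost[i + len] = c;
                    step[i + len] = Cmd::Match { len, dist };
                }
            }
        }
    }
    // Walk back from the end to recover the chosen commands.
    let mut cmds = Vec::new();
    let mut pos = n;
    while pos > 0 {
        let cmd = step[pos];
        pos -= match cmd {
            Cmd::Lit(_) => 1,
            Cmd::Match { len, .. } => len,
        };
        cmds.push(cmd);
    }
    cmds.reverse();
    cmds
}

/// Runs `passes` (at least 1) DP passes, re-pricing literals from the
/// previous pass's literal histogram each round.
fn optimal_parse(data: &[u8], passes: usize) -> Vec<Cmd> {
    let cands = candidates(data);
    let mut hist = [0u32; 256];
    let mut cmds = Vec::new();
    for _ in 0..passes {
        cmds = parse_once(data, &cands, &literal_costs(&hist));
        hist = [0u32; 256];
        for cmd in &cmds {
            if let Cmd::Lit(b) = cmd {
                hist[*b as usize] += 1;
            }
        }
    }
    cmds
}

/// Replays the commands to check the parse round-trips.
fn reconstruct(cmds: &[Cmd]) -> Vec<u8> {
    let mut out = Vec::new();
    for &cmd in cmds {
        match cmd {
            Cmd::Lit(b) => out.push(b),
            Cmd::Match { len, dist } => {
                for _ in 0..len {
                    let b = out[out.len() - dist];
                    out.push(b);
                }
            }
        }
    }
    out
}
```

A real encoder would price commands, match lengths, and distances from their own histograms as well. The sketch keeps only the literal feedback so that the pass structure stays visible.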
When an explicit chain match's distance coincides with a recent ring distance, charge the cheaper of the full distance-symbol cost and the short-code cost so the DP prefers ring-reusing matches. corpus.dat q11: 669880 -> 669632 bytes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
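As a rough sketch of that pricing rule: the function name, the flat `full_cost`/`short_cost` parameters, and the exact-match-only ring lookup below are assumptions for illustration. The four-entry ring of recent distances follows the brotli format, but the real encoder's cost lookup is not shown in this PR text.

```rust
/// Price of coding `dist` in bits: if it equals one of the last four
/// distances, charge whichever of the two encodings is cheaper.
fn distance_cost(dist: u32, ring: &[u32; 4], full_cost: f64, short_cost: f64) -> f64 {
    if ring.contains(&dist) {
        full_cost.min(short_cost)
    } else {
        full_cost
    }
}
```

Taking the minimum rather than always using the short code keeps the estimate sound in the case where the short code happens to be the more expensive symbol under the current histogram.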
The match cache makes the DP cheap enough to run at q9 within a few seconds. corpus.dat q9: 709198 -> 680156 bytes; quality ordering stays monotonic (q9 > q10 > q11). Reference cross-decode verified across random, text, cyclic, all-zero, tiny, and corpus inputs at q9/q10/q11. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two follow-up items from the ratio/speed work, both independent and verified end-to-end.
1. Fix: CLI `-d` truncated highly-compressible large inputs
`compcol -t bzip2 -d` (and any block-buffering decoder) truncated output at 64 KiB on large compressible inputs. The streaming decode loop stopped once the compressed input was consumed (its `consumed < n` guard), leaving output the decoder had buffered internally (a whole bzip2 block larger than the output buffer) undrained, and `finish` doesn't flush it. Added a drain loop that pulls the decoder's buffered output before finishing.
`bzip2 -d` decoded it fine.
`compcol -c | compcol -d` for bzip2/gzip/xz/zstd/lzma/lz4/brotli (was 65536 of 5000000 for bzip2). Regression test added (tests/cli.rs).
2. brotli encoder: iterative optimal parse, 1.48 → 1.39 vs `brotli -q 11`
The last remaining ratio gap. Added a zopfli-style iterative, statistics-driven optimal LZ77 parse (src/brotli/encoder_optimal.rs) at quality 9–11: a forward DP whose cost model is rebuilt from the previous pass's actual command/literal/distance histograms each round (2 passes at q9/q10, 3 at q11), with candidate matches cached across passes. This is the distance-reuse feedback loop that gives brotli q10/q11 its edge.
`brotli -q 11` = 480480; 1.253 vs `-q 9`). Encode ~4.7 s at q11; greedy retained as fallback and for q≤8.
`compcol -t brotli -c -l N | brotli -d | cmp` byte-exact across inputs × q0/q4/q9/q10/q11.
`-q 9` is structural: q9–q11 also do block splitting with separate command/distance context trees and NPOSTFIX distance coding, which this encoder doesn't yet do.
Checks
`cargo test --all-features`: 61 suites green, 0 failures (incl. the new CLI drain regression test).
`cargo fmt --check`, `cargo clippy --all-features --all-targets -D warnings`, rustdoc `-D warnings`: clean.
🤖 Generated with Claude Code