Add QPACK (RFC 9204) + standalone Huffman / range-coder / MTF / BWT codecs #92
Merged
… BWT codecs

Five new codecs, all clean-room from public specs or built from machinery already in the tree. No `unsafe`, no new dependencies.

- **qpack** (`compcol::qpack`): HTTP/3 QPACK header compression (RFC 9204). Full decoder: static table (Appendix A), dynamic table built from the encoder-stream instructions (set-capacity / insert-with-name-ref / insert-with-literal-name / duplicate), and every field-line representation (indexed static/dynamic/post-base, literal with name ref, literal name). Encoder uses the static table + literals (Required Insert Count = 0), which is fully interoperable; dynamic-table *encoding* is a documented future extension. Reuses the HPACK string Huffman code and `HeaderField`. Validated byte-for-byte against RFC 9204 Appendix B (B.1–B.5).
- **huffman** (`compcol::huffman_codec::Huffman`, name "huffman"): standalone self-delimiting canonical (length-limited, order-0) Huffman codec. It builds the code from the input's own byte statistics and serialises it into the stream. Self-contained (does not depend on the deflate-internal huffman).
- **rangecoder** (`compcol::rangecoder::RangeCoder`, name "range"): adaptive order-0 binary range coder (LZMA-style carry-less coder + 255-node bit-tree model). ~44x on 64 KiB of zeros; bounded expansion on incompressible input.
- **mtf** (`compcol::mtf::Mtf`, name "mtf"): Move-To-Front reversible transform, a streaming, length-preserving byte filter (as used in bzip2).
- **bwt** (`compcol::bwt::Bwt`, name "bwt"): standalone block Burrows-Wheeler Transform with a per-block primary index (prefix-doubling suffix array forward, LF-mapping inverse), block size configurable.

The four `Algorithm`-based codecs are registered in the factory (by-name + extension); QPACK is a stateful header-codec module API like HPACK.

Validation: `cargo test --all-features` green (61 suites), `cargo fmt`, `cargo clippy --all-features --all-targets -D warnings`, and rustdoc all clean; each feature builds standalone under `--no-default-features`.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
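
To make the encoder mode in the commit message concrete (static table + literals, Required Insert Count = 0), here is a minimal standalone sketch of the wire format involved. It is not the crate's `QpackEncoder`; the function names are hypothetical, string values are emitted without Huffman coding (H = 0), and only two field-line representations are shown.

```rust
/// Append `value` as a prefix integer (RFC 7541 §5.1, reused by RFC 9204).
/// `prefix_bits` is N in 1..=8; `flags` carries the pattern bits above the prefix.
fn encode_prefix_int(out: &mut Vec<u8>, flags: u8, prefix_bits: u8, mut value: u64) {
    let max = (1u64 << prefix_bits) - 1;
    if value < max {
        // Small values fit directly in the prefix.
        out.push(flags | value as u8);
        return;
    }
    // Otherwise saturate the prefix and continue in 7-bit groups, LSB first.
    out.push(flags | max as u8);
    value -= max;
    while value >= 128 {
        out.push((value % 128) as u8 | 0x80);
        value /= 128;
    }
    out.push(value as u8);
}

/// Encoded field section prefix with Required Insert Count = 0 and Base = 0,
/// i.e. a section that never touches the dynamic table.
fn encode_section_prefix(out: &mut Vec<u8>) {
    encode_prefix_int(out, 0x00, 8, 0); // Required Insert Count (8-bit prefix)
    encode_prefix_int(out, 0x00, 7, 0); // S = 0, Delta Base (7-bit prefix)
}

/// Indexed field line referencing the static table: bits `1 T=1 index(6+)`.
fn encode_indexed_static(out: &mut Vec<u8>, index: u64) {
    encode_prefix_int(out, 0b1100_0000, 6, index);
}

/// Literal field line with a static name reference: bits `0 1 N=0 T=1 index(4+)`,
/// then the value as a plain string: `H=0 length(7+)` followed by the raw bytes.
fn encode_literal_static_name(out: &mut Vec<u8>, name_index: u64, value: &[u8]) {
    encode_prefix_int(out, 0b0101_0000, 4, name_index);
    encode_prefix_int(out, 0x00, 7, value.len() as u64);
    out.extend_from_slice(value);
}
```

Calling `encode_section_prefix` followed by `encode_literal_static_name(&mut buf, 1, b"/index.html")` should yield `00 00 51 0b` plus the 11 value bytes, which matches the literal-with-name-reference example in RFC 9204 Appendix B.1. A decoder that never receives encoder-stream instructions can still read such a section, which is why this mode interoperates without dynamic-table encoding.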
Follow-up to the codec-gap review: adds the high-value, low-risk batch, namely the HTTP-stack sibling of HPACK plus a set of standalone building blocks. All clean-room from public specs or built from machinery already in the tree. No `unsafe`, no new dependencies.

## What's new
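| Feature | API / factory name |
| --- | --- |
| `qpack` | `compcol::qpack::{QpackEncoder, QpackDecoder}` |
| `huffman` | `compcol::huffman_codec::Huffman` / `"huffman"` |
| `rangecoder` | `compcol::rangecoder::RangeCoder` / `"range"` |
| `mtf` | `compcol::mtf::Mtf` / `"mtf"` |
| `bwt` | `compcol::bwt::Bwt` / `"bwt"` |

## QPACK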
Full decoder: static table (Appendix A, 99 entries), dynamic table built from the encoder-stream instructions (set-capacity / insert-with-name-ref / insert-with-literal-name / duplicate), and every field-line representation (indexed static/dynamic/post-base, literal with name reference, literal name). Encoder uses the static table + literals (Required Insert Count = 0), which is fully interoperable; dynamic-table encoding is a deliberate, documented future extension. Reuses the HPACK §5.2 string Huffman code and `HeaderField` (so `qpack` pulls `hpack`). Like HPACK, it's a stateful header-codec module API, not a byte-stream codec.

## Standalone primitives
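
As a rough illustration of the kind of primitive this section covers, a Move-To-Front byte filter can be sketched in a few lines. This is a minimal standalone sketch, not the crate's `compcol::mtf::Mtf` (which is described as streaming); the function names `mtf_encode` and `mtf_decode` are hypothetical and operate on whole buffers.

```rust
/// Move-To-Front encode: each output byte is the current position of the
/// input byte in a recency list, after which that byte moves to the front.
fn mtf_encode(input: &[u8]) -> Vec<u8> {
    // The recency list starts as the identity permutation 0, 1, ..., 255.
    let mut table: Vec<u8> = (0..=255u8).collect();
    let mut out = Vec::with_capacity(input.len());
    for &byte in input {
        // Every byte value is always present, so the search cannot fail.
        let pos = table
            .iter()
            .position(|&b| b == byte)
            .expect("table holds all 256 byte values");
        out.push(pos as u8);
        // Rotate the prefix right by one so `byte` lands at index 0.
        table[..=pos].rotate_right(1);
    }
    out
}

/// Inverse transform: replay the same list updates, reading positions
/// instead of searching for byte values.
fn mtf_decode(input: &[u8]) -> Vec<u8> {
    let mut table: Vec<u8> = (0..=255u8).collect();
    let mut out = Vec::with_capacity(input.len());
    for &pos in input {
        let pos = pos as usize;
        out.push(table[pos]);
        table[..=pos].rotate_right(1);
    }
    out
}
```

The output is always the same length as the input, and runs of a repeated byte collapse to runs of zeros (for example, `b"aaa"` encodes to `[97, 0, 0]`), which is what makes the transform useful ahead of an entropy coder.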
These expose entropy-coding and reversible-transform building blocks the crate previously only used internally; `mtf` + `bwt` + `rangecoder` together are the pieces of a bzip2-style pipeline, usable à la carte. The four are registered in the factory (by-name lookup + CLI extension).

## Method
Implemented by five parallel agents in isolated worktrees (one module each, new files only); the shared wiring (`Cargo.toml`, `lib.rs`, `factory.rs`) was done centrally to keep it conflict-free and consistent. A couple of bugs the agents caught and fixed during their own validation: a BWT counting-sort bucket overflow on small blocks, and a Huffman length-table RLE command collision.

## Checks
- `cargo test --all-features`: 61 suites green, 0 failures (incl. QPACK Appendix B vectors and a new `tests/new_codecs_batch.rs` factory round-trip + QPACK API test).
- `cargo fmt --check`, `cargo clippy --all-features --all-targets -D warnings`, rustdoc (`-D warnings`): all clean.
- Each feature builds standalone under `--no-default-features`.

## Not in this batch (from the same review)
bzip3, LZJB, ARJ/ZOO/Unix-pack, heatshrink/FastLZ, and the PAQ family are deferred (several carry licensing or reference-fixture risk worth a separate look). Completing existing `Unsupported` stubs (`lzma2` encoder, `lz5` Huffman, `lzfse` bvx2) is also still open.

🤖 Generated with Claude Code