Skip to content

Add QPACK (RFC 9204) + standalone Huffman / range-coder / MTF / BWT codecs#92

Merged
MagicalTux merged 1 commit into
masterfrom
add-codecs-batch
Jun 14, 2026
Merged

Add QPACK (RFC 9204) + standalone Huffman / range-coder / MTF / BWT codecs#92
MagicalTux merged 1 commit into
masterfrom
add-codecs-batch

Conversation

@MagicalTux

Copy link
Copy Markdown
Member

Follow-up to the codec-gap review: adds the high-value, low-risk batch — the HTTP-stack sibling of HPACK plus a set of standalone building blocks. All clean-room from public specs or built from machinery already in the tree. No unsafe, no new dependencies.

What's new

Codec Feature API / name Validation
HTTP/3 QPACK (RFC 9204) qpack compcol::qpack::{QpackEncoder, QpackDecoder} RFC 9204 Appendix B vectors (B.1–B.5), byte-exact
Canonical Huffman (standalone) huffman compcol::huffman_codec::Huffman / "huffman" round-trip + edge cases
Range coder (adaptive order-0) rangecoder compcol::rangecoder::RangeCoder / "range" round-trip + ratio + fuzz-style garbage
Move-To-Front transform mtf compcol::mtf::Mtf / "mtf" round-trip + streaming-equivalence
Burrows-Wheeler Transform bwt compcol::bwt::Bwt / "bwt" round-trip incl. multi-block + malformed-input

QPACK

Full decoder: static table (Appendix A, 99 entries), dynamic table built from the encoder-stream instructions (set-capacity / insert-with-name-ref / insert-with-literal-name / duplicate), and every field-line representation (indexed static/dynamic/post-base, literal with name reference, literal name). Encoder uses the static table + literals (Required Insert Count = 0) — fully interoperable; dynamic-table encoding is a deliberate, documented future extension. Reuses the HPACK §5.2 string Huffman code and HeaderField (so qpack pulls hpack). Like HPACK, it's a stateful header-codec module API, not a byte-stream codec.

Standalone primitives

These expose entropy-coding and reversible-transform building blocks the crate previously only used internally — mtf + bwt + rangecoder together are the pieces of a bzip2-style pipeline, usable à la carte. The four are registered in the factory (by-name lookup + CLI extension).

Method

Implemented by five parallel agents in isolated worktrees (one module each, new files only); the shared wiring (Cargo.toml, lib.rs, factory.rs) was done centrally to keep it conflict-free and consistent. A couple of bugs the agents caught and fixed during their own validation: a BWT counting-sort bucket overflow on small blocks, and a Huffman length-table RLE command collision.

Checks

  • cargo test --all-features61 suites green, 0 failures (incl. QPACK Appendix B vectors and a new tests/new_codecs_batch.rs factory round-trip + QPACK API test).
  • cargo fmt --check, cargo clippy --all-features --all-targets -D warnings, rustdoc (-D warnings) — all clean.
  • Each new feature builds standalone under --no-default-features.

Not in this batch (from the same review)

bzip3, LZJB, ARJ/ZOO/Unix-pack, heatshrink/FastLZ, and the PAQ family — deferred (several carry licensing or reference-fixture risk worth a separate look). Completing existing Unsupported stubs (lzma2 encoder, lz5 Huffman, lzfse bvx2) is also still open.

🤖 Generated with Claude Code

… BWT codecs

Five new codecs, all clean-room from public specs or built from
machinery already in the tree. No `unsafe`, no new dependencies.

- **qpack** (`compcol::qpack`): HTTP/3 QPACK header compression (RFC 9204).
  Full decoder — static table (Appendix A), dynamic table built from the
  encoder-stream instructions (set-capacity / insert-with-name-ref /
  insert-with-literal-name / duplicate), and every field-line representation
  (indexed static/dynamic/post-base, literal with name ref, literal name).
  Encoder uses the static table + literals (Required Insert Count = 0), which
  is fully interoperable; dynamic-table *encoding* is a documented future
  extension. Reuses the HPACK string Huffman code and `HeaderField`. Validated
  byte-for-byte against RFC 9204 Appendix B (B.1–B.5).

- **huffman** (`compcol::huffman_codec::Huffman`, name "huffman"): standalone
  self-delimiting canonical (length-limited, order-0) Huffman codec — builds
  the code from the input's own byte statistics and serialises it into the
  stream. Self-contained (does not depend on the deflate-internal huffman).

- **rangecoder** (`compcol::rangecoder::RangeCoder`, name "range"): adaptive
  order-0 binary range coder (LZMA-style carry-less coder + 255-node bit-tree
  model). ~44x on 64 KiB of zeros; bounded expansion on incompressible input.

- **mtf** (`compcol::mtf::Mtf`, name "mtf"): Move-To-Front reversible
  transform — a streaming, length-preserving byte filter (as used in bzip2).

- **bwt** (`compcol::bwt::Bwt`, name "bwt"): standalone block Burrows-Wheeler
  Transform with a per-block primary index (prefix-doubling suffix array
  forward, LF-mapping inverse), block size configurable.

The four `Algorithm`-based codecs are registered in the factory
(by-name + extension); QPACK is a stateful header-codec module API like HPACK.

Validation: `cargo test --all-features` green (61 suites), `cargo fmt`,
`cargo clippy --all-features --all-targets -D warnings`, and rustdoc all clean;
each feature builds standalone under `--no-default-features`.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@MagicalTux MagicalTux merged commit 06e7aac into master Jun 14, 2026
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant