perf(lzss,huffman): hash-chain match finder + table Huffman decode by MagicalTux · Pull Request #111 · KarpelesLab/compcol

MagicalTux · 2026-07-01T00:29:06Z

Reviewed the whole codec suite for optimization headroom (benchmarked encode+decode throughput across every algorithm) and kept the changes that clear the ≥10% bar. Two clear algorithmic wins:

lzss encode — O(N·n) brute force → hash chain

The encoder compared every position against all 4096 ring-buffer slots, i.e. O(N·n) for ring size N and input length n, regardless of content, so incompressible input collapsed to ~0.3 MB/s (~110k instructions/byte). Replaced it with a hash-chain finder over the raw input, which translates a match source at input position cand into the ring index the decoder expects: (cand + N - F) & (N - 1).

Output size is unchanged: it depends only on match lengths, which the fully-walked chain reproduces; only the tie-broken source position can differ.
~9× faster on text, ~700× on random (at 1 MiB); zeros neutral. Compressed sizes within 0.01% across text/binary/zeros/source.
u32 chain array keeps the fixed allocation small.
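
For illustration, here is a minimal sketch of a hash-chain finder of this kind. It is not the code merged in this PR. `F = 18`, `MIN_MATCH = 3`, the 12-bit hash, and the names `HashChain`, `Token`, `ring_index`, and `tokenize` are all assumptions borrowed from classic LZSS, and the decoder's write cursor is assumed to start at `N - F`.

```rust
const N: usize = 4096; // ring-buffer size (the 4096 slots mentioned above)
const F: usize = 18; // ASSUMED maximum match length (classic LZSS value)
const MIN_MATCH: usize = 3; // ASSUMED shortest match worth encoding
const HASH_BITS: u32 = 12; // hypothetical hash-table size: 4096 buckets
const NIL: u32 = u32::MAX; // "no position" marker in the u32 arrays

#[derive(Debug, PartialEq)]
enum Token {
    Literal(u8),
    Match { ring: usize, len: usize },
}

/// Ring index at which the decoder holds input position `cand`,
/// assuming its write cursor starts at N - F.
fn ring_index(cand: usize) -> usize {
    (cand + N - F) & (N - 1)
}

/// Multiplicative hash of the three bytes starting at `pos`.
fn hash3(data: &[u8], pos: usize) -> usize {
    let v = data[pos] as u32 | (data[pos + 1] as u32) << 8 | (data[pos + 2] as u32) << 16;
    (v.wrapping_mul(2_654_435_761) >> (32 - HASH_BITS)) as usize
}

/// Positions are stored as u32, so this sketch assumes inputs below 4 GiB.
struct HashChain {
    head: Vec<u32>, // most recent position per hash bucket
    prev: Vec<u32>, // older position with the same hash, indexed by pos & (N - 1)
}

impl HashChain {
    fn new() -> Self {
        HashChain { head: vec![NIL; 1 << HASH_BITS], prev: vec![NIL; N] }
    }

    /// Record `pos` in its chain. Call only after searching at `pos`.
    fn insert(&mut self, data: &[u8], pos: usize) {
        if pos + MIN_MATCH > data.len() {
            return;
        }
        let h = hash3(data, pos);
        self.prev[pos & (N - 1)] = self.head[h];
        self.head[h] = pos as u32;
    }

    /// Walk the whole chain and return (source position, length) of the
    /// longest match; on equal lengths the most recent candidate wins.
    fn longest_match(&self, data: &[u8], pos: usize) -> Option<(usize, usize)> {
        if pos + MIN_MATCH > data.len() {
            return None;
        }
        let max_len = F.min(data.len() - pos);
        let mut best = None;
        let mut best_len = MIN_MATCH - 1;
        let mut cand = self.head[hash3(data, pos)];
        while cand != NIL {
            let c = cand as usize;
            // Chains run newest to oldest, so the first candidate outside
            // the window ends the walk.
            if c >= pos || pos - c > N - F {
                break;
            }
            let len = (0..max_len).take_while(|&i| data[c + i] == data[pos + i]).count();
            if len > best_len {
                best_len = len;
                best = Some((c, len));
                if len == max_len {
                    break;
                }
            }
            cand = self.prev[c & (N - 1)];
        }
        best
    }
}

/// Greedy tokenizer: emit the longest match at each position, else a literal.
fn tokenize(data: &[u8]) -> Vec<Token> {
    let mut chain = HashChain::new();
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < data.len() {
        let step = match chain.longest_match(data, pos) {
            Some((src, len)) => {
                out.push(Token::Match { ring: ring_index(src), len });
                len
            }
            None => {
                out.push(Token::Literal(data[pos]));
                1
            }
        };
        for p in pos..pos + step {
            chain.insert(data, p);
        }
        pos += step;
    }
    out
}
```

With these assumed constants, `tokenize(b"abcabcabcabc")` yields three literals followed by one match of length 9 at ring index 4078, which is `ring_index(0)`. Any match of three or more bytes shares its first three bytes with the current position, so it lands on the same chain. That is why a full walk finds the same best length as the exhaustive scan.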

huffman decode — bit-by-bit → table lookup

The standalone canonical-Huffman decoder walked each code one bit at a time (a BitReader call per bit). It now builds a single peek-and-lookup table indexed by the next max_length bits (≤15 ⇒ ≤64 KiB) and decodes one symbol per lookup.

~1.9–2.1× fewer decode instructions (deterministic callgrind) on both text and high-entropy input.
Output identical; corrupt/truncated streams still rejected (Corrupt/UnexpectedEnd) without panicking.
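
The peek-and-lookup idea can be sketched as follows. This is a sketch under stated assumptions, not the decoder from this PR: codes are assumed to be packed MSB-first, the alphabet is assumed to be at most 256 symbols (so each entry fits in 2 bytes and a 15-bit table is 64 KiB), and `Table`, `Entry`, `build_table`, and `decode` are hypothetical names. Only the error names `Corrupt` and `UnexpectedEnd` come from the description above.

```rust
#[derive(Debug, PartialEq)]
enum DecodeError {
    Corrupt,
    UnexpectedEnd,
}

/// One table slot; `len == 0` marks a bit pattern no code maps to.
#[derive(Clone, Copy)]
struct Entry {
    symbol: u8,
    len: u8,
}

struct Table {
    max_len: u32,
    entries: Vec<Entry>, // 1 << max_len slots
}

/// Build the lookup table from canonical code lengths (0 = unused symbol).
fn build_table(lengths: &[u8]) -> Result<Table, DecodeError> {
    let max_len = lengths.iter().copied().max().unwrap_or(0) as u32;
    if max_len == 0 || max_len > 15 || lengths.len() > 256 {
        return Err(DecodeError::Corrupt);
    }
    // Count codes per length, then derive the first canonical code of each length.
    let mut count = [0u32; 16];
    for &l in lengths {
        count[l as usize] += 1;
    }
    count[0] = 0;
    let mut next = [0u32; 16];
    let mut code = 0u32;
    for l in 1..=max_len as usize {
        code = (code + count[l - 1]) << 1;
        next[l] = code;
    }
    let mut entries = vec![Entry { symbol: 0, len: 0 }; 1 << max_len];
    for (sym, &l) in lengths.iter().enumerate() {
        if l == 0 {
            continue;
        }
        let c = next[l as usize];
        next[l as usize] += 1;
        if c >= (1u32 << l) {
            return Err(DecodeError::Corrupt); // over-subscribed length set
        }
        // A code of length l owns every slot whose top l bits equal the code.
        let shift = max_len - l as u32;
        let start = (c << shift) as usize;
        for e in &mut entries[start..start + (1 << shift)] {
            *e = Entry { symbol: sym as u8, len: l };
        }
    }
    Ok(Table { max_len, entries })
}

/// Decode `n_symbols` symbols, one table lookup each.
fn decode(table: &Table, data: &[u8], n_symbols: usize) -> Result<Vec<u8>, DecodeError> {
    let mut acc: u64 = 0; // unread bits, oldest bit in the highest position
    let mut acc_bits: u32 = 0;
    let mut bytes = data.iter();
    let mut out = Vec::with_capacity(n_symbols);
    for _ in 0..n_symbols {
        // Refill a byte at a time until max_len bits are available or input ends.
        while acc_bits < table.max_len {
            match bytes.next() {
                Some(&b) => {
                    acc = (acc << 8) | b as u64;
                    acc_bits += 8;
                }
                None => break,
            }
        }
        // Peek max_len bits, zero-padded when the stream is nearly exhausted.
        let window = if acc_bits >= table.max_len {
            acc >> (acc_bits - table.max_len)
        } else {
            acc << (table.max_len - acc_bits)
        };
        let e = table.entries[window as usize];
        if e.len == 0 {
            return Err(DecodeError::Corrupt);
        }
        if e.len as u32 > acc_bits {
            return Err(DecodeError::UnexpectedEnd);
        }
        acc_bits -= e.len as u32;
        acc &= (1u64 << acc_bits) - 1; // drop the consumed bits
        out.push(e.symbol);
    }
    Ok(out)
}
```

For example, lengths `[1, 2, 3, 3]` give the canonical codes 0, 10, 110, 111, so the bytes `[0x5B, 0x80]` decode to symbols `[0, 1, 2, 3, 0]`. Passing only `[0x5B]` and asking for four symbols returns `UnexpectedEnd`, because the zero-padded peek selects a 3-bit code while only 2 real bits remain.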

Notes

Other slow paths were checked and left alone as already-optimal or inherently costly: the range coder (8-bit bit-tree, ~37 instr/bit), h2-huffman (already a byte-wide FSA), mtf (already single-pass; random is a non-goal), bwt (SA-IS). The multi-MB, highly-repetitive lzss case is ~30% slower than the old early-breaking brute force but stays >270 MB/s — an acceptable trade for fixing the 700× worst case.

Verification

Full suite (61 binaries), clippy, fmt clean. lzss ratio preserved + round-trips; 60-case huffman fuzz round-trips, and 30 corrupt inputs pass through our decoder without panic.

🤖 Generated with Claude Code

Reviewed the codec suite for optimization headroom (bench across every algorithm). Two clear algorithmic wins, both keeping output correct:

lzss encode: the finder compared each position against all 4096 ring-buffer slots, which is O(N·n) regardless of content, so incompressible input collapsed to ~0.3 MB/s. Replace it with a hash chain over the raw input (translating a match source at input position `cand` to the decoder's ring index `(cand + N - F) & (N - 1)`). Output size is unchanged because it depends only on match lengths, which the fully-walked chain reproduces; only the tie-broken source position can differ. ~9x faster on text, ~700x on random at 1 MiB; compressed sizes within 0.01% across text/binary/zeros/code.

huffman decode: the canonical decoder walked each code one bit at a time (one BitReader call per bit). Build a single peek-and-lookup table indexed by the next max_length bits (<= 15, so <= 64 KiB) and decode a symbol per lookup. ~1.9-2.1x fewer decode instructions on both text and high-entropy input; output identical, corrupt/truncated streams still rejected without panic.

Verified: full suite (61 binaries), clippy, fmt clean; lzss ratio preserved and round-trips; 60-case huffman fuzz + 30 corrupt inputs round-trip through our decoder without panic.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

MagicalTux merged commit ca385e2 into master Jul 1, 2026
42 checks passed

MagicalTux deleted the perf/lzss-huffman-decode branch July 1, 2026 00:31
