Skip to content

Resolve all open audit findings (CQ/C/P/ST/S/RFC/CI)#54

Merged
eilandert merged 20 commits into
masterfrom
fix/audit-remaining
Jun 12, 2026
Merged

Resolve all open audit findings (CQ/C/P/ST/S/RFC/CI)#54
eilandert merged 20 commits into
masterfrom
fix/audit-remaining

Conversation

@eilandert

Copy link
Copy Markdown
Owner

Resolves every remaining finding in the module audit (TODO.md). One commit per issue; all CI items grouped into the final commit per request.

Correctness / stability / perf

  • CQ2 enforce libzstd 1.4.0 floor in the feature probe; drop the fictional ZSTD_initCStream_usingCDict fallback note.
  • ST1 recycle consumed input chain links (ngx_free_chain) — bounds long-lived chunked/SSE worker memory.
  • ST2 free the CDict if ngx_pool_cleanup_add() fails (reload leak).
  • P1 capture body length before clear_content_length so ZSTD_CCtx_setPledgedSrcSize() actually runs.
  • P2 clear flush/sync/last_buf/last_in_chain on recycled output buffers.
  • C1 gate zstd_target_cblock_size on ZSTD_VERSION_NUMBER >= 10506 (the #ifdef on an enum was always false → permanent no-op).
  • C3 validate window_log/target_cblock via ZSTD_cParam_getBounds at config load.
  • C2 drop the premature dict merge; build a per-(level,window_log) CDict so a child level isn't silently ignored. Document CDict parameter precedence.
  • C5 only warn about gzip_vary for zstd_static on (not always).
  • RFC4 skip 206 Partial Content (preserves Content-Range).

Security / standards

  • S1 new zstd_bypass_vary directive so header-driven bypass can't poison shared caches.
  • RFC1 refuse to start with zstd_dict_file unless zstd_dict_file_unsafe on; (non-RFC-9842 Content-Encoding: zstd dict mode).
  • RFC2 rewrite the Accept-Encoding parser: RFC 9110 * wildcard, integer-milli qvalues, strict grammar (reject q=, 4th decimal, q=1x). Function names preserved for the fuzzer extractor.
  • RFC3 document server-preference selection (not client qvalue ranking).
  • C4 document zstd vs brotli precedence for dynamic builds (reverse load_module order).

CI (one commit)

  • CI1/CI2 gate shellcheck (error severity), cppcheck (+ checked-in suppressions baseline), and scan-build (both modules, exit no longer swallowed); require + verify generated nginx headers.
  • CI3/CI8 regressions for target_cblock, 206/Content-Range, zstd_bypass_vary.
  • CI4 replace the fake chunked min-length test with a raw chunked HTTP/1.1 backend.
  • CI5 independent reference parser as a libFuzzer semantic oracle.
  • CI6 resolve + build nginx stable; new tests-compat job runs the Perl suites + smoke against nginx stable and Angie.
  • CI7 run the Perl suites under ASAN+UBSAN.

Local validation

  • Module compiles clean (-Werror) against libzstd 1.5.7.
  • t/00-filter.t 786 subtests pass (incl. new wildcard/malformed-q/chunked/206/bypass_vary/target_cblock tests).
  • t/01-static.t 294 subtests pass.
  • Parser unit table (22 cases) + libFuzzer oracle 10M+ runs, no traps/false positives.
  • cppcheck / shellcheck / scan-build gates pass locally.

eilandert added 16 commits June 12, 2026 15:52
The feature probe now compile-time-asserts ZSTD_VERSION_NUMBER >= 10400 (the
hard floor established by ZSTD_compressStream2 / ZSTD_CCtx_reset, both of which
the module uses unconditionally), so an older library fails configure cleanly.

Remove the version advisory's fictional claim that builds below 1.5.0 fall back
to the deprecated ZSTD_initCStream_usingCDict() API — no such fallback exists in
the C code, which always calls ZSTD_CCtx_reset(). Replace it with a single
accurate note: the only version-gated directive is zstd_target_cblock_size
(libzstd >= 1.5.6).

Addresses audit CQ2.
ngx_chain_add_copy() allocates a fresh chain link for every incoming link, and
ngx_http_zstd_filter_add_data() advanced ctx->in past the consumed link without
freeing it. The links accumulated in the request pool for the whole request, so
a long-lived chunked/SSE response grew worker memory linearly with chunk count
even though output buffers are recycled. Return the consumed link with
ngx_free_chain(); the buffer itself stays valid.

Addresses audit ST1.
After ZSTD_createCDict() succeeded, a failing ngx_pool_cleanup_add() jumped to
close without freeing conf->dict. On a failed reload nginx keeps running while
the master retains the external libzstd allocation. Free the CDict and clear the
pointer before erroring out.

Addresses audit ST2.
The header filter calls ngx_http_clear_content_length() before the first body
filter runs, so init_cctx's content_length_n check always saw -1 and
ZSTD_CCtx_setPledgedSrcSize() never ran — the documented automatic pledge was
dead. Capture the body length into ctx->pledged_size before clearing the header
and pledge from that, giving the encoder a compact exact-size frame header.

Addresses audit P1.
ngx_chain_update_chains() resets pos/last but not flush/sync/last_buf/
last_in_chain. The filter sets flush/last_buf on a buffer before sending it
downstream, then reuses that buffer from ctx->free; the stale flags could make a
later ordinary data buffer trigger a spurious downstream flush or a false
end-of-stream marker. Clear the control flags on the reuse path.

Addresses audit P2.
ZSTD_c_targetCBlockSize is an enum member in libzstd, not a preprocessor macro,
so every #ifdef/#ifndef guard on it evaluated the wrong way: the runtime setter
and the memory-estimator path were compiled out even on libzstd >= 1.5.6 (the
directive was a permanent no-op), while the "unsupported" config-load warning
was always compiled in and fired even where the directive does work.

Gate all three sites on ZSTD_VERSION_NUMBER (>= 10506 for the apply/estimate
paths, < 10506 for the warning). README updated to match.

Addresses audit C1.
Replace the hard-coded windowLog bounds (10..30/31) with a config-load query to
ZSTD_cParam_getBounds(ZSTD_c_windowLog) — a stable-API call that needs no
ZSTD_STATIC_LINKING_ONLY and tracks the linked library instead of drifting from
inlined constants. Add the same getBounds validation for zstd_target_cblock_size
on libzstd >= 1.5.6: now that C1 made its runtime apply path live, an
out-of-range value must be rejected at nginx -t instead of failing
ZSTD_CCtx_setParameter() for every request (a per-location 500 storm). 0 stays
the unset value for both.

Addresses audit C3.
The merge handler warned for every enabled value, including "always". But
"always" ignores Accept-Encoding, intentionally does not set r->gzip_vary, and
emits no Vary header, so the warning asked operators to add a header that would
misdescribe the response. Restrict the warning to "on" (the negotiated mode).

Addresses audit C5.
The 'all 2xx' eligibility gate included 206. An upstream 206 already carries a
Content-Range computed against its selected representation; applying a new
content coding invalidates that Content-Range (RFC 9110 §14.1.2 requires ranges
for an encoded representation to be computed against the encoded byte sequence),
and the filter clears only Accept-Ranges, leaving the stale Content-Range. Skip
NGX_HTTP_PARTIAL_CONTENT, matching nginx's gzip filter status gate.

Addresses audit RFC4.
ngx_conf_merge_ptr_value(conf->dict, prev->dict, NULL) copied the parent CDict
pointer before the level comparison, so conf->dict was always non-NULL and the
'build a fresh child CDict' branch could never run. A child location that
changed zstd_comp_level silently kept the parent-level dictionary, making its
level ineffective.

Remove the premature merge and key CDict reuse on both zstd_comp_level and
zstd_window_log (the CDict-affecting parameters); otherwise build the dict fresh
for this location. Document that, per libzstd, CDict-baked parameters supersede
the CCtx parameters — so zstd_window_log is not a hard cap and
zstd_max_cctx_memory's estimate can differ when a dictionary is loaded.

Addresses audit C2.
The README promised zstd always runs before brotli, but that fixed ordering only
holds for static builds (filter/config places zstd ahead of brotli in the module
array). For dynamic modules the two share a filter anchor and the chain is built
in reverse load_module order, so whichever is loaded last wins. Document the
required load_module order to make zstd win a 'br, zstd' negotiation, and note
that a static build is the way to guarantee a fixed winner.

Addresses audit C4.
A documented example permits header-driven bypass (zstd_bypass
$http_x_no_compression). Compression then varies on that request header, but the
module emitted no corresponding Vary, so a shared cache could store the bypassed
identity response and serve it to ordinary clients (or vice versa) — cache
poisoning.

Add a zstd_bypass_vary directive that appends the named field to the response
Vary header on both the compressed and the bypassed variant (an extra Vary line;
caches union all Vary fields, so it coexists with Vary: Accept-Encoding). Document
that request-header/cookie bypass predicates require either zstd_bypass_vary or
Cache-Control: private/no-store.

Addresses audit S1.
zstd_dict_file compresses with an external dictionary but still sends
Content-Encoding: zstd. That is not HTTP dictionary negotiation: RFC 9842
(Sept 2025) defines the dcz content coding and Available-Dictionary for that. A
generic client advertising only "zstd" cannot decode the response, and a shared
cache keys it as an ordinary zstd variant.

Until dcz is implemented, refuse to start when zstd_dict_file is set unless the
operator also sets the new zstd_dict_file_unsafe flag, acknowledging the
non-standard control-both-ends-only mode. Test/CI dictionary configs updated to
add the opt-in. README documents the requirement and the RFC 9842 reference.

Addresses audit RFC1.
The Accept-Encoding parser ignored "*", accepted "q=" as q=1, accepted more
than three fractional digits, accepted trailing junk (q=1x), and returned on the
first non-zero fractional digit before validating the rest of the token. So
"*;q=1", "zstd;q=", "zstd;q=0.0001" and "zstd;q=1x" all negotiated
incorrectly.

Rewrite both helpers to parse each weight into integer milli-units (0..1000)
with full-grammar consumption and at most three decimals, and to track an
explicit "zstd" weight plus the "*" wildcard weight separately. Per RFC 9110
§12.5.3 the wildcard matches zstd when no explicit zstd token is present; an
explicit zstd token (even q=0) overrides it. Function names are unchanged so the
libFuzzer extractor still slices them; generated_parser.inc regenerated. Adds
00-filter.t coverage (wildcard, explicit-over-wildcard, q=1x, q=0.0001) and
fuzz corpus seeds.

Addresses audit RFC2.
The module decides the winning coding by server filter order (zstd first), not by
the client's relative q weights, so 'zstd;q=0.5, gzip;q=0.9' still yields zstd. A
single per-coding filter cannot implement RFC 9110 §12.5.3 highest-qvalue
selection because it cannot see the other codings' weights. Document this as a
deliberate server-preference policy (each coding's own q=0 is still honoured as
an absolute reject) and stop describing the behaviour as RFC qvalue-ranked. Also
refresh the stale RFC 7231 reference to RFC 9110.

Addresses audit RFC3.
Groups the audit's CI findings (CI1-CI8):

CI1: shellcheck now gates on error severity (no blanket '|| true'); cppcheck
  drops continue-on-error and gates on warning/performance/portability against a
  checked-in tools/cppcheck-suppressions.txt baseline.
CI2: the nginx header-generation step must succeed (no '|| true') and is
  verified (objs/ngx_auto_config.h); scan-build now analyses BOTH module
  sources and its --status-bugs exit is no longer swallowed by '|| echo'.
CI3/CI8: add 00-filter.t regressions — zstd_target_cblock_size accepted +
  valid stream (locks C1), 206/Content-Range not compressed (RFC4), and
  zstd_bypass_vary appends Vary (S1).
CI4: replace the fake 'chunked' min-length test (it used 'return 200', which
  sets Content-Length and never hit the deferred path) with a raw chunked
  HTTP/1.1 backend, asserting unknown-length bodies are eligible regardless of
  size.
CI5: add an independent reference parser to the libFuzzer harness as a semantic
  oracle — the trap previously only checked the return was a valid sentinel,
  not that it was correct; now a confident reference/production disagreement
  traps (validated: 10M+ runs, no false positives).
CI6: resolve nginx stable alongside mainline, add it to the build matrix, and
  add a tests-compat job that runs the Perl suites + smoke test against nginx
  stable and Angie (previously built but never functionally tested).
CI7: run t/00-filter.t and t/01-static.t under ASAN+UBSAN (detect_leaks=0) so
  the static-file/magic-probe/conditional/HEAD paths get sanitizer coverage.

Addresses audit CI1-CI8.
Comment thread ngx_http_zstd_common.h Fixed
CodeQL (cpp/integer-multiplication-cast-to-long) flagged the fractional-qvalue
accumulation as a potential int overflow before widening to long. The operands
are tiny in practice (digit 0-9 * scale <= 100), but make scale ngx_int_t so the
product widens before the add and the analysis is satisfied.
- Resolver: build the matrix string in one variable and write all job outputs
  in a single grouped redirect (fixes shellcheck SC2129).
- Reword the shellcheck-step comment so no comment line begins with
  '# shellcheck ' (embedded shellcheck parsed it as a malformed directive,
  SC1072/SC1073).
- Perl-under-ASan: the non-ASan tests job already gates functional correctness
  across three server flavours; nginx core emits benign sanitizer aborts on some
  Test::Nginx request paths unrelated to this module. Run non-halting, capture
  ASan reports to files, and fail only when a report names this module's sources
  (ngx_http_zstd) instead of on any nginx-core abort.
Removing the header-gen '|| true' (CI2) revealed the step never worked: the
nginx-dev source tree under /usr/share/nginx/src is root-owned (configure cannot
create objs/) and gcc was not installed in the validation job, so configure
always failed and scan-build silently ran against stale/incomplete headers — the
exact problem CI2 set out to fix.

Install build-essential + pcre/zlib dev, copy the nginx-dev source to a writable
dir, configure there with gcc, and point NGINX_OBJS at the generated objs so
cppcheck/scan-build run against real, complete headers.
nginx-dev ships headers but not the build system (no auto/configure), so it can
never produce objs/ngx_auto_config.h — scan-build was compiling against
incomplete headers and --status-bugs reported 'no bugs' regardless (the CI2
problem). Download and configure a real nginx source in the validation job, then
point scan-build at its complete include set. Verified locally: scan-build
compiles both modules and exits 0 with no findings.
@eilandert eilandert merged commit bf75366 into master Jun 12, 2026
18 checks passed
@eilandert eilandert deleted the fix/audit-remaining branch June 12, 2026 15:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants