From 25090159a0607d57e6ee11e74f9114200c967a91 Mon Sep 17 00:00:00 2001
From: Nelson Spence
Date: Mon, 15 Jun 2026 10:45:59 -0500
Subject: [PATCH] fix(security): pin benchmark Python deps + triage bincode advisory
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

OpenSSF Scorecard / OSV flagged ~20 advisories on main after the BEIR
benchmark landed (#237). ALL are dev/benchmark tooling — none reach the
published `ordvec` crate or the `ordvec` PyPI wheel.

Python (benchmarks/beir/requirements.txt): the deps were UNPINNED, so OSV
flagged each against its entire historical CVE list (an unconstrained
version cannot be ruled non-vulnerable). The actual resolved-latest
versions are already patched. Lower-bound-pin every package at its first
patched release — clears the flags (OSV excludes a `>=fixed` range) while
`>=` keeps installs on the latest compatible wheel, incl. recent CPython:
- requests>=2.32.4 (GHSA-9hjg-9r4m-mvj7 .netrc leak + all older requests CVEs)
- hnswlib>=0.8.0 (GHSA-xwc8-rf6m-xr86 double free)
- numpy>=1.26.0 (symlink-write + incorrect-comparison CVEs)
- safe floors for scipy/pandas/tqdm/tabulate/huggingface-hub/faiss-cpu/
  pytrec-eval-terrier/matplotlib. Verified the local cp314 venv satisfies all.

Rust (RUSTSEC-2025-0141): bincode 1.x is UNMAINTAINED (informational
advisory, not a vulnerability), pulled only transitively via hnsw_rs in
the dev-only benchmarks/beir-bench harness. `cargo tree -p ordvec` is
clean of bincode, so it does not reach the shipped crate. Add a documented
deny.toml ignore so cargo-deny (configured to error on unmaintained
crates) stays green; revisit if a maintained HNSW crate that does not pull
bincode 1.x is adopted.

Verified: `cargo tree -p ordvec` clean of bincode; `cargo deny check
advisories` ok; benchmark venv versions satisfy the new floors.

Signed-off-by: Nelson Spence
---
 CHANGELOG.md                     | 12 +++++++++++
 benchmarks/beir/requirements.txt | 34 +++++++++++++++++++-------------
 deny.toml                        | 12 ++++++++---
 3 files changed, 41 insertions(+), 17 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 683b5b5..a4b47b0 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## Unreleased
 
+### Security
+
+- **Cleared OSV / OpenSSF-Scorecard advisories on the dev-only BEIR benchmark
+  tooling** (introduced with the benchmark harness; none reach the published
+  `ordvec` crate or the `ordvec` PyPI wheel). The `benchmarks/beir/requirements.txt`
+  deps were unpinned, so OSV flagged each against its full historical CVE list;
+  they are now lower-bound-pinned at the first patched release (`requests>=2.32.4`,
+  `hnswlib>=0.8.0`, `numpy>=1.26`, plus safe floors for the rest). `bincode` 1.x
+  (RUSTSEC-2025-0141, *unmaintained* — not a vulnerability) enters only
+  transitively via `hnsw_rs` in `benchmarks/beir-bench` and is absent from
+  `cargo tree -p ordvec`; it is triaged with a documented `deny.toml` ignore.
+
 ### Added
 
 - **Reproducible BEIR benchmark harness** (`make benchmark-beir`; dev-only,
diff --git a/benchmarks/beir/requirements.txt b/benchmarks/beir/requirements.txt
index 2e14356..34e85dd 100644
--- a/benchmarks/beir/requirements.txt
+++ b/benchmarks/beir/requirements.txt
@@ -10,30 +10,36 @@
 # sdist, which fails to build on modern gcc. The BEIR dataset loader is vendored
 # in beir_prepare.py and evaluation uses the prebuilt `pytrec-eval-terrier`
 # wheel instead.
+#
+# Versions are LOWER-BOUND-PINNED at the first patched release for every package
+# with a known advisory, so this dev-only harness stays clean under OSV /
+# OpenSSF-Scorecard scanning (an UNPINNED dep is flagged against the package's
+# entire historical CVE list). `>=` keeps installs on the latest compatible wheel
+# (incl. recent CPython) while excluding all known-vulnerable versions.
 
 # --- core ---
-numpy
-scipy
-requests
-tqdm
-pandas
-tabulate
+numpy>=1.26.0
+scipy>=1.11.0
+requests>=2.32.4  # GHSA-9hjg-9r4m-mvj7 (.netrc leak) + all older requests CVEs
+tqdm>=4.66.3  # CVE-2024-34062
+pandas>=2.2.0
+tabulate>=0.9.0
 
 # --- model download for the canonical llamacpp lane ---
-huggingface-hub
+huggingface-hub>=0.24.0
 
 # --- retrieval baselines (comparison references, NOT ground truth) ---
-faiss-cpu
-hnswlib
+faiss-cpu>=1.8.0
+hnswlib>=0.8.0  # GHSA-xwc8-rf6m-xr86 (double free)
 
 # --- evaluation: trec_eval bindings (prebuilt wheel, no C compile) ---
-pytrec-eval-terrier
+pytrec-eval-terrier>=0.5.6
 
 # --- README benchmark graphics ---
-matplotlib
+matplotlib>=3.8.0
 
 # --- optional: sentence-transformers lane (`--provider st`) ---
 # Heavy (pulls torch). Uncomment to enable the fp32 ST encoder lane:
-# sentence-transformers
-# torch
-# transformers
+# sentence-transformers>=3.0.0
+# torch>=2.2.0
+# transformers>=4.44.0
diff --git a/deny.toml b/deny.toml
index 7009117..17acba9 100644
--- a/deny.toml
+++ b/deny.toml
@@ -14,9 +14,15 @@ all-features = true
 
 [advisories]
 # Default behaviour: error on any RUSTSEC advisory (vulnerability) or
-# unmaintained crate in the tree. No advisories are currently ignored; add
-# entries with an explicit `reason` only when triaged.
-ignore = []
+# unmaintained crate in the tree. Triaged ignores carry an explicit reason.
+#
+# RUSTSEC-2025-0141 — bincode 1.x is UNMAINTAINED (an informational advisory,
+# NOT a vulnerability). It enters the graph only transitively via `hnsw_rs`,
+# itself a dependency of the dev-only `benchmarks/beir-bench` harness. It is NOT
+# in the published `ordvec` crate (`cargo tree -p ordvec` is clean of bincode),
+# so it does not reach any shipped artifact or crate consumer. Revisit if a
+# maintained HNSW crate that does not pull bincode 1.x is adopted.
+ignore = ["RUSTSEC-2025-0141"]
 
 [licenses]
 # Allow-list only. cargo-deny denies any license not listed here.