Phase 10: test fixture corpus, MSRC CSAF fix, GHSA references fix #73

Merged
scarson merged 36 commits into main from dev
Apr 5, 2026

Conversation

scarson (Owner) commented Apr 5, 2026

Summary

  • Test fixture corpus: Golden file tests for all 8 feed adapters (NVD, KEV, GHSA, MITRE, OSV, MSRC, Red Hat, EPSS) using captured real API responses served via httptest. Catches upstream schema drift that hand-crafted fixtures miss.
  • SeedCorpus helper: testutil.SeedCorpus(t, db) seeds a test database with 65 real CVEs from 8 feeds through the merge pipeline. Deterministic, reproducible integration test data.
  • MSRC adapter fix: Rewrote Fetch to use Microsoft's real CSAF 2.0 static file distribution (msrc.microsoft.com/csaf/advisories/) — the previous /csaf/{id} API endpoint never existed.
  • GHSA adapter fix: Changed references from []ghsaReference (objects) to []string (bare URLs). The real GitHub API returns strings, not objects, so every advisory failed json.Unmarshal and the adapter silently dropped 100% of advisories.
  • Capture/extraction tooling: dev/cmd/capture-feeds/ and dev/cmd/extract-fixtures/ for refreshing the fixture corpus from live APIs.
  • Documentation: Updated CLAUDE.md, AGENTS.md, and testing-pitfalls.md with SeedCorpus usage guidance and golden file test requirements.
  • CI: Added expiring govulncheck exception (expires 2026-07-05) for unpatched docker/docker daemon-side CVEs (GO-2026-4887, GO-2026-4883) — transitive test dependency, zero production exposure.

Test plan

  • All 13 test packages pass (internal/feed/..., internal/testutil/..., dev/cmd/capture-feeds/...)
  • golangci-lint run — 0 issues
  • EPSS golden test cross-checks DB scores against fixture CSV values
  • SeedCorpus seeds 65 CVEs from 8 feeds (NVD 26, MITRE 36, GHSA 12, OSV 65 patches, KEV 10, MSRC 3, Red Hat 5, EPSS 21 scores)
  • MSRC golden test parses 3 real CSAF 2.0 advisories
  • GHSA golden test parses 12 real advisories (was 0 before fix)

🤖 Generated with Claude Code

scarson and others added 30 commits March 19, 2026 01:44
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reads a CVE manifest and extracts matching entries from captured
feed snapshots into per-adapter testdata/golden/ directories.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
28 records covering all required edge case categories:
- Data completeness: CVSS v4.0, missing CVSS, 0.0 scores, multi-CWE, Unicode
- Status: Rejected, RESERVED, Disputed, Withdrawn GHSA
- Cross-feed: up to 7-feed overlap (NVD+KEV+MSRC+OSV+RedHat+MITRE+EPSS)
- Feed-specific: GHSA without CVE, OSV alias resolution, MSRC CSAF, Red Hat fix state
- EPSS: high (>0.9), very low (<0.01)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extracted from real API snapshots (2026-03-19):
- NVD: 26 CVEs in 1 synthetic response page
- GHSA: 12 advisories (including null-CVE and withdrawn)
- KEV: 10 entries with requiredAction
- EPSS: 21 scores (including high >0.9 and low <0.01)
- MITRE: 36 CVE JSON files in ZIP archive
- OSV: 65 entries (including alias resolution targets)
- MSRC: updates list + 3 CVRF documents
- Red Hat: 5 detail pages with vendor enrichment

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- KEV: 10 patches, all InCISAKEV=true with VendorEnrichment
- GHSA: SKIPs — adapter can't unmarshal string references (known upstream schema drift)
- MITRE: 36 patches from curated ZIP subset
- OSV: 65 patches with alias resolution verification
- Red Hat: 5 patches with VendorSeverity="Important" vendor enrichment

MSRC and EPSS golden tests deferred:
- MSRC: CSAF endpoint broken (returns 400 for all release IDs)
- EPSS: Requires testcontainers, deferred to SeedCorpus integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ixtures

Runs NVD, MITRE, GHSA, OSV, KEV, Red Hat adapters against golden
fixtures through the real merge pipeline, then applies EPSS scores.
Produces 62 unique CVEs from 6 feeds in ~8 seconds.

GHSA produces 0 patches due to known adapter parsing issue
(string references field). MSRC excluded (CSAF endpoint broken).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…oft.com

Old fixtures were CVRF format (PascalCase keys, {Value:...} wrappers)
captured from the /cvrf/v3.0/cvrf/ endpoint. The adapter expected CSAF
2.0 format from a /csaf/ endpoint that never existed.

New fixtures are real CSAF 2.0 advisories downloaded from Microsoft's
static CSAF distribution at msrc.microsoft.com/csaf/advisories/.
Per-CVE files (~6KB each) with proper CSAF 2.0 structure.

CVEs: CVE-2025-14174, CVE-2026-21510, CVE-2026-32194
(CVE-2026-3909 from manifest had no CSAF file; CVE-2026-32194 used
as replacement, with the decision documented in plan appendix)

Replace the broken OData /updates + /csaf/{releaseID} API flow with the
working CSAF static file approach: download changes.csv for discovery,
then fetch individual per-CVE JSON files. Remove dead code (dateTimeRe,
updateEntry, updatesResponse, parseUpdates) and unused imports (net/url,
regexp). Add encoding/csv import and parseChangesCSV parser.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
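For readers following the Fetch rewrite, the discovery step can be sketched roughly as below. This is a minimal illustration, not the adapter's code: the `changeEntry` type, the `changedSince` helper, and this particular `parseChangesCSV` signature are assumptions, the sample paths are made up, and the two-column row layout (relative path, then RFC 3339 timestamp) follows the CSAF 2.0 `changes.csv` convention rather than anything verified against Microsoft's files in this PR.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"time"
)

// changeEntry is a hypothetical record for one row of changes.csv.
type changeEntry struct {
	Path    string    // advisory path relative to the distribution root
	Updated time.Time // release timestamp recorded for that advisory
}

// parseChangesCSV decodes rows of the assumed form "path","timestamp".
func parseChangesCSV(data []byte) ([]changeEntry, error) {
	r := csv.NewReader(bytes.NewReader(data))
	r.FieldsPerRecord = 2 // reject rows that do not have exactly two columns
	rows, err := r.ReadAll()
	if err != nil {
		return nil, fmt.Errorf("read changes.csv: %w", err)
	}
	entries := make([]changeEntry, 0, len(rows))
	for _, row := range rows {
		ts, err := time.Parse(time.RFC3339, row[1])
		if err != nil {
			return nil, fmt.Errorf("parse timestamp %q: %w", row[1], err)
		}
		entries = append(entries, changeEntry{Path: row[0], Updated: ts})
	}
	return entries, nil
}

// changedSince keeps only the entries updated strictly after the cursor.
func changedSince(entries []changeEntry, cursor time.Time) []changeEntry {
	var out []changeEntry
	for _, e := range entries {
		if e.Updated.After(cursor) {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	// Made-up sample rows; real file names and timestamps will differ.
	sample := []byte(`"2026/example-a.json","2026-02-10T08:00:00Z"
"2025/example-b.json","2025-12-09T08:00:00Z"
`)
	entries, err := parseChangesCSV(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	cursor := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	for _, e := range changedSince(entries, cursor) {
		fmt.Println("fetch", e.Path)
	}
}
```

Setting `FieldsPerRecord` makes a malformed row fail loudly instead of being skipped, which matches the PR's broader goal of surfacing upstream schema drift.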
Rewrite Fetch tests to use changes.csv + per-CVE JSON serving instead of
OData /updates + /csaf/{releaseID}. Replace redirectTransport with
testutil.NewURLRewriteTransport. Add TestParseChangesCSV. Remove
TestParseUpdates, TestFetch_InvalidCursorDate, and redirectTransport.
Update golden_test.go to serve changes.csv and per-CVE fixture files.
All 6 TestCSAFToPatches_* tests preserved unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Code review found missing plan assertion: every patch CVEID must
start with "CVE-" to verify proper format from real CSAF data.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

testutil/seedcorpus.go imports msrc, and adapter_test.go (package msrc)
imported testutil, creating a cycle. Replaced testutil.NewURLRewriteTransport
with a local redirectTransport in the internal test file. The external
golden_test.go (package msrc_test) continues to use testutil safely.
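A local redirect transport of the kind this commit describes might look like the sketch below. It is an assumption about the shape, not the code in adapter_test.go: the field names, `newPathEchoServer`, and `getVia` are hypothetical helpers, and the exact location of changes.csv under the advisories path is assumed for the demo.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// redirectTransport sends every request to target, keeping path and query.
type redirectTransport struct {
	target *url.URL
	base   http.RoundTripper
}

// RoundTrip rewrites scheme and host on a clone, since a RoundTripper
// must not modify the caller's request.
func (t *redirectTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	clone := req.Clone(req.Context())
	clone.URL.Scheme = t.target.Scheme
	clone.URL.Host = t.target.Host
	return t.base.RoundTrip(clone)
}

// newPathEchoServer is a stand-in fixture server that replies with the request path.
func newPathEchoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_, _ = io.WriteString(w, r.URL.Path)
	}))
}

// getVia requests requestURL through a client whose traffic is redirected to serverURL.
func getVia(serverURL, requestURL string) (string, error) {
	target, err := url.Parse(serverURL)
	if err != nil {
		return "", err
	}
	client := &http.Client{
		Transport: &redirectTransport{target: target, base: http.DefaultTransport},
	}
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, requestURL, nil)
	if err != nil {
		return "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	srv := newPathEchoServer()
	defer srv.Close()
	// The production URL never leaves the machine; the transport reroutes it.
	body, err := getVia(srv.URL, "https://msrc.microsoft.com/csaf/advisories/changes.csv")
	fmt.Println(body, err)
}
```

Keeping a small copy like this inside package msrc avoids importing testutil from the internal test file, which is what broke the cycle.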
Documents how to refresh captured fixtures, when to refresh,
and how to add new feed adapters to the corpus.

The GitHub Advisory API returns references as plain URL strings:
  "references": ["https://...", "https://..."]
not as objects with a url field:
  "references": [{"url": "https://..."}]

The adapter defined ghsaReference{URL string} which caused every
advisory to fail json.Unmarshal, silently dropping 100% of records.
Changed to []string and updated advisoryToPatches() accordingly.

Also fixed the golden test assertion for GHSA-native advisories:
ResolveCanonicalID returns the GHSA ID (not empty string) when
no CVE alias exists.
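The failure mode is easy to reproduce in isolation. The sketch below uses hypothetical stand-in types (`legacyAdvisory`, `advisory`) and a made-up payload rather than the adapter's real structs; it only shows that encoding/json rejects a string where a struct is expected, while the []string shape decodes cleanly.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// legacyReference mirrors the old, incorrect assumption: references are objects.
type legacyReference struct {
	URL string `json:"url"`
}

// legacyAdvisory is a hypothetical stand-in for the pre-fix advisory type.
type legacyAdvisory struct {
	GHSAID     string            `json:"ghsa_id"`
	References []legacyReference `json:"references"`
}

// advisory is a hypothetical stand-in for the post-fix type: references are bare URLs.
type advisory struct {
	GHSAID     string   `json:"ghsa_id"`
	References []string `json:"references"`
}

// decodeLegacy attempts to decode with the old object-shaped references.
func decodeLegacy(data []byte) (legacyAdvisory, error) {
	var a legacyAdvisory
	err := json.Unmarshal(data, &a)
	return a, err
}

// decodeAdvisory decodes with string-shaped references.
func decodeAdvisory(data []byte) (advisory, error) {
	var a advisory
	err := json.Unmarshal(data, &a)
	return a, err
}

func main() {
	// Made-up payload in the string-references shape described above.
	payload := []byte(`{"ghsa_id":"GHSA-xxxx-xxxx-xxxx","references":["https://example.com/a","https://example.com/b"]}`)

	if _, err := decodeLegacy(payload); err != nil {
		fmt.Println("legacy decode failed:", err)
	}
	a, err := decodeAdvisory(payload)
	fmt.Println(len(a.References), err)
}
```

If a caller logs and skips on decode errors, the legacy shape produces zero records without failing the run, which is consistent with the silent drop described in this commit.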
Bulk captured data moved from D:\Code\CVErt-Ops\data\ to
.data/ (gitignored) at the project root. Updated all path
references in the Phase 10 plan to use relative paths.

Bulk captured data moved from D:\Code\CVErt-Ops\data\ to
.data/ (gitignored) at the project root. Updated all path
references in the Phase 10 plan and capture-feeds CLI.

Add .data/ to .gitignore for bulk feed snapshot storage (moved from
D: drive). Add autonomous decision appendix entries D2 (import cycle
fix) and D3 (SeedCorpus threshold) to the MSRC CSAF fix plan.

Add test data seeding guidance to CLAUDE.md: when to use SeedCorpus,
when to use hand-crafted fixtures, and the golden file test pattern.

Add to testing-pitfalls.md §9: golden file test requirement for all
feed adapters (with MSRC/GHSA bug evidence), SeedCorpus usage guide
with clear when-to-use/when-not-to criteria, and cross-reference
from §7 "test data must flow through production code paths."

Same test data seeding subsection as CLAUDE.md so non-Claude agents
also know about the test fixture corpus infrastructure.

Seeds NVD CVEs via the merge pipeline, applies EPSS scores from
the golden CSV fixture, verifies DB values including low-score
preservation (testing-pitfalls §9.4). Requires Docker.

This file was lost when the Phase 10 worktree was deleted before
its commits were merged. Recovered and recreated.

scarson added 3 commits April 5, 2026 16:24
Review found assertion 3 was weak — verified scores were non-null
but didn't compare against expected values from the fixture. Now
parses the gzipped CSV and asserts every DB score matches the CSV
exactly. Also fixed LIMIT 5 → full scan (was only reading 1 row).
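The strengthened assertion amounts to parsing the fixture and comparing every score, roughly as sketched here. This is not the test's code: `parseEPSSCSV`, `mismatches`, and `gzipBytes` are hypothetical helpers, the CVE IDs and scores are made up, and the file layout (a leading `#` metadata line, then a `cve,epss,percentile` header) is an assumption about the EPSS CSV format.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/csv"
	"fmt"
	"sort"
	"strconv"
)

// parseEPSSCSV reads a gzipped CSV with an assumed "#..." metadata line,
// a header row, and rows of the form cve,epss,percentile.
func parseEPSSCSV(gz []byte) (map[string]float64, error) {
	zr, err := gzip.NewReader(bytes.NewReader(gz))
	if err != nil {
		return nil, fmt.Errorf("open gzip: %w", err)
	}
	defer zr.Close()

	r := csv.NewReader(zr)
	r.Comment = '#'       // skip the metadata line
	r.FieldsPerRecord = 3 // cve, epss, percentile
	rows, err := r.ReadAll()
	if err != nil {
		return nil, fmt.Errorf("read csv: %w", err)
	}
	scores := make(map[string]float64, len(rows))
	for i, row := range rows {
		if i == 0 {
			continue // header row
		}
		score, err := strconv.ParseFloat(row[1], 64)
		if err != nil {
			return nil, fmt.Errorf("parse score for %s: %w", row[0], err)
		}
		scores[row[0]] = score
	}
	return scores, nil
}

// mismatches lists every CVE whose stored score is missing or differs from the fixture.
func mismatches(want, got map[string]float64) []string {
	var bad []string
	for cve, w := range want {
		if g, ok := got[cve]; !ok || g != w {
			bad = append(bad, cve)
		}
	}
	sort.Strings(bad)
	return bad
}

// gzipBytes compresses a string so the demo needs no fixture file.
func gzipBytes(s string) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	gz, err := gzipBytes("#model_version:example,score_date:example\ncve,epss,percentile\nCVE-0000-0001,0.95,0.99\nCVE-0000-0002,0.00042,0.05\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	want, err := parseEPSSCSV(gz)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Stand-in for scores read back from the database; the second one was lost.
	got := map[string]float64{"CVE-0000-0001": 0.95, "CVE-0000-0002": 0}
	fmt.Println(mismatches(want, got))
}
```

A non-null check would pass the stand-in database values above; only the value comparison flags the low score that was zeroed out.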
Document why seedNVDForEPSS is used instead of SeedCorpus (only NVD
needed, not all 8 adapters). Document known gap: golden CSV has no
score of exactly 0.0 so that edge case relies on unit test coverage.

- seedNVDForEPSS: explain intentional duplication of fetchNVDGolden
  and why extracting a shared helper isn't worth the API complexity
- GHSA references field: note that the API returns bare URL strings,
  not objects, since this is non-obvious and was a previous bug

Comment thread internal/testutil/seedcorpus.go Fixed
scarson and others added 2 commits April 5, 2026 16:50
Fix 104 lint issues across golden tests, seed corpus, and dev CLI tools:
- revive unused-parameter: rename unused `r *http.Request` to `_ *http.Request`
- revive missing-export-comment: add doc comment to URLRewriteTransport.RoundTrip
- staticcheck QF1008: simplify `db.Store.DB()` to `db.DB()` (embedded field)
- errcheck: use `_, _ =` pattern for httptest handler w.Write calls
- noctx: replace http.Get/client.Get with context-aware NewRequestWithContext+Do
- gosec G104/G306/G301/G703/G704/G705/G706: add inline nolint with reasons for
  dev-only tools, test fixtures, and confirmed false positives

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
GO-2026-4887 (AuthZ plugin bypass) and GO-2026-4883 (plugin privilege
validation off-by-one) are daemon-side vulnerabilities in docker/docker,
a transitive dependency of testcontainers-go. They do not affect our
production code. The exception auto-expires on 2026-07-05 to force
re-evaluation of the upstream fix status.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Comment thread internal/testutil/seedcorpus.go Fixed

CodeQL flagged path injection (go/path-injection): the CVE ID from
the request URL was used directly in filepath.Join without sanitization.
Added path separator and ".." check before constructing the file path.
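A check of that shape could look like the following sketch. The `fixturePath` helper, the `errUnsafeID` sentinel, and the `.json` suffix are assumptions for illustration, not the code in seedcorpus.go.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errUnsafeID is a hypothetical sentinel for IDs that could escape the fixture directory.
var errUnsafeID = errors.New("unsafe CVE ID")

// fixturePath validates a request-supplied ID before it reaches filepath.Join.
// It rejects empty IDs, either path separator, and any ".." sequence.
func fixturePath(dir, cveID string) (string, error) {
	if cveID == "" || strings.ContainsAny(cveID, `/\`) || strings.Contains(cveID, "..") {
		return "", fmt.Errorf("%w: %q", errUnsafeID, cveID)
	}
	return filepath.Join(dir, cveID+".json"), nil
}

func main() {
	for _, id := range []string{"CVE-2026-21510", "../secrets", `..\secrets`, ""} {
		p, err := fixturePath("testdata", id)
		fmt.Printf("%q -> %q, %v\n", id, p, err)
	}
}
```

Rejecting the input outright, rather than cleaning it, keeps the handler simple: a fixture server has no legitimate reason to receive an ID containing a separator.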
@scarson scarson merged commit 48b1fea into main Apr 5, 2026
14 checks passed