Releases: lance0/prefixd
v0.16.0
What's New
Corroborating signals (ADR 021)
A new class of correlation signals that strengthen open signal groups without ever triggering mitigations on their own. Targets coarse telemetry (router CPU, interface utilization, per-customer NetFlow, PoP-level metrics) that shouldn't name a victim IP but is valuable alongside a real detector.
- Configuration. New `mode: corroborating` + `match_dimensions: [pop, customer_id, service_id, interface]` on any entry in `correlation.yaml`'s `sources` map. Declared dimensions are authoritative: a source declared for `[pop]` can never attach via an undeclared `customer_id` / `service_id` / `interface`, even if those happen to match. The validator rejects misconfiguration on both `PUT /v1/config/correlation` and daemon boot.
- Ingest endpoint. `POST /v1/signals/corroborator` accepts dimension-tagged signals (no `victim_ip`). Matches open signal groups via OR-semantics across declared dimensions only, with an optional `vector` narrower. Unmatched signals cache for up to `window_seconds` and drain when a matching primary event arrives.
- Engine invariant. A signal group composed entirely of corroborators never reaches `corroboration_met=true`; at least one primary event is required. Enforced in `check_corroboration_with_primary` and in the corroborator-side aggregate recompute.
- Dashboard. Per-source mode + dimension picker on the Correlation Config tab; corroborating badge on signal group detail; Signal Sources tab now merges activity from corroborator traffic so `mode: corroborating` sources no longer render as "never seen".
- CLI. New `prefixdctl send-corroborator --source router-cpu --pop iad1 ...`.
- Activity endpoint. `GET /v1/signals/corroborator/activity?minutes=N` returns per-source `(last_seen, count)` aggregated across live cache and attached rows.
- Metrics. `prefixd_corroborator_ingested_total{source}`, `_attached_total{source}`, `_expired_total` (unlabelled; counts only unattached cache misses, not attached-then-GC'd rows).
- Interface dimension. New optional `interface` field on inventory `Asset` entries feeds into `IpContext.interface` and `primary_dimensions`, so interface-only corroborators (a common gNMI / SNMP shape) have a real matchable dimension.
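The matching rule and the engine invariant described above can be sketched in a few lines of Python. This is an illustrative model only, not prefixd's Rust implementation: the `SignalGroup` class, the `matches` and `corroboration_met` helpers, and the way `min_sources` is counted are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class SignalGroup:
    # Hypothetical model of an open signal group.
    primary_dimensions: dict  # e.g. {"pop": "iad1", "customer_id": "acme"}
    vector: str
    primary_events: int = 0
    corroborators: list = field(default_factory=list)


def matches(group, declared, signal_dims, vector=None):
    """OR across the source's declared dimensions only.

    Dimensions present on the signal but not declared for the source are
    ignored, even when their values happen to match the group.
    """
    if vector is not None and vector != group.vector:
        return False  # the optional vector narrows the match
    for dim in declared:
        value = signal_dims.get(dim)
        if value is not None and group.primary_dimensions.get(dim) == value:
            return True
    return False


def corroboration_met(group, min_sources):
    # Invariant: without a primary event the group never corroborates,
    # no matter how many corroborators are attached.
    if group.primary_events == 0:
        return False
    # Assumed counting rule, for illustration only.
    return group.primary_events + len(group.corroborators) >= min_sources
```

A source declared for `["pop"]` that sends `{"customer_id": "acme"}` does not attach under this model, which is the "declared dimensions are authoritative" behavior.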
Database migrations
- 009: `primary_dimensions` JSONB on `signal_groups`, corroborator denormalization on `signal_group_events`, new `corroborating_signals` cache table.
- 010: `corroborator_ingested_at` for accurate per-row timestamps on corroborator attachments.
- 011: Backfills `primary_dimensions` for pre-upgrade open signal groups from their associated mitigations so corroborators can attach to in-flight incidents immediately after upgrade.
Bug fixes and hardening
- `POST /v1/events` now rejects corroborating-only sources at handler entry (before any DB writes).
- Corroborator matching now strictly filters against declared `match_dimensions` (not just a presence check) on both the ingest and cache-drain paths.
- `CorrelationConfig::load` + `Settings::load` run `validate()` on YAML parsing; the daemon refuses to boot with an invalid correlation config.
- `CORROBORATOR_EXPIRED_TOTAL` narrowed to true cache misses; attached rows are still GC'd but do not inflate the counter.
- Mock repository behavior tightened (full-struct `update_signal_group`, real dimension filter in `find_open_groups_by_dimensions`) so tests reflect production behavior.
- Frontend null-safe `ingested_at` rendering; `SourceDialog` auto-clears `match_dimensions` when switching mode back to `primary`.
Known limits (deferred to PR B)
- Late corroborator finalization (needs playbook-override-aware recompute on the corroborator path).
- Per-source label on `prefixd_corroborator_expired_total`.
- `CorroboratorResponse.cached` field cleanup.
- Dashboard "cached corroborators" panel + `/v1/signals/corroborator/cache` listing endpoint.
- `prefixd_corroborator_cache_size{source}` gauge metric.
See ROADMAP → Correlation Engine → "Corroborating signals v2 (PR B)".
References
- ADR 021 — Corroborating signals (with Review remediations + Known limits appendices)
- Detector quickstart
Full Changelog: v0.15.0...v0.16.0
v0.15.0
What's New
Generic webhook adapter for arbitrary detectors
A new endpoint — POST /v1/signals/webhook/{name} — lets you integrate any detector, telemetry source, or commercial DDoS appliance that can POST JSON, without writing Rust. Declare adapters in `correlation.yaml` with JSONPath field mappings, and prefixd maps the payload into the standard event pipeline (correlation, guardrails, playbooks, FlowSpec).
Highlights:
- JSONPath field mapping (RFC 9535 via `serde_json_path`) — extract `victim_ip`, `vector`, `bps`, `pps`, `confidence`, `source_id`, `top_dst_ports`, `action`, and `timestamp` from any JSON shape
- Three auth modes — HMAC-SHA256 (constant-time compare via `subtle`, secret loaded from env var, never in YAML), bearer (reuses global session auth), or none (lab use only)
- Array batching — `root_path: "$.alerts[*]"` iterates a JSON array, producing one event per match; partial failures are per-event, overall status stays 200
- Vector normalization — `vector_map` translates detector-specific strings (`UDP_FLOOD` → `udp_flood`); `default_vector` handles unknowns
- Confidence scaling — `confidence_scale: 100` accepts 0-100 detector scales and normalizes to 0.0-1.0
- Full CRUD UI — manage adapters from the Correlation → Config tab
- Hot-reload — adapter changes apply via `POST /v1/config/reload`, no restart
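To make the mapping highlights concrete, here is a small Python sketch of vector normalization, confidence scaling, and per-event failure handling for an array payload. It does not use JSONPath and is not the adapter's actual code: the payload field names, the `"unknown"` default vector, and the `map_alert` / `map_batch` helpers are hypothetical.

```python
VECTOR_MAP = {"UDP_FLOOD": "udp_flood"}  # mirrors the vector_map example above
DEFAULT_VECTOR = "unknown"               # hypothetical default_vector value
CONFIDENCE_SCALE = 100                   # detector reports confidence as 0-100


def map_alert(alert):
    """Map one detector alert into a normalized event dict."""
    if "victim_ip" not in alert:
        raise ValueError("missing victim_ip")
    vector = VECTOR_MAP.get(alert.get("vector"), DEFAULT_VECTOR)
    confidence = float(alert["confidence"]) / CONFIDENCE_SCALE  # now 0.0-1.0
    return {"victim_ip": alert["victim_ip"], "vector": vector, "confidence": confidence}


def map_batch(payload):
    """Rough equivalent of root_path "$.alerts[*]": one event per array element.

    A bad element is recorded as a per-event error and does not abort the batch.
    """
    events, errors = [], []
    for index, alert in enumerate(payload.get("alerts", [])):
        try:
            events.append(map_alert(alert))
        except (KeyError, TypeError, ValueError) as exc:
            errors.append({"index": index, "error": str(exc)})
    return events, errors
```

Under this model a batch with one malformed alert still yields events for the others, which matches the "partial failures are per-event" behavior described in the list.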
Docs:
- ADR 020 — design rationale
- docs/detectors/generic-webhook.md — end-to-end Radware walkthrough
- docs/configuration.md — schema reference
- docs/api.md — endpoint reference
Webhook adapter config validation
`CorrelationConfig::validate()` now rejects `confidence_scale <= 0` or non-finite, empty `auth.secret_env` / `auth.header`, and non-`sha256` HMAC algorithms — misconfiguration surfaces as a 400 on PUT/reload instead of runtime 500s.
Fixed
- Rust 1.95 CI compatibility — Addressed new clippy lints (`collapsible_match`, `cloned_ref_to_slice_refs`, `field_reassign_with_default`) and match-exhaustiveness in `gobgp.rs` guard patterns.
- Webhook `action` validation — Invalid `action` values (e.g. `"resolved"`, typos) now produce a per-event mapping error instead of silently defaulting to `"ban"`. Missing/null still defaults to `"ban"`.
Security
- `rustls-webpki` 0.103.10 → 0.103.12 (RUSTSEC-2026-0098 / RUSTSEC-2026-0099)
- `RUSTSEC-2026-0097` (`rand`) added to `cargo-audit` ignore list pending upstream fix.
New dependencies
- `serde_json_path 0.7` — RFC 9535 JSONPath, pure-Rust, no C deps
- `subtle 2` — constant-time comparison for HMAC verification
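For readers unfamiliar with why a constant-time comparison matters for HMAC auth, the following Python sketch shows the general technique using the standard library. It is an analogue of what the notes describe (HMAC-SHA256, secret read from an environment variable, constant-time compare), not prefixd's code. The hex encoding of the signature and the `verify_signature` name are assumptions.

```python
import hashlib
import hmac
import os


def verify_signature(body: bytes, signature_hex: str, secret_env: str) -> bool:
    """Check an HMAC-SHA256 signature over the raw request body."""
    secret = os.environ.get(secret_env)
    if not secret:
        return False  # no secret configured: reject rather than skip auth
    expected = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    # compare_digest runs in time independent of where the inputs differ,
    # so an attacker cannot recover the signature byte by byte from timing.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

The signature must be computed over the exact bytes received; re-serializing the parsed JSON before verifying would change whitespace and break the check.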
Tests
- 204 unit + 110 integration + 16 postgres backend tests pass (+8 webhook unit tests over v0.14.1)
- 78 frontend tests pass (+10 validator tests)
Contributors
- @lance0 — generic webhook adapter, validation, CI fixes, docs
Full Changelog: v0.14.1...v0.15.0
v0.14.1
Fixed
- Prometheus metrics wired up — All event, mitigation, announcement, reconciliation, guardrail, and BGP session metrics were defined but never incremented. Now properly instrumented across all handler and reconciliation paths. (contributed by @bswinnerton)
- MITIGATIONS_ACTIVE gauge — Reconciliation loop now recomputes with correct action_type and pop labels each tick
- BGP_SESSION_UP gauge — Updated each reconciliation tick from actual peer session state
Security
- aws-lc-sys 0.36.0 → 0.39.1 — CRL scope check logic error (high), PKCS7 signature/cert chain bypass (high), AES-CCM timing side-channel (medium), X.509 name constraints bypass
- rustls-webpki 0.103.9 → 0.103.10 — CRL Distribution Point matching logic
- picomatch → 4.0.4 — ReDoS via extglob quantifiers (high), method injection in POSIX character classes (moderate)
Dependencies
- uuid 1.21.0 → 1.22.0
- rustls 0.23.36 → 0.23.37
- ipnet 2.11.0 → 2.12.0
- @radix-ui/react-popover 1.1.4 → 1.1.15
- recharts 3.7.0 → 3.8.0
Full Changelog: v0.14.0...v0.14.1
v0.14.0
What's New
Multi-Signal Correlation Engine
Combine weak signals from multiple detectors into high-confidence mitigation decisions. Events targeting the same (victim_ip, vector) within a time window are grouped into signal groups with configurable source weights, corroboration thresholds, and per-playbook overrides.
- Signal groups with derived confidence, source counting, and corroboration status
- Correlation explainability on mitigations (why a decision was made)
- Per-playbook overrides for min_sources and confidence_threshold
- Signal group expiry via the reconciliation loop
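The grouping idea can be illustrated with a short Python sketch: events sharing `(victim_ip, vector)` inside a window form one group, and the group is checked against `min_sources` and `confidence_threshold`. The release notes do not state how confidence is derived, so the weighted average below is purely an assumption, as are the event field names and the `correlate` helper.

```python
from collections import defaultdict


def correlate(events, window_seconds, weights, min_sources, confidence_threshold):
    """Group events by (victim_ip, vector) and evaluate each group."""
    groups = defaultdict(list)
    for ev in sorted(events, key=lambda e: e["ts"]):
        bucket = groups[(ev["victim_ip"], ev["vector"])]
        # Keep only events within the window of the group's first event.
        # (A real engine would open a new group for later events; omitted here.)
        if not bucket or ev["ts"] - bucket[0]["ts"] <= window_seconds:
            bucket.append(ev)

    results = {}
    for key, bucket in groups.items():
        sources = {ev["source"] for ev in bucket}
        total_weight = sum(weights.get(ev["source"], 1.0) for ev in bucket)
        weighted = sum(ev["confidence"] * weights.get(ev["source"], 1.0) for ev in bucket)
        # ASSUMPTION: derived confidence = weighted average of event confidences.
        derived = weighted / total_weight if total_weight else 0.0
        results[key] = {
            "source_count": len(sources),
            "derived_confidence": derived,
            "corroboration_met": len(sources) >= min_sources
            and derived >= confidence_threshold,
        }
    return results
```

With `min_sources=2`, a single detector reporting high confidence is not enough on its own; a second source must report the same `(victim_ip, vector)` within the window.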
Signal Adapters
- Alertmanager webhook (`POST /v1/signals/alertmanager`): maps labels/annotations to attack events, handles batched alerts, resolved alerts trigger withdraw, fingerprint dedup
- FastNetMon webhook (`POST /v1/signals/fastnetmon`): classifies vector from traffic breakdown, configurable confidence mapping, attack_uuid dedup
Correlation Config API
- `GET /v1/config/correlation`: current config (secrets redacted)
- `PUT /v1/config/correlation`: update config (admin only, validates, writes YAML, hot-reloads)
Correlation Dashboard
- Correlation page with Signals, Groups, and Config tabs
- Signal group detail page with contributing events and source breakdown
- Correlation context section on mitigation detail page
Infrastructure
- Docker configs mount changed from read-only to writable (dashboard config editors work out of the box)
- Default `configs/correlation.yaml` included as example config
Documentation
- ADR 018 — Correlation engine design
- ADR 019 — Signal adapter architecture
- Upgrade guide — v0.13.0 → v0.14.0 migration steps
- Deployment guide — Signal adapter setup (Alertmanager, FastNetMon)
- Configuration reference — Correlation config section
Testing
- 179 backend unit tests, 99 integration, 16 postgres, 9 e2e, 64 frontend
- All passing, clippy clean, Docker containers verified end-to-end
Breaking Changes
None. Correlation is opt-in via correlation.enabled: true. Default behavior preserves existing single-source flow.
Full Changelog: v0.13.0...v0.14.0
v0.13.0
What's New
- Event batching: `POST /v1/events/batch` accepts up to 100 events in a single request with partial success semantics (202/207). Sequential processing through the full pipeline (validation, guardrails, policy, FlowSpec announce).
- Post-attack incident reports: `GET /v1/reports/incident?mitigation_id=X` or `?ip=X` generates a markdown incident report with summary table, timeline, events, mitigations, and audit trail. Dashboard "Report" buttons on mitigation detail and IP history pages with copy/download dialog.
- FlowSpec NLRI fuzz/property tests: 8 proptest property-based tests for prefix parsing, NLRI roundtrip, and action roundtrip. Two cargo-fuzz targets for offline fuzzing with libFuzzer.
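The partial-success behavior of `POST /v1/events/batch` can be modeled as below. This is a sketch under stated assumptions: the notes give the 202/207 pair and the 100-event cap, but the rule "202 when every event succeeds, 207 otherwise" and the shape of the per-event results are guesses made for illustration, and `process_batch` / `handle_one` are hypothetical names.

```python
MAX_BATCH = 100  # cap taken from the release note


def process_batch(events, handle_one):
    """Run each event through handle_one in order, collecting per-event results."""
    if len(events) > MAX_BATCH:
        raise ValueError(f"batch exceeds {MAX_BATCH} events")
    results = []
    for index, event in enumerate(events):  # sequential, in request order
        try:
            results.append({"index": index, "ok": True, "result": handle_one(event)})
        except Exception as exc:  # one bad event must not abort the batch
            results.append({"index": index, "ok": False, "error": str(exc)})
    # ASSUMPTION: 202 when all events were accepted, 207 on any per-event failure.
    status = 202 if all(r["ok"] for r in results) else 207
    return status, results
```

Callers should inspect the per-event results rather than the overall status alone, since a 207 still means some events were processed.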
Bug Fixes
- WebSocket rejected all connections when auth_mode is none — Dashboard showed "Disconnected" permanently in no-auth deployments.
- Mitigation detail page showed "Not Found" on Next.js 16 — Dynamic route params became async in Next.js 15+.
- Dark mode outline button hover invisible — Export CSV, Refresh, and other outline-variant buttons had nearly invisible hover states in dark mode.
Full Changelog: v0.12.0...v0.13.0
v0.12.0
What's New
- Cursor-based pagination: All list endpoints now use cursor-based pagination (`?cursor=<opaque>&limit=N`). Responses include `next_cursor` and `has_more` fields. Breaking: `offset` parameter removed (see ADR 016).
- Date range filtering: All list endpoints accept `?start=<ISO8601>&end=<ISO8601>` for time-bounded queries.
- Bulk acknowledge: `POST /v1/mitigations/acknowledge` marks mitigations as reviewed (sets `acknowledged_at` / `acknowledged_by`). Filterable via `?acknowledged=true|false`.
- Per-destination event routing: Each alerting destination can specify its own `events` list to override the global filter. Backward-compatible (ADR 017).
- Notification preferences: `GET/PUT /v1/preferences` stores per-operator toast settings (muted events, quiet hours UTC). Dashboard toasts respect preferences; quiet hours suppress non-critical events only.
Breaking Changes
Offset pagination removed
?offset=N no longer works on /v1/mitigations, /v1/events, or /v1/audit. Use cursor-based pagination:
# Page 1 (no cursor = first page)
curl '/v1/mitigations?limit=50'
# Response: {"mitigations": [...], "next_cursor": "MjAyNi0w...", "has_more": true}
# Page 2
curl '/v1/mitigations?limit=50&cursor=MjAyNi0w...'
If you use `prefixdctl`, replace `--offset` with `--cursor`.
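Client code that used to increment an offset can instead loop on `has_more` and pass `next_cursor` back unchanged. The Python sketch below shows that loop with the HTTP call abstracted behind a `fetch_page` callable. `iter_mitigations` and `fake_fetch_page` are hypothetical helpers, and the fake's numeric cursor exists only so the example runs; real cursors are opaque and should never be parsed.

```python
def iter_mitigations(fetch_page, limit=50):
    """Yield every mitigation by following next_cursor until has_more is false."""
    cursor = None  # no cursor = first page
    while True:
        page = fetch_page(limit=limit, cursor=cursor)
        yield from page["mitigations"]
        if not page.get("has_more"):
            break
        cursor = page["next_cursor"]  # treat as opaque; pass back verbatim


def fake_fetch_page(limit, cursor, _data=tuple(range(120))):
    """In-memory stand-in for GET /v1/mitigations, for demonstration only."""
    start = int(cursor) if cursor else 0
    chunk = list(_data[start:start + limit])
    end = start + len(chunk)
    has_more = end < len(_data)
    return {
        "mitigations": chunk,
        "next_cursor": str(end) if has_more else None,
        "has_more": has_more,
    }
```

Usage: `list(iter_mitigations(fake_fetch_page))` walks three pages of at most 50 items. In a real client, `fetch_page` would issue the HTTP request shown in the curl examples.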
Audit response shape changed
GET /v1/audit now returns {"entries": [...], "count": N, "next_cursor": ..., "has_more": ...} instead of a bare array.
Migrations
Two new migrations run automatically on startup:
- 005: `acknowledged_at` / `acknowledged_by` columns on mitigations
- 006: `notification_preferences` table
Back up your database before upgrading: docker compose exec postgres pg_dump -U prefixd prefixd > backup.sql
Bug Fixes
- Migration 005 uses `IF NOT EXISTS` for idempotent column adds
- Notification preferences: quiet hours always serialized as explicit `null`
- Half-configured quiet hours rejected (both start and end required, or both null)
- Preferences fetched eagerly on session start
- Bun pinned to 1.3.10 in Dockerfile (1.3.11 segfaults during `next build` in CI)
See upgrading.md for detailed migration guide.
Full Changelog: v0.11.0...v0.12.0
v0.11.0
What's New
- Bulk withdraw: `POST /v1/mitigations/withdraw` accepts up to 100 mitigation IDs with partial success semantics. Frontend adds checkbox selection on active/escalated rows, select-all, selection toolbar, and confirmation dialog. Critical during false-positive waves.
- FlowSpec rule preview: Mitigation detail page shows a router-style one-liner (`match destination 203.0.113.10/32 protocol udp destination-port 53 then rate-limit 5 Mbps`) above the structured grid. Copy Rule button for quick comparison with router CLI output.
- CVE gate in CI: Security audit (`cargo audit` + `bun audit`) now gates Docker publishing. CycloneDX SBOM generated as a release artifact on version tags.
- Vendor capability matrix: New `docs/vendors.md` with tested status for Juniper (verified), Arista (partially verified via community production deployment), Cisco IOS-XR, Nokia SR OS, and FRR. Reference import policies per vendor.
Security
- Next.js — Updated to 16.1.7 (CSRF bypass, HTTP smuggling, disk cache DoS, postpone DoS)
- undici — Updated to 7.24.4 (WebSocket overflow, request smuggling, memory DoS, CRLF injection)
- quinn-proto — Updated to 0.11.14 (RUSTSEC-2026-0037, DoS in Quinn endpoints)
- rollup — Updated to 4.59.0 (GHSA-mw96-cpmx-2vgc, arbitrary file write)
Dependencies
16 dependency updates including tonic 0.14.5, futures-util 0.3.32, Next.js 16.1.7, and 5 GitHub Actions major version bumps.
Full Changelog: v0.10.1...v0.11.0
v0.10.1
Bug Fixes
- FastNetMon re-ban collision: Deterministic `sha256(IP|direction)` event IDs caused permanent 409 duplicates after withdrawal. The integration script now uses UUID event IDs per ban occurrence, enabling re-ban and proper TTL extension for ongoing attacks. (#65)
- FastNetMon unban flow: Unban now queries active mitigations by `victim_ip` and withdraws directly, with a configurable retry window for ban/unban race timing. (#65)
- FastNetMon withdraw missing `operator_id`: Withdraw payload now includes `PREFIXD_OPERATOR` (default: `fastnetmon`) to satisfy API validation.
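The re-ban collision is easy to reproduce in miniature. In the Python sketch below, a toy store rejects any event ID it has already seen, so a deterministic `sha256(IP|direction)` ID collides on the second ban of the same address while a per-occurrence UUID does not. `EventStore` and the 202 success code are hypothetical stand-ins for the real API.

```python
import hashlib
import uuid


def deterministic_event_id(ip, direction):
    """Old scheme: the same (ip, direction) always maps to the same ID."""
    return hashlib.sha256(f"{ip}|{direction}".encode()).hexdigest()


def per_occurrence_event_id():
    """New scheme: a fresh ID for every ban occurrence."""
    return str(uuid.uuid4())


class EventStore:
    """Toy dedup store: an ID seen once is rejected forever after."""

    def __init__(self):
        self.seen = set()

    def submit(self, event_id):
        if event_id in self.seen:
            return 409  # duplicate event ID
        self.seen.add(event_id)
        return 202  # assumed success status, for illustration


store = EventStore()
first_ban = store.submit(deterministic_event_id("203.0.113.10", "incoming"))
# ... mitigation is withdrawn, then the same IP is attacked again ...
second_ban = store.submit(deterministic_event_id("203.0.113.10", "incoming"))
rebanned = store.submit(per_occurrence_event_id())
```

Here `second_ban` is rejected as a duplicate even though the earlier mitigation is gone, while `rebanned` goes through.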
Added
- `victim_ip` filter on mitigations API: `GET /v1/mitigations?victim_ip=X` filters by exact victim IP address. Supported in both single-POP and all-POPs queries.
- FastNetMon integration guide: New `docs/detectors/fastnetmon.md` with full setup, env vars, testing, and troubleshooting.
Full Changelog: v0.10.0...v0.10.1
v0.10.0
What's New
- Playbook editor: Edit playbooks directly from the dashboard with a form-based editor or raw YAML tab. `PUT /v1/config/playbooks` endpoint with full validation, atomic write with backup, and hot-reload.
- Interactive alerting config: Add, edit, and remove alert destinations from the dashboard. `PUT /v1/config/alerting` endpoint with secret merge (existing secrets preserved via `***` sentinel), atomic write to standalone `alerting.yaml`, and hot-reload. Type-specific forms for all 7 destination types with event filter checkboxes.
- Alerting config split: Alerting configuration moved from `prefixd.yaml` to standalone `alerting.yaml` with backward-compatible fallback.
- Event cross-links: Mitigation detail page shows clickable links to triggering and last (TTL extend) events.
- GHCR Docker publishing: CI publishes `prefixd` and `prefixd-dashboard` images to `ghcr.io/lance0/prefixd` and `ghcr.io/lance0/prefixd-dashboard`.
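The `***` sentinel merge can be pictured as follows: the dashboard receives redacted secrets, sends them back untouched, and the server substitutes the stored value. This Python sketch is a guess at the shape of that logic, not the actual handler. It matches destinations by `type` and refuses to guess when several share a type, in line with the ambiguity check listed under Security. The `secret` field name and `merge_secrets` helper are hypothetical.

```python
SENTINEL = "***"


def merge_secrets(existing, incoming, secret_fields=("secret",)):
    """Replace sentinel values in incoming destinations with stored secrets."""
    merged = []
    for dest in incoming:
        dest = dict(dest)  # do not mutate the caller's data
        for name in secret_fields:
            if dest.get(name) != SENTINEL:
                continue  # a real value was submitted; keep it
            candidates = [d for d in existing if d["type"] == dest["type"]]
            if len(candidates) != 1:
                # Zero or several same-type destinations: which secret is meant?
                raise ValueError(
                    f"cannot resolve {name!r} for type {dest['type']!r}: "
                    f"{len(candidates)} candidates"
                )
            dest[name] = candidates[0][name]
        merged.append(dest)
    return merged
```

Submitting a new non-sentinel value overwrites the stored secret, so rotating a credential works through the same endpoint.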
Security
- SSRF protection on webhook URLs (HTTPS required, localhost/private IPs rejected)
- Secret merge ambiguity detection (errors on multiple same-type destinations)
- Atomic config writes with symlink rejection for both playbooks and alerting
- Concurrent write serialization across merge/validate/save/swap
- reload_config() race fix (write locks held during load+swap)
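As a rough illustration of "atomic config writes with symlink rejection", the Python sketch below writes to a temporary file in the same directory and renames it over the target, after refusing symlinked paths and keeping a backup copy. It shows the general technique only. The `.bak` suffix and the `atomic_write` name are assumptions, and it does not cover the write serialization or lock handling mentioned in the list.

```python
import os
import shutil
import tempfile


def atomic_write(path, text):
    """Write text to path so readers see either the old or the new file."""
    if os.path.islink(path):
        raise ValueError(f"refusing to write through symlink: {path}")
    directory = os.path.dirname(os.path.abspath(path))
    if os.path.exists(path):
        shutil.copy2(path, path + ".bak")  # hypothetical backup naming
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # data reaches disk before the rename
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)  # never leave a partial temp file behind
        raise
```

A crash midway leaves either the previous config or the complete new one on disk, never a truncated file.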
Test Coverage
- 116 backend unit tests, 25 integration tests, 26 frontend tests
- No database migrations required
- No breaking API changes
Full Changelog: v0.9.1...v0.10.0
v0.9.1
What's New
Webhook Alerting Backend
7 destination types: Slack (Block Kit), Discord (embeds), Microsoft Teams (Adaptive Card), Telegram (Bot API), PagerDuty (Events API v2 with auto-resolve), OpsGenie (Alert API v2), Generic webhook (HMAC-SHA256 signed). Fire-and-forget dispatch with bounded concurrency (64 tasks), 3 retries with exponential backoff.
New endpoints: GET /v1/config/alerting (secrets redacted), POST /v1/config/alerting/test (admin-only).
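The dispatch pattern described above (bounded concurrency, retries with exponential backoff, failures that never propagate to the caller) can be sketched with `asyncio`. This is an illustrative analogue rather than the Rust implementation. The base delay, the reading of "3 retries" as one initial attempt plus three retries, and the helper names are assumptions.

```python
import asyncio

MAX_CONCURRENT = 64  # bound from the release note
MAX_RETRIES = 3      # retries after the initial attempt (assumed reading)


async def send_with_retry(send, payload, base_delay=0.5):
    """Try once, then retry up to MAX_RETRIES times, doubling the delay."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            return await send(payload)
        except Exception:
            if attempt == MAX_RETRIES:
                raise
            await asyncio.sleep(base_delay * 2 ** attempt)


async def dispatch_all(send, payloads, base_delay=0.5):
    """Send every payload with at most MAX_CONCURRENT deliveries in flight."""
    limiter = asyncio.Semaphore(MAX_CONCURRENT)

    async def guarded(payload):
        async with limiter:
            try:
                await send_with_retry(send, payload, base_delay)
                return True
            except Exception:
                return False  # fire-and-forget: record the failure, do not raise

    return await asyncio.gather(*(guarded(p) for p in payloads))
```

Because failures are swallowed inside `guarded`, a dead destination cannot block or crash the pipeline that produced the alert.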
Dashboard Quick Wins
- Alerting config UI — New tab on Config page with destination list and "Send Test Alert" button
- Audit log detail expansion — Click to expand full JSON details inline
- Customer/POP filters — Dropdown filters on mitigations page
- Timeseries range selector — 1h/6h/24h/7d toggle on activity chart
- Active count badge — Live mitigation count on sidebar nav
- Severity badges — Color-coded severity column on mitigations table
Testing Infrastructure
- Chaos test suite (17 tests: Postgres/GoBGP/prefixd kill, network outage)
- HTTP load test suite (7 tests: ~4,700 events/sec, ~8,000 health req/s)
- Benchmarks documentation with micro-benchmark and HTTP load test baselines
Security
- Login brute-force throttle hardened (atomic check, bounded state, TTL pruning)
- Generic webhook header values redacted in API response
- CIDR validation uses `ipnet::IpNet` (rejects invalid masks like /999)
- Alerting test endpoint restricted to admin role
- CSV formula sanitization regex strengthened
- Alert task queue bounded at 64 concurrent tasks
- Input validation on operator_id, withdraw reason fields
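The CIDR fix in the list above amounts to parsing prefixes with a real network type instead of string checks. A Python analogue using the standard `ipaddress` module is shown below. prefixd itself uses Rust's `ipnet::IpNet`, so the `parse_prefix` helper and its exact acceptance rules are illustrative and may differ in detail.

```python
import ipaddress


def parse_prefix(text):
    """Return a parsed network, or None if text is not a valid CIDR prefix."""
    try:
        # strict=False accepts host bits (10.0.0.1/24) and normalizes them away.
        return ipaddress.ip_network(text, strict=False)
    except ValueError:
        return None  # bad address, bad mask (e.g. /999), or garbage input
```

A mask such as `/999` fails in the parser itself, so no separate range check on the prefix length is needed.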
Changed
- 93 backend unit tests (up from 73), 26 frontend tests
- OpenAPI spec includes alerting endpoints
- Load/chaos scripts support `PREFIXD_API_TOKEN` for authenticated environments
Full Changelog: v0.9.0...v0.9.1