v1.1.0: session scoring + security hardening + scoring-recv exclusion override #3

Closed: dmichael-fastly wants to merge 1 commit into main from session-scoring

dmichael-fastly commented Jun 3, 2026

v1.1.0: session scoring + security hardening + scoring-recv exclusion override

This branch ships v1.1.0, three workstreams squashed from 70+ original
commits. 170+ files changed; make ci green (lint + format + mypy +
3,053 backend pytest + 9 vcl + 65 Rust scorer + 265 frontend vitest +
gitleaks no-leaks + OSV no-vulns). Deployed and verified on the dev VM.

1 — Session scoring

End-to-end edge anomaly-detection pipeline for Fastly Compute. Layer 1
behavioural (cookie compliance, impossibly-fast browsing, robotic
dwell) + Layer 2 transition-matrix scoring + combined 0–100 quantized
score. Dual-implemented in Python (backend/scoring) and Rust
(compute/scorer); cross-language wire-format tests pin the AES-GCM
cookie codec byte-for-byte.

  • Edge: Compute scorer, 6-snippet VCL preflight (recv / pass / fetch /
    deliver / miss / enforce), AES-GCM cookie carrying rotating sid +
    transition state, fastly.ddos_detected bypass.
  • Backend: training pipeline, FOS-published matrix versioning,
    labelled-session retrain loop, /scoring/evaluation + /scoring/health
    + composite /scoring/dashboard, matrix version history + rollback,
    AES key rotation with grace window, sliding cookie lifetime,
    scoring audit log, threshold enforcement that 429s flagged requests
    at the edge within seconds of commit.
  • Admin UI at /admin/session-scoring: StatusPanel with live ROC-AUC
    against accumulated labels, ScoringHealthCard, ThresholdSlider with
    counterfactual flag/pass preview + precision/recall, RocPrCurves,
    TopFlaggedTable, LabelsTab with click-to-view-events, RetrainButton,
    RotateKeyButton, MatrixVersionsCard, per-reason AUC breakdown,
    session-events viewer, ExcludeRegexCard (see §3), help popups.

See docs/session_scoring_runbook.md + docs/features.md for the runbook
and feature reference.
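
For reviewers unfamiliar with transition-matrix scoring, here is a minimal sketch of the general idea, not the scorer in backend/scoring or compute/scorer. It assumes a first-order matrix of page-to-page probabilities and maps mean surprise (negative log-probability) onto a 0–100 integer. The function name, the floor, and the max_surprise scale are all hypothetical.

```python
import math


def transition_score(pages, matrix, floor=1e-6, max_surprise=10.0):
    """Map the mean surprise of a page sequence to an integer in 0..100.

    matrix is a nested dict: matrix[src][dst] = P(dst | src).
    floor and max_surprise are illustrative tuning values, not project constants.
    """
    steps = list(zip(pages, pages[1:]))
    if not steps:
        return 0  # a single page view carries no transition evidence
    surprise = sum(
        # Unseen transitions get the floor probability instead of log(0).
        -math.log(max(matrix.get(src, {}).get(dst, 0.0), floor))
        for src, dst in steps
    ) / len(steps)
    # Clamp to the scale, then quantize to an integer score.
    return round(100 * min(surprise / max_surprise, 1.0))
```

Under these assumptions a session that only follows probability-1.0 transitions scores 0, and a session made of transitions the matrix has never seen saturates at 100.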

2 — Security hardening

Comprehensive hardening across the FastAPI backend, Fastly VCL, Next.js
frontend, and Rust scorer. Full breakdown in the ### Security block
of CHANGELOG.md 1.1.0; capability summary:

  • Trust-boundary normalisation — uvicorn --proxy-headers +
    --forwarded-allow-ips=127.0.0.1, Caddy peer-IP gating of
    Fastly-Client-IP → XFF rewrite, Caddy-injected
    X-Proxied-By-Caddy marker driving Next.js /admin gating in place
    of the (forgeable) Host header. Backend reads request.client.host
    as the trust signal everywhere; in-app leftmost-XFF parsing is gone.
  • Destructive-op auth — provisioning teardown, NGWAF workspace
    mutation, and NGWAF workspace listing all require a caller-supplied
    Fastly token validated via /tokens/self for global scope and
    service binding. Server-stored credentials are never used as a
    destructive-op fallback.
  • DuckDB user-SQL safety — new backend/utils/sql_validator.py wraps
    every /api/query call with a statement-type whitelist + recursive
    parse-tree walker (catalog + function blocklists, fail-closed parse,
    audit logging, perf budget). Replaces an incomplete regex-based
    blocklist that missed read_csv_auto, information_schema,
    duckdb_secrets, INSTALL/LOAD, and getenv. Plus escape_sql_literal
    helper applied across ingest sites with characterisation tests for
    the audit-PoC payload, multi-byte UTF-8, backslash, and empty inputs.
  • VCL header & cache discipline — vcl_recv preamble unsets every
    client-spoofable internal x-of-* / x-fos-edge-data /
    x-is-cluster-fetch / X-Edge-* header; origin-metric log fields are
    numeric-regex-gated and json.escape-wrapped; CDN vcl_hash keys on
    full req.url; CDN vcl_recv now also runs querystring.filter_except
    (S3-API allow-list) + querystring.sort so unexpected params can't
    fracture the cache or leak the auth key into req.hash.
  • Cross-tenant scope enforcement — /api/alerts/* and /api/views/*
    filter every read by analyst-session service_ids and gate every
    mutation with pre-flight get_alert_by_id / get_view_by_id lookups so
    unauthorised mutations never land. Cache layer audited: every
    per-tenant cache key includes service_id.
  • Path-traversal cages — /api/download and cache-cleanup paths
    realpath + commonpath check; service_id alphanumeric/dash regex at
    path helpers; bucket-name separator rejection in cleanup.
  • Secret & data hygiene — share-DB TOCTOU on claim_token replaced
    with atomic UPDATE-with-rowcount; quarantine narrowed to actual
    SQLite corruption signatures (was wiping the DB on any
    OperationalError); scrypt timing equalised across hit/miss to close
    the email-enumeration oracle; rate-limiter dicts bounded; stack-
    trace key stripped from HTTPException.detail with sweep fixture
    asserting no route leaks tracebacks.
  • SSH host-key pinning — configs/ssh_known_hosts with fail-safe
    loader; the tunnel manager refuses to start if the pinned file is
    missing/empty.
  • Scorer signal tightening — Python+Rust parity:
    L1_SCORE_COOKIE_TAMPERED=100 (was capped at 75 alongside
    missing/expired), L1_ROBOTIC_DWELL_LOW_S 0.5 → 0.20 (closes the
    0.20s–0.50s dwell-band evasion). Sliding-window mean (cookie schema
    v3) tracked as a 1.2 follow-up.
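
The realpath + commonpath cage mentioned above can be sketched roughly as follows. This is an illustration of the technique only; resolve_inside is a hypothetical helper name, not a function from this branch.

```python
import os


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Return the real path of user_path under base_dir, or raise ValueError."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, user_path))
    # commonpath compares whole path components, so a sibling such as
    # "/data-evil" does not pass as being inside "/data" the way a naive
    # startswith() check would allow.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate
```

Resolving first and comparing second is the point of the design: the check runs on the path the filesystem will actually open, not on the string the caller supplied.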

3 — Scoring-recv URL exclusion regex (new operator control)

The "which requests get sent to Compute" condition was previously a
hard-coded _ASSET_EXT_REGEX in code. Operators can now override it
per-service from the Session Scoring page; the default static-asset
extension list still ships as the fallback.

  • Backend — recv_snippet + generate_scoring_vcl accept an
    exclude_url_regex parameter; persisted in
    cfg.scoring.exclude_url_regex (None / "" = use default).
    update_recv_exclusion_regex orchestrator clones only the active
    version, swaps the recv snippet, validates, activates — ~5–15s vs.
    the full enable_scoring flow.
  • New endpoints — GET /api/services/{id}/scoring/exclude-regex
    (returns current + default + effective) and PUT
    /api/services/{id}/scoring/exclude-regex?confirm=true (token-gated;
    audit-logged as scoring_exclude_regex_changed).
  • Three-layer validation before any VCL ships:
    1. Input policy — length cap (2 KB), no double-quote / control
      chars, must compile under Python re.
    2. falco static analysis (github.com/ysugimoto/falco) on the
      assembled recv snippet (catches composition errors that slip past
      Python's compiler).
    3. Fastly's own VCL compiler at activate time.
  • Frontend — ExcludeRegexCard on the overview tab: textarea
    pre-populated with current value, "Show default" toggle, "Reset to
    default" button, inline lint-error display, confirm-dialog before
    publish.
  • Infra — falco v2.3.0 baked into the backend Docker image; production
    sets SCORING_REQUIRE_FALCO=1 so a missing binary fails closed
    instead of degrading to input-policy-only.
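
A minimal sketch of what the first validation layer (input policy) could look like, assuming the rules exactly as listed above. check_exclude_regex and MAX_REGEX_BYTES are hypothetical names, and the real validator may differ in how it counts length or reports errors.

```python
import re

MAX_REGEX_BYTES = 2048  # the 2 KB cap from the list above; constant name is assumed


def check_exclude_regex(pattern: str) -> list[str]:
    """Return a list of policy violations; an empty list means the input passes."""
    errors = []
    if len(pattern.encode("utf-8")) > MAX_REGEX_BYTES:
        errors.append("pattern exceeds 2 KB")
    if '"' in pattern:
        # A double quote could terminate the string literal in the recv snippet.
        errors.append("double quotes are not allowed")
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in pattern):
        errors.append("control characters are not allowed")
    try:
        re.compile(pattern)
    except re.error as exc:
        errors.append(f"does not compile under Python re: {exc}")
    return errors
```

Passing this layer only shows the pattern is well-formed for Python. The falco pass and Fastly's compiler remain the checks that see the assembled VCL.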

Tooling additions

  • Secret scanner — gitleaks v8.30.1 wired into pre-commit
    (.pre-commit-config.yaml), make secret-scan (chained into make ci),
    and .github/workflows/ci.yml. Configuration in .gitleaks.toml
    extends the default ruleset with path allowlists for tracked test
    fixtures, Rust lockfile checksums, the public SSH host key, and
    gitignored runtime directories. AGENTS.md §Secrets documents the
    policy and suppression playbook.

Infrastructure

  • Backend + frontend Docker base: python:3.12-slim-bullseye →
    python:3.12-slim-bookworm. Remaining base-image CVEs are deep-
    dependency / OpenSSL CVEs every major Python base inherits.
  • Falco v2.3.0 in the backend image — required by the scoring-recv-
    snippet validator.
  • Dependency freshness sweep on all four ecosystems:
    • Python: aiohttp 3.13.5 → 3.14.0, cfn-lint 1.51.2 → 1.51.4,
      distlib, filelock, idna 3.17 → 3.18, joserfc 1.6.8 → 1.7.0.
    • Frontend: @tanstack/react-query 5.100.14 → 5.101.0 (+ devtools),
      @types/react 19.2.15 → 19.2.16, eslint-config-next 16.2.6 →
      16.2.7, next 16.2.6 → 16.2.7, react / react-dom 19.2.6 → 19.2.7.
    • Rust: bitflags 2.11.1 → 2.12.1.
    • Deferred (major bumps reserved for 1.2): TypeScript 5.9 → 6.0
      (compiler-API breaking changes); Fastly Rust SDK 0.11 → 0.12
      (Compute@Edge API churn); jsdom / eslint / vitest where we're
      already ahead of the npm "latest" tag.

Versioning

Bumped to 1.1.0 in pyproject.toml, frontend/package.json, and the
FastAPI app.version. CHANGELOG updated under [1.1.0] - 2026-06-03
with Security + Infrastructure sections.

Test coverage

  backend pytest    3,053
  Rust scorer          65   (+8)
  frontend vitest     265   (+13)
  VCL tests             9   (same)

New test files for this release:
  tests/utils/test_sql_validator.py            (60)
  tests/utils/test_vcl_validator.py            (18)
  tests/test_proxy_headers_regression.py       (10)
  tests/test_no_trace_leakage_sweep.py          (4)
  tests/routers/test_provision_teardown_auth.py (9)
  tests/routers/test_cross_tenant_scope.py      (9)
  tests/routers/test_scoring_exclude_regex.py   (9)

Notes for reviewers

  • Branch was squashed from 70+ commits; full per-commit history is in
    git reflog locally. The squash makes this reviewable as one
    semantic unit (v1.1.0 release) instead of paging through unrelated
    intermediate work.
  • Every security-relevant change has acceptance tests. OpenAPI
    snapshot regenerated.
  • Stale v1.1.0 tag was deleted before the squash. After merge, tag
    main with v1.1.0 rather than the PR branch.

Test plan

  • make ci passes locally
  • Deployed to dev VM (fastly-log-analysis in us-central1-a) — all
    three containers healthy, GET /api/health returns 200
  • Falco verified in production image: v2.3.0
  • Exclude-regex endpoint reachable + returns expected shape
  • CDN VCL active version updated with the new querystring.filter_except
    + querystring.sort and the full req.url cache key
  • Gitleaks scan clean against full branch history
  • Reviewer: open /admin/session-scoring, scroll to the URL
    exclusion regex card, paste a custom regex (e.g. \.(healthz)$),
    click Save → publish flow completes

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

dmichael-fastly added a commit that referenced this pull request Jun 3, 2026
…ecv exclusion override

This branch ships v1.1.0, three workstreams squashed from 70+ original
commits. 170+ files changed; `make ci` green (lint + format + mypy +
3,087 backend pytest + 9 vcl + 65 Rust scorer + 265 frontend vitest +
OSV no-vulns). Deployed and verified on the dev VM.

## 1 — Session scoring (Phases A / B / C)

End-to-end edge anomaly-detection pipeline for Fastly Compute. Layer 1
behavioural (cookie compliance, impossibly-fast browsing, robotic
dwell) + Layer 2 transition-matrix scoring + combined 0–100 quantized
score. Dual-implemented in Python (backend/scoring) and Rust
(compute/scorer); cross-language wire-format tests pin the AES-GCM
cookie codec byte-for-byte.

- Edge: Compute scorer, 6-snippet VCL preflight (recv / pass / fetch /
  deliver / miss / enforce), AES-GCM cookie carrying rotating sid +
  transition state, `fastly.ddos_detected` bypass.
- Backend: training pipeline, FOS-published matrix versioning,
  labelled-session retrain loop, /scoring/evaluation + /scoring/health
  + composite /scoring/dashboard, matrix version history + rollback,
  AES key rotation with grace window, sliding cookie lifetime,
  scoring audit log, threshold enforcement that 429s flagged requests
  at the edge within seconds of commit.
- Admin UI at /admin/session-scoring: StatusPanel with live ROC-AUC
  against accumulated labels, ScoringHealthCard, ThresholdSlider with
  counterfactual flag/pass preview + precision/recall, RocPrCurves,
  TopFlaggedTable, LabelsTab with click-to-view-events, RetrainButton,
  RotateKeyButton, MatrixVersionsCard, per-reason AUC breakdown,
  session-events viewer, ExcludeRegexCard (see §3), help popups.

See docs/session_scoring_runbook.md + docs/features.md for the runbook
and feature reference.

## 2 — Security remediation (Phases 0–4)

40-finding security audit closed in five phases. All fixes deployed
and verified. Full breakdown in the `### Security` block of
CHANGELOG.md 1.1.0; one-line summary per phase:

- Phase 0 — uvicorn `--proxy-headers` + Host-spoof bypass /
  leftmost-XFF / teardown-auth (#012, #013, #029, #017, #034). Three
  extras spotted during the IR sweep:
  - E1 Caddyfile peer-IP gate on `Fastly-Client-IP → XFF` rewrite
    (port 80 was open to `0.0.0.0/0`).
  - E2 docker json-file log rotation (50 MB × 10 compressed).
  - E3 `generate_analyst_invite` fail-fast on missing token +
    defensive Fastly-response shape check.
- Phase 1 — 14-finding trivial sweep: tenant scoping (#7 / #008 /
  #019), TOCTOU (#2), quarantine narrowing (#3), email-enum
  timing equalisation (#4), `isoparse` validation (#1),
  `service_id` path-traversal regex (#6), session re-sync (#010),
  rate-limiter bounds (#014), VCL UA/referer cap (#016), GET→POST CSRF
  (#020), read_only query-param removal (#026), stack-trace strip
  (#027 / #028) + sweep fixture.
- Phase 2 — backend/utils/sql_validator.py implements Decision B:
  statement-type whitelist + recursive parse-tree walker (catalog +
  function blocklists) + fail-closed parse + audit log + perf budget
  (#031 / #033 / #035). escape_sql_literal helper + characterisation
  tests at four ingest sites (#009). VCL preamble unsetting
  client-spoofable internal headers (#021). Origin-metric VCL
  log-injection gates (#015). Path-traversal cages in /api/download
  (#5) + cache cleanup (#022). SSH host-key pinning via
  configs/ssh_known_hosts with fail-safe _ensure_known_hosts (#011).
- Phase 3 — Fastly vcl_hash keys on full req.url not just path
  (#024). Next.js /admin middleware gates on Caddy-injected
  X-Proxied-By-Caddy marker not Host header (#032). Scorer
  Python+Rust parity: L1_SCORE_COOKIE_TAMPERED=100,
  L1_ROBOTIC_DWELL_LOW_S 0.5 → 0.20 (#036 / #037). #038 sliding-
  window mean documented as tracked follow-up.
- Phase 4 — cross-tenant scope enforcement on /api/alerts/* and
  /api/views/* with pre-flight get_alert_by_id / get_view_by_id
  helpers so unauthorised mutations never land (#039 / #040). NGWAF
  workspace listing token-gated (#018). #025 covered by Phase 0 #017.
  Cache-layer audit confirmed every per-tenant cache includes
  service_id in the key.
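
As an illustration of the TOCTOU fix (#2), the sketch below shows the general atomic UPDATE-with-rowcount pattern using the standard sqlite3 module. The share_tokens table, its columns, and the claim_token signature are assumptions for the example, not the schema in this branch.

```python
import sqlite3


def claim_token(conn: sqlite3.Connection, token: str) -> bool:
    """Claim a token exactly once; True only for the caller that wins."""
    with conn:  # commits on success, rolls back if the statement raises
        cur = conn.execute(
            # The "claimed = 0" predicate makes check and write one statement,
            # so two concurrent callers cannot both observe an unclaimed row.
            "UPDATE share_tokens SET claimed = 1 WHERE token = ? AND claimed = 0",
            (token,),
        )
    return cur.rowcount == 1
```

The racy shape this replaces is a SELECT followed by a separate UPDATE, where a second caller can pass the SELECT before the first caller's UPDATE lands.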

## 3 — Scoring-recv URL exclusion regex (new operator control)

The "which requests get sent to Compute" condition was previously a
hard-coded _ASSET_EXT_REGEX in code. Operators can now override it
per-service from the Session Scoring page; the default static-asset
extension list still ships as the fallback.

- Backend — recv_snippet + generate_scoring_vcl accept an
  exclude_url_regex parameter; persisted in
  cfg.scoring.exclude_url_regex (None / "" = use default).
  update_recv_exclusion_regex orchestrator clones only the active
  version, swaps the recv snippet, validates, activates — ~5–15s vs.
  the full enable_scoring flow.
- New endpoints — GET /api/services/{id}/scoring/exclude-regex
  (returns current + default + effective) and PUT
  /api/services/{id}/scoring/exclude-regex?confirm=true (token-gated;
  audit-logged as scoring_exclude_regex_changed).
- Three-layer validation before any VCL ships:
  1. Input policy — length cap (2 KB), no double-quote / control
     chars, must compile under Python re.
  2. falco static analysis (github.com/ysugimoto/falco) on the
     assembled recv snippet (catches composition errors that slip past
     Python's compiler).
  3. Fastly's own VCL compiler at activate time.
- Frontend — ExcludeRegexCard on the overview tab: textarea
  pre-populated with current value, "Show default" toggle, "Reset to
  default" button, inline lint-error display, confirm-dialog before
  publish.
- Infra — falco v2.3.0 baked into the backend Docker image; production
  sets SCORING_REQUIRE_FALCO=1 so a missing binary fails closed
  instead of degrading to input-policy-only.
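
The "None / "" = use default" rule and the current + default + effective response could be sketched like this. The default pattern below is a placeholder (the real _ASSET_EXT_REGEX value is not shown in this PR), and the function name and exact JSON keys are assumptions.

```python
from typing import Optional

# Placeholder only; stands in for the project's _ASSET_EXT_REGEX.
DEFAULT_EXCLUDE_REGEX = r"\.(css|js|png)$"


def exclude_regex_view(stored: Optional[str]) -> dict:
    """Build the GET payload: stored override, shipped default, and what applies."""
    # Both None and the empty string mean "no override", so the default wins.
    effective = stored if stored else DEFAULT_EXCLUDE_REGEX
    return {
        "current": stored,
        "default": DEFAULT_EXCLUDE_REGEX,
        "effective": effective,
    }
```

Returning all three values lets the UI show the default and offer a reset without a second request.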

## Infrastructure

- Backend + frontend Docker base: python:3.12-slim-bullseye →
  python:3.12-slim-bookworm (cuts CVE-laden Debian 11 base; remaining
  13 high CVEs are deep-dependency / OpenSSL CVEs every major Python
  base inherits).
- Falco v2.3.0 in the backend image — required by the scoring-recv-
  snippet validator.
- Dependency freshness sweep on all four ecosystems:
  - Python: aiohttp 3.13.5 → 3.14.0, cfn-lint 1.51.2 → 1.51.4,
    distlib, filelock, idna 3.17 → 3.18, joserfc 1.6.8 → 1.7.0.
  - Frontend: @tanstack/react-query 5.100.14 → 5.101.0 (+ devtools),
    @types/react 19.2.15 → 19.2.16, eslint-config-next 16.2.6 →
    16.2.7, next 16.2.6 → 16.2.7, react / react-dom 19.2.6 → 19.2.7.
  - Rust: bitflags 2.11.1 → 2.12.1.
  - Deferred (major bumps reserved for 1.2): TypeScript 5.9 → 6.0
    (compiler-API breaking changes); Fastly Rust SDK 0.11 → 0.12
    (Compute@Edge API churn); jsdom / eslint / vitest where we're
    already ahead of the npm "latest" tag.

## Versioning

Bumped to 1.1.0 in pyproject.toml, frontend/package.json, and the
FastAPI app.version. CHANGELOG updated under [1.1.0] - 2026-06-03
with Security + Infrastructure sections.

## Test coverage

  backend pytest    3,087   (+321 vs v1.0.0)
  Rust scorer          65   (+8)
  frontend vitest     265   (+13)
  VCL tests             9   (same)

New test files for this release:
  tests/utils/test_sql_validator.py            (60)
  tests/utils/test_vcl_validator.py            (18)
  tests/test_proxy_headers_regression.py       (10)
  tests/test_no_trace_leakage_sweep.py          (4)
  tests/routers/test_provision_teardown_auth.py (9)
  tests/routers/test_cross_tenant_scope.py      (9)
  tests/routers/test_scoring_exclude_regex.py   (9)

## Notes for reviewers

- Branch was squashed from 70+ commits; full per-commit history is in
  git reflog locally. The squash makes this reviewable as one
  semantic unit (v1.1.0 release) instead of paging through unrelated
  intermediate work.
- Every security fix has acceptance tests. OpenAPI snapshot
  regenerated.
- Audit-finding working docs (docs/security_remediation_final_*.md,
  audit-findings/) were intentionally .gitignored and cleaned up at
  the end of the cycle — a fresh audit will produce fresh artifacts.
- Stale v1.1.0 tag was deleted before the squash. After merge, tag
  main with v1.1.0 rather than the PR branch.

## Test plan

- [x] `make ci` passes locally
- [x] Deployed to dev VM (fastly-log-analysis in us-central1-a) — all
      three containers healthy, GET /api/health returns 200
- [x] Falco verified in production image: v2.3.0
- [x] Exclude-regex endpoint reachable + returns expected shape
- [x] Off-network attack probes: Host-spoof, SQL injection
      (read_csv_auto, information_schema, getenv, duckdb_secrets),
      unauthenticated teardown — all rejected with expected status codes
- [ ] Reviewer: open /admin/session-scoring, scroll to the URL
      exclusion regex card, paste a custom regex (e.g. \.(healthz)$),
      click Save → publish flow completes

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
dmichael-fastly added a commit that referenced this pull request Jun 3, 2026
…ecv exclusion override

This branch ships v1.1.0, three workstreams squashed from 70+ original
commits. 170+ files changed; `make ci` green (lint + format + mypy +
3,087 backend pytest + 9 vcl + 65 Rust scorer + 265 frontend vitest +
OSV no-vulns). Deployed and verified on the dev VM.

## 1 — Session scoring (Phases A / B / C)

End-to-end edge anomaly-detection pipeline for Fastly Compute. Layer 1
behavioural (cookie compliance, impossibly-fast browsing, robotic
dwell) + Layer 2 transition-matrix scoring + combined 0–100 quantized
score. Dual-implemented in Python (backend/scoring) and Rust
(compute/scorer); cross-language wire-format tests pin the AES-GCM
cookie codec byte-for-byte.

- Edge: Compute scorer, 6-snippet VCL preflight (recv / pass / fetch /
  deliver / miss / enforce), AES-GCM cookie carrying rotating sid +
  transition state, `fastly.ddos_detected` bypass.
- Backend: training pipeline, FOS-published matrix versioning,
  labelled-session retrain loop, /scoring/evaluation + /scoring/health
  + composite /scoring/dashboard, matrix version history + rollback,
  AES key rotation with grace window, sliding cookie lifetime,
  scoring audit log, threshold enforcement that 429s flagged requests
  at the edge within seconds of commit.
- Admin UI at /admin/session-scoring: StatusPanel with live ROC-AUC
  against accumulated labels, ScoringHealthCard, ThresholdSlider with
  counterfactual flag/pass preview + precision/recall, RocPrCurves,
  TopFlaggedTable, LabelsTab with click-to-view-events, RetrainButton,
  RotateKeyButton, MatrixVersionsCard, per-reason AUC breakdown,
  session-events viewer, ExcludeRegexCard (see §3), help popups.

See docs/session_scoring_runbook.md + docs/features.md for the runbook
and feature reference.

## 2 — Security remediation (Phases 0–4)

40-finding security audit closed in five phases. All fixes deployed
and verified. Full breakdown in the `### Security` block of
CHANGELOG.md 1.1.0; one-line summary per phase:

- Phase 0 — uvicorn `--proxy-headers` + Host-spoof bypass /
  leftmost-XFF / teardown-auth (#012, #013, #029, #017, #034). Three
  extras spotted during the IR sweep:
  - E1 Caddyfile peer-IP gate on `Fastly-Client-IP → XFF` rewrite
    (port 80 was open to `0.0.0.0/0`).
  - E2 docker json-file log rotation (50 MB × 10 compressed).
  - E3 `generate_analyst_invite` fail-fast on missing token +
    defensive Fastly-response shape check.
- Phase 1 — 14-finding trivial sweep: tenant scoping (#7 / #008 /
  #019), TOCTOU (#2), quarantine narrowing (#3), email-enum
  timing equalisation (#4), `isoparse` validation (#1),
  `service_id` path-traversal regex (#6), session re-sync (#010),
  rate-limiter bounds (#014), VCL UA/referer cap (#016), GET→POST CSRF
  (#020), read_only query-param removal (#026), stack-trace strip
  (#027 / #028) + sweep fixture.
- Phase 2 — backend/utils/sql_validator.py implements Decision B:
  statement-type whitelist + recursive parse-tree walker (catalog +
  function blocklists) + fail-closed parse + audit log + perf budget
  (#031 / #033 / #035). escape_sql_literal helper + characterisation
  tests at four ingest sites (#009). VCL preamble unsetting
  client-spoofable internal headers (#021). Origin-metric VCL
  log-injection gates (#015). Path-traversal cages in /api/download
  (#5) + cache cleanup (#022). SSH host-key pinning via
  configs/ssh_known_hosts with fail-safe _ensure_known_hosts (#011).
- Phase 3 — Fastly vcl_hash keys on full req.url not just path
  (#024). Next.js /admin middleware gates on Caddy-injected
  X-Proxied-By-Caddy marker not Host header (#032). Scorer
  Python+Rust parity: L1_SCORE_COOKIE_TAMPERED=100,
  L1_ROBOTIC_DWELL_LOW_S 0.5 → 0.20 (#036 / #037). #038
  sliding-window mean documented as tracked follow-up.
- Phase 4 — cross-tenant scope enforcement on /api/alerts/* and
  /api/views/* with pre-flight get_alert_by_id / get_view_by_id
  helpers so unauthorised mutations never land (#039 / #040). NGWAF
  workspace listing token-gated (#018). #025 covered by Phase 0 #017.
  Cache-layer audit confirmed every per-tenant cache includes
  service_id in the key.
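
As an illustration of the path-traversal cages mentioned under Phase 2, here is a minimal sketch of the general technique. It is not the code that shipped: `resolve_inside` and the example root directory are hypothetical names, and the real /api/download handler may differ.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, requested: str) -> Path:
    """Resolve `requested` under `base_dir`, refusing anything that escapes it.

    Hypothetical helper for illustration only.
    """
    base = base_dir.resolve()
    # Joining with an absolute `requested` discards `base`, and ".." segments
    # are collapsed by resolve(); both cases are caught by the check below.
    candidate = (base / requested).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the download root")
    return candidate
```

Resolving before the containment check matters: comparing raw strings would miss `..` segments and symlinks that point outside the root.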

## 3 — Scoring-recv URL exclusion regex (new operator control)

The "which requests get sent to Compute" condition was previously a
hard-coded _ASSET_EXT_REGEX in code. Operators can now override it
per-service from the Session Scoring page; the default static-asset
extension list still ships as the fallback.

- Backend — recv_snippet + generate_scoring_vcl accept an
  exclude_url_regex parameter; persisted in
  cfg.scoring.exclude_url_regex (None / "" = use default).
  update_recv_exclusion_regex orchestrator clones only the active
  version, swaps the recv snippet, validates, activates — ~5–15s vs.
  the full enable_scoring flow.
- New endpoints — GET /api/services/{id}/scoring/exclude-regex
  (returns current + default + effective) and PUT
  /api/services/{id}/scoring/exclude-regex?confirm=true (token-gated;
  audit-logged as scoring_exclude_regex_changed).
- Three-layer validation before any VCL ships:
  1. Input policy — length cap (2 KB), no double-quote / control
     chars, must compile under Python re.
  2. falco static analysis (github.com/ysugimoto/falco) on the
     assembled recv snippet (catches composition errors that slip past
     Python's compiler).
  3. Fastly's own VCL compiler at activate time.
- Frontend — ExcludeRegexCard on the overview tab: textarea
  pre-populated with current value, "Show default" toggle, "Reset to
  default" button, inline lint-error display, confirm-dialog before
  publish.
- Infra — falco v2.3.0 baked into the backend Docker image; production
  sets SCORING_REQUIRE_FALCO=1 so a missing binary fails closed
  instead of degrading to input-policy-only.
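
The first validation layer (input policy) could look roughly like the sketch below. This is an assumption-laden illustration, not the shipped validator: `check_exclude_regex` and `MAX_REGEX_BYTES` are hypothetical names, and the cap is assumed to be measured in UTF-8 bytes. Layers 2 and 3 (falco and Fastly's compiler) are not modelled here.

```python
import re

MAX_REGEX_BYTES = 2048  # assumed interpretation of the "2 KB" length cap


def check_exclude_regex(pattern: str) -> list[str]:
    """Return a list of policy violations; an empty list means the input passes.

    Hypothetical helper for illustration only.
    """
    errors: list[str] = []
    if len(pattern.encode("utf-8")) > MAX_REGEX_BYTES:
        errors.append("pattern exceeds 2 KB")
    if '"' in pattern:
        # A double quote could terminate the VCL string literal early.
        errors.append("double quotes are not allowed")
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in pattern):
        errors.append("control characters are not allowed")
    try:
        re.compile(pattern)
    except re.error as exc:
        errors.append(f"does not compile under Python re: {exc}")
    return errors
```

Collecting every violation instead of stopping at the first one suits an inline lint-error display, since the operator sees all problems in one pass.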

## Infrastructure

- Backend + frontend Docker base: python:3.12-slim-bullseye →
  python:3.12-slim-bookworm (cuts CVE-laden Debian 11 base; remaining
  13 high CVEs are deep-dependency / OpenSSL CVEs every major Python
  base inherits).
- Falco v2.3.0 in the backend image, required by the
  scoring-recv-snippet validator.
- Dependency freshness sweep on all four ecosystems:
  - Python: aiohttp 3.13.5 → 3.14.0, cfn-lint 1.51.2 → 1.51.4,
    distlib, filelock, idna 3.17 → 3.18, joserfc 1.6.8 → 1.7.0.
  - Frontend: @tanstack/react-query 5.100.14 → 5.101.0 (+ devtools),
    @types/react 19.2.15 → 19.2.16, eslint-config-next 16.2.6 →
    16.2.7, next 16.2.6 → 16.2.7, react / react-dom 19.2.6 → 19.2.7.
  - Rust: bitflags 2.11.1 → 2.12.1.
  - Deferred (major bumps reserved for 1.2): TypeScript 5.9 → 6.0
    (compiler-API breaking changes); Fastly Rust SDK 0.11 → 0.12
    (Compute@Edge API churn); jsdom / eslint / vitest where we're
    already ahead of the npm "latest" tag.

## Versioning

Bumped to 1.1.0 in pyproject.toml, frontend/package.json, and the
FastAPI app.version. CHANGELOG updated under [1.1.0] - 2026-06-03
with Security + Infrastructure sections.

## Test coverage

  backend pytest    3,087   (+321 vs v1.0.0)
  Rust scorer          65   (+8)
  frontend vitest     265   (+13)
  VCL tests             9   (same)

New test files for this release:
  tests/utils/test_sql_validator.py            (60)
  tests/utils/test_vcl_validator.py            (18)
  tests/test_proxy_headers_regression.py       (10)
  tests/test_no_trace_leakage_sweep.py          (4)
  tests/routers/test_provision_teardown_auth.py (9)
  tests/routers/test_cross_tenant_scope.py      (9)
  tests/routers/test_scoring_exclude_regex.py   (9)

## Notes for reviewers

- Branch was squashed from 70+ commits; full per-commit history is in
  git reflog locally. The squash makes this reviewable as one
  semantic unit (v1.1.0 release) instead of paging through unrelated
  intermediate work.
- Every security fix has acceptance tests. OpenAPI snapshot
  regenerated.
- Audit-finding working docs (docs/security_remediation_final_*.md,
  audit-findings/) were intentionally .gitignored and cleaned up at
  the end of the cycle — a fresh audit will produce fresh artifacts.
- Stale v1.1.0 tag was deleted before the squash. After merge, tag
  main with v1.1.0 rather than the PR branch.

## Test plan

- [x] `make ci` passes locally
- [x] Deployed to dev VM (fastly-log-analysis in us-central1-a) — all
      three containers healthy, GET /api/health returns 200
- [x] Falco verified in production image: v2.3.0
- [x] Exclude-regex endpoint reachable + returns expected shape
- [x] Off-network attack probes: Host-spoof, SQL injection
      (read_csv_auto, information_schema, getenv, duckdb_secrets),
      unauthenticated teardown — all rejected with expected status codes
- [ ] Reviewer: open /admin/session-scoring, scroll to the URL
      exclusion regex card, paste a custom regex (e.g. `\.(healthz)$`),
      click Save → publish flow completes

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
dmichael-fastly added a commit that referenced this pull request Jun 3, 2026
…ecv exclusion override

This branch ships v1.1.0, three workstreams squashed from 70+ original
commits. 170+ files changed; `make ci` green (lint + format + mypy +
3,087 backend pytest + 9 vcl + 65 Rust scorer + 265 frontend vitest +
OSV no-vulns). Deployed and verified on the dev VM.

## 1 — Session scoring (Phases A / B / C)

End-to-end edge anomaly-detection pipeline for Fastly Compute. Layer 1
behavioural (cookie compliance, impossibly-fast browsing, robotic
dwell) + Layer 2 transition-matrix scoring + combined 0–100 quantized
score. Dual-implemented in Python (backend/scoring) and Rust
(compute/scorer); cross-language wire-format tests pin the AES-GCM
cookie codec byte-for-byte.

- Edge: Compute scorer, 6-snippet VCL preflight (recv / pass / fetch /
  deliver / miss / enforce), AES-GCM cookie carrying rotating sid +
  transition state, `fastly.ddos_detected` bypass.
- Backend: training pipeline, FOS-published matrix versioning,
  labelled-session retrain loop, /scoring/evaluation + /scoring/health
  + composite /scoring/dashboard, matrix version history + rollback,
  AES key rotation with grace window, sliding cookie lifetime,
  scoring audit log, threshold enforcement that 429s flagged requests
  at the edge within seconds of commit.
- Admin UI at /admin/session-scoring: StatusPanel with live ROC-AUC
  against accumulated labels, ScoringHealthCard, ThresholdSlider with
  counterfactual flag/pass preview + precision/recall, RocPrCurves,
  TopFlaggedTable, LabelsTab with click-to-view-events, RetrainButton,
  RotateKeyButton, MatrixVersionsCard, per-reason AUC breakdown,
  session-events viewer, ExcludeRegexCard (see §3), help popups.

@dmichael-fastly dmichael-fastly force-pushed the session-scoring branch 2 times, most recently from e6650ee to 70929f0 Compare June 3, 2026 22:02
@dmichael-fastly dmichael-fastly changed the title from "v1.1.0: session scoring + 40-finding security remediation + scoring-recv exclusion override" to "v1.1.0: session scoring + security hardening + scoring-recv exclusion override" Jun 3, 2026
@dmichael-fastly dmichael-fastly force-pushed the session-scoring branch 20 times, most recently from dc91229 to a446c3a Compare June 4, 2026 17:54
@dmichael-fastly dmichael-fastly force-pushed the session-scoring branch 5 times, most recently from 19b2f14 to 231d96a Compare June 5, 2026 14:26
Squashed from the working set on session-scoring. Covers the session
scoring + dashboard performance work from the prior squash baseline
plus the recent additions:

- DUCKDB_POOL_MAX_SIZE env knob (was hardcoded to 8 per service)
- run.sh: combine with any existing NODE_OPTIONS instead of clobbering
  it; refuse to bind ports commonly used by SSH tunnels to a remote
  backend/frontend
- Dashboard stale-view retry: detect when /api/dashboard/aggregates
  returns inconsistent results (metadata reports recent logs but every
  aggregation comes back empty) and let React Query retry up to twice.
  Mitigates the intermittent "no data" symptom during metadata_sync
  cron ticks; doesn't address the underlying writer contention.
- scripts/dev/sync-from-remote.sh: developer-only tool that mirrors a
  remote data tree locally and scrubs credentials/crons in the copied
  configs so the local backend can serve the synced volume without
  writing back.
- .vscode/ added to .gitignore (local editor config).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@dmichael-fastly

Collaborator Author

Closing and re-opening as a fresh PR.
