
feat(storage): per-finding hexad emission (issue #33 S1)#55

Closed
hyperpolymath wants to merge 2 commits into main from feat/issue-33-s1-finding-hexads

Conversation

@hyperpolymath
Owner

Summary

First slice of #33. Adds a per-WeakPoint hexad emission path to persist_assemblyline_report so the estate sweep can persist one hexad per finding in addition to the existing aggregate hexad. Subject identity is finding:<repo>:<file>:<line>:<category> — stable across runs, which is the property the upcoming S2 (campaign register-pr) and S3 (cross-repo query) slices need to join on without diffing JSON blobs.
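For reviewers, a minimal sketch of how the subject identity above can be assembled. The helper name `finding_id` and its parameter types are assumptions for illustration; the function in the diff may be named and typed differently.

```rust
/// Hypothetical helper: builds the `finding:<repo>:<file>:<line>:<category>`
/// subject identity. Only the string layout comes from the PR description;
/// the name and signature are illustrative.
fn finding_id(repo: &str, file: &str, line: u32, category: &str) -> String {
    // The id depends only on location and category, never on a run id or
    // timestamp, which is what keeps it stable across runs.
    format!("finding:{repo}:{file}:{line}:{category}")
}
```

Calling `finding_id("alpha", "src/lib.rs", 23, "UnsafeCode")` yields `finding:alpha:src/lib.rs:23:UnsafeCode`, and changing only the category yields a different id.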

What's new

  • HexadSemantic.finding: Option<FindingSemantic> — additive, skip_serializing_if = "Option::is_none", so older readers and stored hexads are unaffected.
  • FindingSemantic struct carries the modalities the issue specified: finding_id, repo_name, file, line, category, rule_id, rule_name, severity, description, first_seen_run, last_seen_run, framework.
  • rule_id / rule_name reuse the canonical SARIF mapping (src/report/sarif.rs::rule_id / rule_name lifted to pub(crate) — same PA001..PA025 codes the SARIF output already uses).
  • build_finding_hexads(report) -> Vec<PanicAttackHexad> — one hexad per non-suppressed WeakPoint across every repo result.
  • STORE_FINDING_HEXADS_ENV = "PANIC_ATTACK_STORE_FINDING_HEXADS" — when set non-empty AND StorageMode::VerisimDb is configured, persist_assemblyline_report writes one file per finding under <dir>/hexads/findings/ in addition to the existing aggregate hexad in <dir>/hexads/.
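The gating rule in the last bullet can be sketched as below. This is an illustration under assumptions, not the code in the diff: the function names are hypothetical, and `StorageMode` is reduced to the two variants mentioned in this PR.

```rust
use std::env;

const STORE_FINDING_HEXADS_ENV: &str = "PANIC_ATTACK_STORE_FINDING_HEXADS";

/// Reduced stand-in for the crate's storage mode enum (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StorageMode {
    Filesystem,
    VerisimDb,
}

/// "Set non-empty": an unset variable and an empty string both count as off.
fn env_value_enables(value: Option<&str>) -> bool {
    matches!(value, Some(v) if !v.is_empty())
}

/// Per-finding writes need BOTH the VerisimDb mode and the env opt-in.
fn should_write_finding_hexads(mode: StorageMode, env_value: Option<&str>) -> bool {
    mode == StorageMode::VerisimDb && env_value_enables(env_value)
}

/// Thin wrapper that reads the real process environment.
fn should_write_from_process_env(mode: StorageMode) -> bool {
    let value = env::var(STORE_FINDING_HEXADS_ENV).ok();
    should_write_finding_hexads(mode, value.as_deref())
}
```

Keeping the decision in a pure function that takes the value as an argument makes it testable without mutating the process environment.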

Behaviour preserved

  • Default path unchanged: env var unset → no per-finding writes. No CLI flag added; opt-in is purely env-driven.
  • Aggregate hexad still emitted in every VerisimDb run.
  • Suppressed WeakPoints skipped — keeps the hexad store aligned with fleet/CI counts (same convention as elsewhere).
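A rough sketch of the "one hexad per non-suppressed WeakPoint across every repo result" rule. The struct shapes and the `finding_ids` and `weak_point` helpers are hypothetical stand-ins, and the sketch returns ids rather than full `PanicAttackHexad` values to stay self-contained.

```rust
/// Hypothetical, trimmed-down stand-ins for the report types.
struct WeakPoint {
    file: String,
    line: u32,
    category: String,
    suppressed: bool,
}

struct RepoResult {
    repo_name: String,
    weak_points: Vec<WeakPoint>,
}

/// Convenience constructor used only by this sketch.
fn weak_point(file: &str, line: u32, category: &str, suppressed: bool) -> WeakPoint {
    WeakPoint {
        file: file.to_string(),
        line,
        category: category.to_string(),
        suppressed,
    }
}

/// Emits one id per non-suppressed weak point, walking every repo result.
fn finding_ids(results: &[RepoResult]) -> Vec<String> {
    results
        .iter()
        .flat_map(|repo| {
            repo.weak_points
                .iter()
                // Suppressed findings are skipped so counts match fleet/CI.
                .filter(|wp| !wp.suppressed)
                .map(move |wp| {
                    format!(
                        "finding:{}:{}:{}:{}",
                        repo.repo_name, wp.file, wp.line, wp.category
                    )
                })
        })
        .collect()
}
```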

Out of scope (S2 / S3)

  • Back-stamping first_seen_run from a prior hexad — S1 sets first_seen_run == last_seen_run. S2 is responsible.
  • HTTP push for finding hexads — file-side only for S1 to keep API chattiness out of the early surface.
  • New CLI subcommand (panic-attack campaign) — S2.
  • panic-attack query — S3.

Test plan

  • cargo test --all — 215 lib + 13 + 16 + 6 + 12 + 3 + 7 + 12 + 14 + 20 + 10 + 8 + 22 + 22 + 12 + 2 doc — all green.
  • cargo clippy --all-targets -- -D warnings — clean.
  • cargo fmt --all — clean.
  • 7 new tests covering: id stability across calls, id discrimination by category, one-hexad-per-WeakPoint, suppression skipped, canonical PA-codes, file write + JSON round-trip, env-var default-off.

Refs #33.

🤖 Generated with Claude Code

Adds a per-WeakPoint hexad path to persist_assemblyline_report so a
batch scan can persist one hexad per finding in addition to the existing
aggregate hexad. Subject identity is `finding:<repo>:<file>:<line>:<category>`,
chosen for cross-run stability so the upcoming S2 (campaign register-pr)
and S3 (query) slices can join on it without diffing JSON.

New public surface:
- HexadSemantic gains an optional `finding: Option<FindingSemantic>`
  (additive, skip_serializing_if = none → existing consumers unaffected).
- FindingSemantic carries finding_id / repo / file / line / category /
  rule_id / rule_name / severity / description / first_seen_run /
  last_seen_run / framework. rule_id and rule_name reuse the canonical
  SARIF mapping (sarif.rs::rule_id / rule_name now pub(crate)).
- build_finding_hexads(report) -> Vec<PanicAttackHexad>.
- STORE_FINDING_HEXADS_ENV = "PANIC_ATTACK_STORE_FINDING_HEXADS" — when
  set non-empty AND StorageMode::VerisimDb is configured,
  persist_assemblyline_report writes one file per finding under
  `<dir>/hexads/findings/`.

Behaviour preserved:
- Default path unchanged (env var off → no per-finding writes).
- Aggregate hexad still emitted in every VerisimDb run.
- Suppressed WeakPoints are skipped, keeping the store aligned with
  fleet/CI counts.

S1 sets first_seen_run == last_seen_run; back-stamping from a prior
hexad is S2's job (per the issue), not S1's.

Tests: 7 new (id stability, category discrimination, count per WP,
suppression skip, canonical rule_id/name, file write + round-trip,
env-var default-off). Full suite: 215 lib + 13 + 16 + 6 + 12 + 3 + 7
+ 12 + 14 + 20 + 10 + 8 + 22 + 22 + 12 + 2 doc — all green. Clippy
clean with -D warnings.

Refs #33.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… prerequisite)

Recon flagged that PANIC_ATTACK_STORE_FINDING_HEXADS=1 was dead without a
manifest configuring reports.storage-targets=verisimdb. The env check sat
inside the VerisimDb arm of persist_assemblyline_report, but storage_modes()
defaulted to [Filesystem] only — so the operational opt-in path was
unreachable without a fully-populated 0-AI-MANIFEST.a2ml.

Add resolve_storage_modes() that augments declared modes with VerisimDb when
the env var is truthy. Wire it at the single binding site in main.rs.
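A possible shape for that augmentation, as a sketch only. The signature is an assumption (the real `resolve_storage_modes` may read the env var itself), and `StorageMode` is reduced to two variants.

```rust
/// Reduced stand-in for the crate's storage mode enum (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StorageMode {
    Filesystem,
    VerisimDb,
}

/// Returns the declared modes, plus VerisimDb when the env opt-in is on.
/// `env_enabled` is passed in here so the sketch stays pure and testable.
fn resolve_storage_modes(declared: &[StorageMode], env_enabled: bool) -> Vec<StorageMode> {
    let mut modes = declared.to_vec();
    // Avoid a duplicate entry when the manifest already declares VerisimDb.
    if env_enabled && !modes.contains(&StorageMode::VerisimDb) {
        modes.push(StorageMode::VerisimDb);
    }
    modes
}
```

With the default `[Filesystem]` declaration and the opt-in on, this yields `[Filesystem, VerisimDb]`, which is what makes the VerisimDb arm reachable without a manifest.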

Smoke-verified end-to-end: assemblyline scan against a tiny multi-repo dir
now emits 5 per-finding hexads under hexads/findings/ from env var alone.

3 new tests + 1 existing finding_hexads_disabled_by_default test now share
a Mutex to serialize their env-var mutations under cargo's parallel runner.
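The shared-Mutex pattern looks roughly like the sketch below. `ENV_LOCK`, `lock_env`, and `with_env_var` are hypothetical names, not the ones in the test module. The `unsafe` blocks are there because `std::env::set_var` and `remove_var` are unsafe in the 2024 edition; on earlier editions they are merely redundant, hence the `allow`.

```rust
use std::env;
use std::sync::{Mutex, MutexGuard};

/// One process-wide lock shared by every test that touches the environment.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Takes the lock even if an earlier test panicked while holding it.
fn lock_env() -> MutexGuard<'static, ()> {
    ENV_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Runs `body` with `key` set (or unset), then restores the previous value.
/// Note: the value is not restored if `body` panics.
#[allow(unused_unsafe)]
fn with_env_var<T>(key: &str, value: Option<&str>, body: impl FnOnce() -> T) -> T {
    let _guard = lock_env();
    let previous = env::var_os(key);
    // Safe only because all env mutation in the tests goes through ENV_LOCK.
    unsafe {
        match value {
            Some(v) => env::set_var(key, v),
            None => env::remove_var(key),
        }
    }
    let result = body();
    unsafe {
        match previous {
            Some(v) => env::set_var(key, v),
            None => env::remove_var(key),
        }
    }
    result
}
```

Each env-dependent test wraps its assertions in `with_env_var(...)`, so cargo's parallel runner cannot interleave two tests' mutations of the same variable.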

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔍 Hypatia Security Scan

Findings: 45 issues detected

| Severity | Count |
| --- | --- |
| 🔴 Critical | 4 |
| 🟠 High | 16 |
| 🟡 Medium | 25 |

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Nickel file missing SPDX-License-Identifier header (1 occurrences, CWE-1104)",
    "type": "ncl_missing_spdx",
    "file": "/home/runner/work/panic-attack/panic-attack/reports/panic-attack-20260211180017.ncl",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "expect() in hot path (2 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/panic-attack/panic-attack/src/attestation/chain.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (1 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/attestation/evidence.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (1 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/ambush/mod.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (3 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/kanren/strategy.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (3 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/axial/mod.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "expect() in hot path (4 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/panic-attack/panic-attack/src/assail/analyzer.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "unwrap() without prior check -- DoS via panic (4 occurrences, CWE-754)",
    "type": "unwrap_without_check",
    "file": "/home/runner/work/panic-attack/panic-attack/benches/scan_bench.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "expect() in hot path (2 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/panic-attack/panic-attack/benches/scan_bench.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

hyperpolymath added a commit that referenced this pull request May 27, 2026
## Summary

Adds `panic-attack sweep-tracker` subcommand — an issue-#32-shaped sweep
tracker derived from the per-finding (issue #33 S1) and campaign-state
(issue #33 S2) hexad stores. Complements (does not replace) the existing
per-finding `campaign status` table.

Distinct from `campaign status`:
- **Hierarchical**, not flat — grouped by repo and/or by category.
- **Estate summary** header: total findings, repos, criticals, highs,
  PR-filed, dismissed, open-no-PR.
- **Always sourced from the finding store**: a finding with no campaign
  hexad still appears (state `open`); `campaign status` shows only
  rows with a campaign event.

### CLI

```
panic-attack sweep-tracker [--verisimdb-dir DIR] [--output FILE]
                          [--by-repo | --by-category]
```

No flag = both sections. `--by-repo` / `--by-category` select one
section only (mutually exclusive via clap arg group).
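The flag-to-shape mapping can be pictured as below. This is a std-only illustration, not the clap wiring: in the PR the mutual exclusion is enforced by a clap arg group, and `shape_from_flags` is a hypothetical helper.

```rust
/// Mirrors the `ReportShape::{ByRepo, ByCategory, Both}` enum named in
/// this PR; the derive list is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ReportShape {
    ByRepo,
    ByCategory,
    #[default]
    Both,
}

/// Hypothetical helper mapping the two boolean flags onto a shape.
fn shape_from_flags(by_repo: bool, by_category: bool) -> Result<ReportShape, String> {
    match (by_repo, by_category) {
        // No flag: render both sections.
        (false, false) => Ok(ReportShape::Both),
        (true, false) => Ok(ReportShape::ByRepo),
        (false, true) => Ok(ReportShape::ByCategory),
        // clap rejects this combination before the command runs.
        (true, true) => Err("--by-repo and --by-category are mutually exclusive".to_string()),
    }
}
```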

### Output shape

```
# Estate sweep tracker

_Generated <ISO>_

**Estate summary**: N findings across R repos (C critical, H high).
M PR-filed, D dismissed, U open (no PR).

## By repo

### alpha (2 findings, 1 critical)
- [x] PA001 src/lib.rs:23 — pr-merged ([#42](https://github.com/...))
- [ ] PA004 src/ffi.rs:7  — open
...
```

### Determinism

- Repos sorted alphabetically.
- Findings within each repo sorted by `(rule_id, file, line, finding_id)`.
- Categories sorted by `rule_id`.
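A sketch of the per-repo ordering, assuming a row type with the four key fields. `FindingRow`, `row`, and `sort_findings` are illustrative names only.

```rust
/// Hypothetical row type carrying just the sort-key fields.
#[derive(Debug, Clone, PartialEq, Eq)]
struct FindingRow {
    rule_id: String,
    file: String,
    line: u32,
    finding_id: String,
}

/// Convenience constructor used only by this sketch.
fn row(rule_id: &str, file: &str, line: u32) -> FindingRow {
    FindingRow {
        rule_id: rule_id.to_string(),
        file: file.to_string(),
        line,
        finding_id: format!("finding:alpha:{file}:{line}:UnsafeCode"),
    }
}

/// Sorts by `(rule_id, file, line, finding_id)`. Tuples compare field by
/// field, so the full key gives a total, run-independent order.
fn sort_findings(rows: &mut [FindingRow]) {
    rows.sort_by(|a, b| {
        (&a.rule_id, &a.file, a.line, &a.finding_id)
            .cmp(&(&b.rule_id, &b.file, b.line, &b.finding_id))
    });
}
```

Comparing `line` as an integer rather than as text keeps line 2 ahead of line 23.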

## Implementation

- New module `src/sweep_tracker/` with public `render_report(base_dir, shape)` and `ReportShape::{ByRepo, ByCategory, Both}` (default `Both`).
- Reuses `storage::load_finding_hexads` / `storage::load_campaign_hexads` (no new I/O paths).
- New CLI variant `Commands::SweepTracker` wired in `src/main.rs`.
- 7 unit tests (1 extra above the spec floor of 5):
  empty-store, by-repo grouping, by-category grouping, campaign-state
  join (open / pr-merged / dismissed), deterministic ordering,
  both-shape ordering, PR-number label parser.

## Notes on base

This PR depends on issue-#33 S1 (`feat/issue-33-s1-finding-hexads`,
PR #55) and S2 (`feat/issue-33-s2-campaign-state`, PR #56) for the
loaders and `CampaignSemantic` type. Branched off the S2 tip so the
diff is minimal; base is `main` as standing policy. Once #55 and #56
land this PR's diff will narrow to just `src/sweep_tracker/` plus the
small wiring delta in `src/lib.rs` + `src/main.rs`.

## Test plan

- [x] `cargo test --lib`: 227 tests pass, including 7 new sweep_tracker tests
- [x] `cargo clippy --all-targets -- -D warnings` clean
- [x] `cargo fmt --all -- --check` clean
- [ ] Smoke-test against a real `verisimdb-data/` produced by an
  assemblyline run with `PANIC_ATTACK_STORE_FINDING_HEXADS=1`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request May 27, 2026
## Summary

Third slice of [#33](#33). Adds a
`panic-attack query` subcommand that evaluates a small S-expression
query language over the per-finding hexads from #55 (S1) joined with the
campaign-state hexads from #56 (S2).

> Stacked on #56: the diff against `main` includes the S1+S2 changes
until those land; this PR rebases cleanly.

## Supported forms (S3 initial)

```scheme
(category UnsafeCode)
(rule-id PA004)
(severity Critical)
(repo <name-substring>)          ; case-insensitive substring
(file <path-substring>)          ; case-insensitive substring
(pr-state pr-filed|pr-merged|pr-closed|dismissed|nil)
(and <expr> <expr> ...)
(or  <expr> <expr> ...)
(not <expr>)
```

`(pr-state nil)` matches any finding **without a campaign hexad**,
i.e. the operationally important "open work not yet PR'd" view that the
estate-sweep campaign needs most.
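To make the semantics concrete, here is a minimal evaluator sketch covering a subset of the forms. `Expr`, `JoinedFinding`, and `eval` are hypothetical names, and exact string matching for `category` is an assumption; the PR only specifies substring matching for `repo` and `file`.

```rust
/// Hypothetical AST for a subset of the query forms.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Category(String),
    /// `None` models `(pr-state nil)`: no campaign hexad for the finding.
    PrState(Option<String>),
    And(Vec<Expr>),
    Or(Vec<Expr>),
    Not(Box<Expr>),
}

/// A finding already joined with its latest campaign state, if any.
struct JoinedFinding {
    category: String,
    pr_state: Option<String>,
}

/// Recursively evaluates an expression against one joined finding.
fn eval(expr: &Expr, finding: &JoinedFinding) -> bool {
    match expr {
        Expr::Category(wanted) => &finding.category == wanted,
        // `pr-state` is an ordinary clause, so it composes inside and/or/not.
        Expr::PrState(wanted) => &finding.pr_state == wanted,
        Expr::And(children) => children.iter().all(|child| eval(child, finding)),
        Expr::Or(children) => children.iter().any(|child| eval(child, finding)),
        Expr::Not(child) => !eval(child, finding),
    }
}
```

Under this sketch, `(and (category UnsafeCode) (pr-state nil))` matches an `UnsafeCode` finding with no campaign hexad and rejects the same finding once a `pr-filed` event exists.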

## CLI

```
panic-attack query "(and (category UnsafeCode) (pr-state nil))"
panic-attack query "(severity Critical)" --format json
panic-attack query "(repo alpha)" --verisimdb-dir verisimdb-data
```

Default output: fixed-width table. JSON via `--format json`.

## Deferred to S3 follow-ups

Three follow-ups will land in the next PRs in this stack:
- `(crosslang :from FFI :to ProofDrift)`: needs integration with
`src/kanren/crosslang.rs`.
- `(diff :since <date> :category <X>)`: needs an explicit
baseline-run cursor.
- `panic-attack campaign poll` (was S2 scope cut): GitHub PR-state
polling.

## Implementation notes

- Small hand-rolled S-expression tokenizer/parser (~170 LOC) — doesn't
depend on the a2ml parser since the query surface is narrower.
- Evaluator pre-joins findings with their latest campaign event
(newest-by-`created_at` wins per `finding_id`) before filtering.
`(pr-state ...)` is a free clause inside `and`/`or` rather than a
special case.
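The pre-join step can be sketched as follows. `CampaignEvent`, `event`, and `latest_event_per_finding` are hypothetical names, and the sketch assumes `created_at` is an ISO-8601 UTC string so that plain string comparison orders events chronologically.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down campaign event.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CampaignEvent {
    finding_id: String,
    /// Assumed ISO-8601 UTC, e.g. "2026-05-27T13:30:00Z".
    created_at: String,
    state: String,
}

/// Convenience constructor used only by this sketch.
fn event(finding_id: &str, created_at: &str, state: &str) -> CampaignEvent {
    CampaignEvent {
        finding_id: finding_id.to_string(),
        created_at: created_at.to_string(),
        state: state.to_string(),
    }
}

/// Keeps only the newest event per finding id.
fn latest_event_per_finding(events: &[CampaignEvent]) -> HashMap<&str, &CampaignEvent> {
    let mut latest: HashMap<&str, &CampaignEvent> = HashMap::new();
    for candidate in events {
        latest
            .entry(candidate.finding_id.as_str())
            .and_modify(|current| {
                // Newest-by-created_at wins; ties keep the first one seen.
                if candidate.created_at > current.created_at {
                    *current = candidate;
                }
            })
            .or_insert(candidate);
    }
    latest
}
```

A finding id absent from the returned map corresponds to the `(pr-state nil)` case.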

## Test plan

- [x] `cargo test --lib`: 239 green (19 new in `src/query/`).
- [x] `cargo clippy --all-targets -- -D warnings`: clean.
- [x] `cargo fmt --all`: clean.
- [x] End-to-end CLI smoke: hand-crafted finding hexad + `campaign
register-pr` + `query (and (category UnsafeCode) (pr-state pr-filed))
--format json` returns the expected JSON with the joined campaign
state.

Refs #33. Stacked on #56 (S2) → #55 (S1).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request May 27, 2026
… follow-up) (#61)

## Summary
- Adds a `HexadSemantic.crosslang: Option<CrosslangSemantic>` facet and a
  `build_crosslang_hexads(...)` helper that drives the kanren
  `CrossLangAnalyzer` per repo (ingest → extract → load_rules → analyze →
  query_interactions) and emits one hexad per derived
  `CrossLangInteraction`.
- New env var `PANIC_ATTACK_STORE_CROSSLANG_HEXADS` (separate from
  `PANIC_ATTACK_STORE_FINDING_HEXADS`) opts a run into emission;
  `persist_assemblyline_report` writes to `<dir>/hexads/crosslang/`
  file-side only.
- Adds `load_crosslang_hexads(base_dir)` so the paired query-evaluator PR
  can match against persisted facts; falls back to empty `Vec` when the
  dir is missing (the evaluator treats that as "use co-occurrence proxy").
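The missing-directory fallback can be sketched like this. It is an illustration under assumptions: `list_crosslang_hexad_files` is a hypothetical helper that only lists candidate files, whereas the real `load_crosslang_hexads` deserializes them, and the `.json` extension filter is a guess.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Lists hexad files under `<base_dir>/hexads/crosslang/`.
/// A missing directory is not an error: it yields an empty Vec, which the
/// evaluator can treat as "no persisted facts, use the proxy".
fn list_crosslang_hexad_files(base_dir: &Path) -> io::Result<Vec<PathBuf>> {
    let dir = base_dir.join("hexads").join("crosslang");
    let entries = match fs::read_dir(&dir) {
        Ok(entries) => entries,
        Err(err) if err.kind() == io::ErrorKind::NotFound => return Ok(Vec::new()),
        // Permission problems and the like are still surfaced to the caller.
        Err(err) => return Err(err),
    };

    let mut files = Vec::new();
    for entry in entries {
        let path = entry?.path();
        // Assumption: hexads are stored as `.json` files.
        if path.extension().map_or(false, |ext| ext == "json") {
            files.push(path);
        }
    }
    // Sort so callers see a deterministic order across platforms.
    files.sort();
    Ok(files)
}
```

Only `NotFound` maps to the empty result; other I/O errors propagate, so a misconfigured store is not silently read as "no facts".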

## Why
Tightens the `(crosslang :from :to)` query from a same-repo co-occurrence
proxy to a true FFI/cross-language reachability check against persisted
kanren-derived facts. PR 1 of a 2-PR stack; PR 2 switches the evaluator
over while preserving fall-back semantics.

## Test plan
- [x] `cargo test --lib` — 252 tests pass, including 4 new
      `storage::tests::*crosslang*` cases (build-empty, build-from-FFI,
      write/read roundtrip + missing-dir, env-var default-off + opt-in).
- [x] `cargo clippy --all-targets -- -D warnings` clean.
- [x] `cargo fmt --all` no diff.

Stacks under: issue #33 S1/S2/S3 PRs (#55, #56, #57, #58). Filed against
`main` per orphan-trap rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath
Owner Author

Superseded — content already absorbed into main via #62 (sweep-tracker PR branched off S2 tip and carried the S1+S2 storage scaffold into main directly). All FindingSemantic / build_finding_hexads / SARIF rule_id pub(crate) changes are present in main as of 09b80f4. Closing as no-op rebase target.

auto-merge was automatically disabled May 27, 2026 13:30

Pull request was closed

