Skip to content

fix: bound three unbounded file reads (resolves self-scan Critical)#51

Merged
hyperpolymath merged 1 commit into
mainfrom
fix/bounded-reads-attestation-cache-classifications
May 26, 2026
Merged

fix: bound three unbounded file reads (resolves self-scan Critical)#51
hyperpolymath merged 1 commit into
mainfrom
fix/bounded-reads-attestation-cache-classifications

Conversation

@hyperpolymath
Copy link
Copy Markdown
Owner

Summary

Self-scanning panic-attack with the latest release surfaced a Critical UnboundedAllocation in attestation/mod.rs::verify_attestation_file — and, on closer inspection, two sibling sites the analyzer's per-file heuristic glossed over. All three were the last residual violations of the codebase-wide invariant that every file read must be capacity-bounded.

Discovery path

Ran the freshly-built release of panic-attack on its own source tree:

./target/release/panic-attack assail . --output /tmp/panic-self-scan.json

Among the 12 findings, one Critical pointed at src/attestation/mod.rs — the heuristic correctly fired because that file has zero other .take(...) calls. The other two sites (src/assemblyline.rs:120, src/assail/mod.rs:230) live in files that DO use .take(...) elsewhere, so the analyzer's per-file all-or-nothing classifier flagged them as bounded overall. Manual triage caught those — separate concern, doesn't need analyzer changes here.

Fixes

All three follow the established pattern of every other read in the codebase (File::open + .take(LIMIT) + .read_to_string).

Site Cap Rationale
src/attestation/mod.rs::verify_attestation_file 16 MiB JSON envelopes embed three hashes plus a small signature; legitimate envelopes < 1 MiB.
src/assemblyline.rs::load_cache_file 256 MiB Fingerprint caches grow with estate size; large enough for multi-thousand-repo rollups.
src/assail/mod.rs::load_user_classifications 4 MiB Hand-edited audit a2ml registries are small.

Verification

  • cargo build --bin panic-attack --features signing,http — clean
  • cargo clippy --all-targets --features signing,http -- -D warnings — clean
  • cargo test --bin panic-attack --features signing,http — 236 passed, 0 failed
  • cargo fmt --check — clean
  • Self-scan before: 12 findings (1 Critical UnboundedAllocation)
  • Self-scan after: 11 findings (Critical resolved; residuals are intentional — examples/vulnerable_program.rs unsafe blocks, test unwraps, etc.)
  • GPG-signed commit

Out of scope

The analyzer's per-file heuristic for read_to_string boundedness is correct-by-default (files that use .take(...) are probably bounded throughout), but the two sites it missed argue for a per-call check. That's a detector improvement, separate from these data-path fixes; would file as a follow-up if the team thinks it's worth it.

🤖 Generated with Claude Code

Self-scanning panic-attack with the latest release surfaced a Critical
UnboundedAllocation in attestation/mod.rs and, on closer inspection, two
sibling sites the analyzer's per-file heuristic glossed over. All three
follow the established pattern of every other read in the codebase
(File::open + .take(LIMIT) + .read_to_string) and were the residual
violations of the "every file read must be capacity-bounded" invariant.

Fixes:

1. src/attestation/mod.rs::verify_attestation_file — was reading the
   attestation envelope with fs::read_to_string. JSON envelopes are
   small (well under 1 MiB legitimately); capped at 16 MiB.

2. src/assemblyline.rs::load_cache_file — was reading the fingerprint
   cache JSON with fs::read_to_string. Caches grow with estate size;
   capped at 256 MiB (large enough for multi-thousand-repo rollups).

3. src/assail/mod.rs::load_user_classifications — was reading the
   user-classification a2ml registry with fs::read_to_string. Hand-
   edited audit files; capped at 4 MiB.

Verification:

- cargo build / clippy --all-targets --features signing,http -D warnings — clean
- cargo test --bin panic-attack --features signing,http — 236 passed, 0 failed
- cargo fmt --check — clean
- self-scan before: 12 findings (1 Critical UnboundedAllocation in attestation)
- self-scan after:  11 findings (Critical resolved; residual findings are
  intentional — examples/vulnerable_program.rs unsafe blocks, test
  unwraps, etc.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath hyperpolymath enabled auto-merge (squash) May 26, 2026 10:22
@github-actions
Copy link
Copy Markdown

🔍 Hypatia Security Scan

Findings: 48 issues detected

Severity Count
🔴 Critical 4
🟠 High 16
🟡 Medium 28

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Nickel file missing SPDX-License-Identifier header (1 occurrences, CWE-1104)",
    "type": "ncl_missing_spdx",
    "file": "/home/runner/work/panic-attack/panic-attack/reports/panic-attack-20260211180017.ncl",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "expect() in hot path (2 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/panic-attack/panic-attack/src/attestation/chain.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (1 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/attestation/evidence.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (1 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/ambush/mod.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (3 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/kanren/strategy.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "unwrap_or(0) with dangerous default (3 occurrences, CWE-754)",
    "type": "unwrap_dangerous_default",
    "file": "/home/runner/work/panic-attack/panic-attack/src/axial/mod.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "expect() in hot path (4 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/panic-attack/panic-attack/src/assail/analyzer.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  },
  {
    "reason": "unwrap() without prior check -- DoS via panic (4 occurrences, CWE-754)",
    "type": "unwrap_without_check",
    "file": "/home/runner/work/panic-attack/panic-attack/benches/scan_bench.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  },
  {
    "reason": "expect() in hot path (2 occurrences, CWE-754)",
    "type": "expect_in_hot_path",
    "file": "/home/runner/work/panic-attack/panic-attack/benches/scan_bench.rs",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

@hyperpolymath hyperpolymath merged commit 0f6ea94 into main May 26, 2026
26 checks passed
@hyperpolymath hyperpolymath deleted the fix/bounded-reads-attestation-cache-classifications branch May 26, 2026 10:25
hyperpolymath added a commit that referenced this pull request May 27, 2026
…ureProtocol

The cross-language InsecureProtocol detector was flagging JSON-LD `@type`,
`@id`, `@context` namespace URIs and JSON-Schema `$schema` identifiers
as if they were configured HTTP endpoints. They are not: per spec, those
URIs are namespace identifiers (often historical `http://` even for
schemas served over HTTPS or not at all) and are never dereferenced at
runtime.

Choice rationale (vs verisimdb / user-classification registry):

- VeriSimDB is storage + query, not a classifier — it cannot pre-empt
  an FP at detection time; it would just persist the FP and need a
  downstream rule.
- The user-classification registry (`audits/assail-classifications.a2ml`)
  is the right tool for per-instance audited TPs (`UnsafeCode in
  zig_bridge.rs §1` etc.), but JSON-LD identifier URIs are a
  CATEGORICAL false-positive class shared by every JSON-LD / JSON-Schema
  consumer in the estate. Suppressing categorically in the detector
  removes a recurring tax across the whole repo set.

Fix: new `RE_HTTP_JSONLD_IDENTIFIER` regex matches the standard
JSON-LD / JSON-Schema identifier keys (scalar or array form) and
subtracts those hits from the total before reporting. Both shapes
are covered:

  {"@type":  "http://..."}
  {"types":  ["http://..."]}
  {"$schema": "http://..."}

Exempted keys: @id, @type, @context, @vocab, @graph (JSON-LD);
id, type, types (common shorthands); $schema, $id, $ref (JSON Schema).

Genuine endpoints remain flagged. A field keyed `"url"`, `"endpoint"`,
`"api_url"` etc. is not in the exempt set, so a real config URL like
`{"url": "http://insecure.example.com"}` still produces a finding.

Test fixtures use a runtime-composed URL (`format!("htt{}p://...","")`)
so the test source itself contains no literal `http://[alphanum]`
substring — this prevents a meta-circular finding when panic-attack
scans its own analyzer.rs.

Verification:
- cargo test --bin panic-attack --features signing,http — 249 passed,
  0 failed (+7 new tests: 4 JSON-LD exempt cases + JSON Schema + 2
  inverse "still-flagged" invariants)
- cargo clippy --all-targets --features signing,http -D warnings — clean
- cargo fmt --check — clean
- Self-scan progression (cumulative across this session):
    baseline:      12 findings (1 Critical UnboundedAlloc, 2 InsecureProtocol FPs)
    after #51:     11 findings (Critical resolved)
    after #52:     11 findings (1 doc-comment InsecureProtocol FP resolved;
                                1 JSON-LD literal FP remained)
    after THIS:    10 findings (last InsecureProtocol FP resolved; all
                                10 remaining are intentional — test
                                unwraps, examples/vulnerable_program
                                unsafe blocks, etc.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath added a commit that referenced this pull request May 27, 2026
…ureProtocol (#53)

## Summary

The cross-language InsecureProtocol detector was flagging JSON-LD
`@type`, `@id`, `@context` namespace URIs and JSON-Schema `$schema`
identifiers as if they were configured HTTP endpoints. They aren't: per
spec, those URIs are namespace identifiers (often historical `http://`
even for schemas served over HTTPS or not at all) and are never
dereferenced at runtime.

**Stacked on #52** (line-comment stripper) — that PR handled doc-comment
FPs; this one handles JSON-LD literal FPs that survive into runtime
code.

## Why a detector enhancement rather than verisimdb or the
user-classification registry

| Approach | Verdict |
|---|---|
| **VeriSimDB** | Storage + query, not a classifier. Cannot pre-empt FP
at detection time — would persist the FP and need a downstream rule.
Wrong layer. |
| **User-classification registry**
(`audits/assail-classifications.a2ml`) | Right tool for per-instance
audited TPs (\"UnsafeCode in `zig_bridge.rs` §1\"). Wrong tool for a
**categorical** FP class shared by every JSON-LD / JSON-Schema consumer
in the estate — would require N entries across N repos. |
| **Detector enhancement** (this PR) | Removes the recurring tax across
every estate repo with a single rule. The FP class is well-defined by
spec, low-risk to suppress. |

## Fix

New `RE_HTTP_JSONLD_IDENTIFIER` regex that matches the standard JSON-LD
/ JSON-Schema identifier keys (scalar or array form) and subtracts those
hits from the InsecureProtocol total before reporting.

| Pattern | Subtracted |
|---|---|
| `\"@type\": \"http://...\"` | ✓ |
| `\"@id\": \"http://...\"` | ✓ |
| `\"@context\": \"http://...\"` | ✓ |
| `\"types\": [\"http://...\"]` | ✓ |
| `\"$schema\": \"http://...\"` | ✓ |
| `\"url\": \"http://...\"` | ✗ — not an identifier key |
| `client.get(\"http://...\")` | ✗ — bare endpoint |

Exempted keys: `@id`, `@type`, `@context`, `@vocab`, `@graph` (JSON-LD);
`id`, `type`, `types` (common shorthands); `$schema`, `$id`, `$ref`
(JSON Schema).

## Regression coverage

7 new tests in `assail::analyzer::tests` (via a shared
`count_http_findings` test helper):

| Test | Asserts |
|---|---|
| `jsonld_at_type_uri_is_exempt` | `{\"@type\": \"http://...\"}` → 0
findings |
| `jsonld_at_id_uri_is_exempt` | `{\"@id\": \"http://...\"}` → 0
findings |
| `jsonld_at_context_uri_is_exempt` | `{\"@context\": \"http://...\"}` →
0 findings |
| `jsonld_types_array_is_exempt` | `{\"types\": [\"http://...\"]}` → 0
findings (exact self-scan repro) |
| `json_schema_dollar_schema_is_exempt` | `{\"$schema\":
\"http://...\"}` → 0 findings |
| `real_endpoint_url_is_still_flagged` | `client.get(\"http://...\")` →
>0 findings (invariant) |
| `endpoint_key_named_url_is_still_flagged` | `{\"url\":
\"http://...\"}` → >0 findings (invariant) |

Test URLs are runtime-composed (`format!(\"htt{}p://...\", \"\")`) so
the source itself contains no literal `http://[alphanum]` substring —
prevents a meta-circular self-scan finding when panic-attack scans its
own `analyzer.rs`.

## Verification

- [x] `cargo test --bin panic-attack --features signing,http` — **249
passed** (+7 new tests)
- [x] `cargo clippy --all-targets --features signing,http -- -D
warnings` — clean
- [x] `cargo fmt --check` — clean
- [x] **Self-scan progression** (cumulative):
- baseline: 12 findings (1 Critical UnboundedAlloc, 2 InsecureProtocol
FPs)
  - after #51: 11 findings (Critical resolved)
- after #52: 11 findings (doc-comment InsecureProtocol FP resolved;
JSON-LD literal FP remained)
  - **after this PR: 10 findings (last InsecureProtocol FP resolved)**

All 10 remaining findings are intentional (test unwraps,
`examples/vulnerable_program.rs` unsafe blocks, scaffold `flake.nix`,
etc.).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant