fix: bound three unbounded file reads (resolves self-scan Critical)#51
Merged
hyperpolymath merged 1 commit intoMay 26, 2026
Merged
Conversation
Self-scanning panic-attack with the latest release surfaced a Critical UnboundedAllocation in attestation/mod.rs and, on closer inspection, two sibling sites the analyzer's per-file heuristic glossed over. All three follow the established pattern of every other read in the codebase (File::open + .take(LIMIT) + .read_to_string) and were the residual violations of the "every file read must be capacity-bounded" invariant. Fixes: 1. src/attestation/mod.rs::verify_attestation_file — was reading the attestation envelope with fs::read_to_string. JSON envelopes are small (well under 1 MiB legitimately); capped at 16 MiB. 2. src/assemblyline.rs::load_cache_file — was reading the fingerprint cache JSON with fs::read_to_string. Caches grow with estate size; capped at 256 MiB (large enough for multi-thousand-repo rollups). 3. src/assail/mod.rs::load_user_classifications — was reading the user-classification a2ml registry with fs::read_to_string. Hand- edited audit files; capped at 4 MiB. Verification: - cargo build / clippy --all-targets --features signing,http -D warnings — clean - cargo test --bin panic-attack --features signing,http — 236 passed, 0 failed - cargo fmt --check — clean - self-scan before: 12 findings (1 Critical UnboundedAllocation in attestation) - self-scan after: 11 findings (Critical resolved; residual findings are intentional — examples/vulnerable_program.rs unsafe blocks, test unwraps, etc.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔍 Hypatia Security ScanFindings: 48 issues detected
View findings[
{
"reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Nickel file missing SPDX-License-Identifier header (1 occurrences, CWE-1104)",
"type": "ncl_missing_spdx",
"file": "/home/runner/work/panic-attack/panic-attack/reports/panic-attack-20260211180017.ncl",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "expect() in hot path (2 occurrences, CWE-754)",
"type": "expect_in_hot_path",
"file": "/home/runner/work/panic-attack/panic-attack/src/attestation/chain.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "unwrap_or(0) with dangerous default (1 occurrences, CWE-754)",
"type": "unwrap_dangerous_default",
"file": "/home/runner/work/panic-attack/panic-attack/src/attestation/evidence.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "critical"
},
{
"reason": "unwrap_or(0) with dangerous default (1 occurrences, CWE-754)",
"type": "unwrap_dangerous_default",
"file": "/home/runner/work/panic-attack/panic-attack/src/ambush/mod.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "critical"
},
{
"reason": "unwrap_or(0) with dangerous default (3 occurrences, CWE-754)",
"type": "unwrap_dangerous_default",
"file": "/home/runner/work/panic-attack/panic-attack/src/kanren/strategy.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "critical"
},
{
"reason": "unwrap_or(0) with dangerous default (3 occurrences, CWE-754)",
"type": "unwrap_dangerous_default",
"file": "/home/runner/work/panic-attack/panic-attack/src/axial/mod.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "critical"
},
{
"reason": "expect() in hot path (4 occurrences, CWE-754)",
"type": "expect_in_hot_path",
"file": "/home/runner/work/panic-attack/panic-attack/src/assail/analyzer.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
},
{
"reason": "unwrap() without prior check -- DoS via panic (4 occurrences, CWE-754)",
"type": "unwrap_without_check",
"file": "/home/runner/work/panic-attack/panic-attack/benches/scan_bench.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "high"
},
{
"reason": "expect() in hot path (2 occurrences, CWE-754)",
"type": "expect_in_hot_path",
"file": "/home/runner/work/panic-attack/panic-attack/benches/scan_bench.rs",
"action": "flag",
"rule_module": "code_safety",
"severity": "medium"
}
]Powered by Hypatia Neurosymbolic CI/CD Intelligence |
4 tasks
hyperpolymath
added a commit
that referenced
this pull request
May 27, 2026
…ureProtocol
The cross-language InsecureProtocol detector was flagging JSON-LD `@type`,
`@id`, `@context` namespace URIs and JSON-Schema `$schema` identifiers
as if they were configured HTTP endpoints. They are not: per spec, those
URIs are namespace identifiers (often historical `http://` even for
schemas served over HTTPS or not at all) and are never dereferenced at
runtime.
Choice rationale (vs verisimdb / user-classification registry):
- VeriSimDB is storage + query, not a classifier — it cannot pre-empt
an FP at detection time; it would just persist the FP and need a
downstream rule.
- The user-classification registry (`audits/assail-classifications.a2ml`)
is the right tool for per-instance audited TPs (`UnsafeCode in
zig_bridge.rs §1` etc.), but JSON-LD identifier URIs are a
CATEGORICAL false-positive class shared by every JSON-LD / JSON-Schema
consumer in the estate. Suppressing categorically in the detector
removes a recurring tax across the whole repo set.
Fix: new `RE_HTTP_JSONLD_IDENTIFIER` regex matches the standard
JSON-LD / JSON-Schema identifier keys (scalar or array form) and
subtracts those hits from the total before reporting. Both shapes
are covered:
{"@type": "http://..."}
{"types": ["http://..."]}
{"$schema": "http://..."}
Exempted keys: @id, @type, @context, @vocab, @graph (JSON-LD);
id, type, types (common shorthands); $schema, $id, $ref (JSON Schema).
Genuine endpoints remain flagged. A field keyed `"url"`, `"endpoint"`,
`"api_url"` etc. is not in the exempt set, so a real config URL like
`{"url": "http://insecure.example.com"}` still produces a finding.
Test fixtures use a runtime-composed URL (`format!("htt{}p://...","")`)
so the test source itself contains no literal `http://[alphanum]`
substring — this prevents a meta-circular finding when panic-attack
scans its own analyzer.rs.
Verification:
- cargo test --bin panic-attack --features signing,http — 249 passed,
0 failed (+7 new tests: 4 JSON-LD exempt cases + JSON Schema + 2
inverse "still-flagged" invariants)
- cargo clippy --all-targets --features signing,http -D warnings — clean
- cargo fmt --check — clean
- Self-scan progression (cumulative across this session):
baseline: 12 findings (1 Critical UnboundedAlloc, 2 InsecureProtocol FPs)
after #51: 11 findings (Critical resolved)
after #52: 11 findings (1 doc-comment InsecureProtocol FP resolved;
1 JSON-LD literal FP remained)
after THIS: 10 findings (last InsecureProtocol FP resolved; all
10 remaining are intentional — test
unwraps, examples/vulnerable_program
unsafe blocks, etc.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hyperpolymath
added a commit
that referenced
this pull request
May 27, 2026
…ureProtocol (#53) ## Summary The cross-language InsecureProtocol detector was flagging JSON-LD `@type`, `@id`, `@context` namespace URIs and JSON-Schema `$schema` identifiers as if they were configured HTTP endpoints. They aren't: per spec, those URIs are namespace identifiers (often historical `http://` even for schemas served over HTTPS or not at all) and are never dereferenced at runtime. **Stacked on #52** (line-comment stripper) — that PR handled doc-comment FPs; this one handles JSON-LD literal FPs that survive into runtime code. ## Why a detector enhancement rather than verisimdb or the user-classification registry | Approach | Verdict | |---|---| | **VeriSimDB** | Storage + query, not a classifier. Cannot pre-empt FP at detection time — would persist the FP and need a downstream rule. Wrong layer. | | **User-classification registry** (`audits/assail-classifications.a2ml`) | Right tool for per-instance audited TPs (\"UnsafeCode in `zig_bridge.rs` §1\"). Wrong tool for a **categorical** FP class shared by every JSON-LD / JSON-Schema consumer in the estate — would require N entries across N repos. | | **Detector enhancement** (this PR) | Removes the recurring tax across every estate repo with a single rule. The FP class is well-defined by spec, low-risk to suppress. | ## Fix New `RE_HTTP_JSONLD_IDENTIFIER` regex that matches the standard JSON-LD / JSON-Schema identifier keys (scalar or array form) and subtracts those hits from the InsecureProtocol total before reporting. | Pattern | Subtracted | |---|---| | `\"@type\": \"http://...\"` | ✓ | | `\"@id\": \"http://...\"` | ✓ | | `\"@context\": \"http://...\"` | ✓ | | `\"types\": [\"http://...\"]` | ✓ | | `\"$schema\": \"http://...\"` | ✓ | | `\"url\": \"http://...\"` | ✗ — not an identifier key | | `client.get(\"http://...\")` | ✗ — bare endpoint | Exempted keys: `@id`, `@type`, `@context`, `@vocab`, `@graph` (JSON-LD); `id`, `type`, `types` (common shorthands); `$schema`, `$id`, `$ref` (JSON Schema). ## Regression coverage 7 new tests in `assail::analyzer::tests` (via a shared `count_http_findings` test helper): | Test | Asserts | |---|---| | `jsonld_at_type_uri_is_exempt` | `{\"@type\": \"http://...\"}` → 0 findings | | `jsonld_at_id_uri_is_exempt` | `{\"@id\": \"http://...\"}` → 0 findings | | `jsonld_at_context_uri_is_exempt` | `{\"@context\": \"http://...\"}` → 0 findings | | `jsonld_types_array_is_exempt` | `{\"types\": [\"http://...\"]}` → 0 findings (exact self-scan repro) | | `json_schema_dollar_schema_is_exempt` | `{\"$schema\": \"http://...\"}` → 0 findings | | `real_endpoint_url_is_still_flagged` | `client.get(\"http://...\")` → >0 findings (invariant) | | `endpoint_key_named_url_is_still_flagged` | `{\"url\": \"http://...\"}` → >0 findings (invariant) | Test URLs are runtime-composed (`format!(\"htt{}p://...\", \"\")`) so the source itself contains no literal `http://[alphanum]` substring — prevents a meta-circular self-scan finding when panic-attack scans its own `analyzer.rs`. ## Verification - [x] `cargo test --bin panic-attack --features signing,http` — **249 passed** (+7 new tests) - [x] `cargo clippy --all-targets --features signing,http -- -D warnings` — clean - [x] `cargo fmt --check` — clean - [x] **Self-scan progression** (cumulative): - baseline: 12 findings (1 Critical UnboundedAlloc, 2 InsecureProtocol FPs) - after #51: 11 findings (Critical resolved) - after #52: 11 findings (doc-comment InsecureProtocol FP resolved; JSON-LD literal FP remained) - **after this PR: 10 findings (last InsecureProtocol FP resolved)** All 10 remaining findings are intentional (test unwraps, `examples/vulnerable_program.rs` unsafe blocks, scaffold `flake.nix`, etc.). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Self-scanning panic-attack with the latest release surfaced a Critical UnboundedAllocation in
attestation/mod.rs::verify_attestation_file— and, on closer inspection, two sibling sites the analyzer's per-file heuristic glossed over. All three were the last residual violations of the codebase-wide invariant that every file read must be capacity-bounded.Discovery path
Ran the freshly-built release of panic-attack on its own source tree:
Among the 12 findings, one Critical pointed at
src/attestation/mod.rs— the heuristic correctly fired because that file has zero other.take(...)calls. The other two sites (src/assemblyline.rs:120,src/assail/mod.rs:230) live in files that DO use.take(...)elsewhere, so the analyzer's per-file all-or-nothing classifier flagged them as bounded overall. Manual triage caught those — separate concern, doesn't need analyzer changes here.Fixes
All three follow the established pattern of every other read in the codebase (
File::open+.take(LIMIT)+.read_to_string).src/attestation/mod.rs::verify_attestation_filesrc/assemblyline.rs::load_cache_filesrc/assail/mod.rs::load_user_classificationsVerification
cargo build --bin panic-attack --features signing,http— cleancargo clippy --all-targets --features signing,http -- -D warnings— cleancargo test --bin panic-attack --features signing,http— 236 passed, 0 failedcargo fmt --check— cleanexamples/vulnerable_program.rsunsafe blocks, test unwraps, etc.)Out of scope
The analyzer's per-file heuristic for
read_to_stringboundedness is correct-by-default (files that use.take(...)are probably bounded throughout), but the two sites it missed argue for a per-call check. That's a detector improvement, separate from these data-path fixes; would file as a follow-up if the team thinks it's worth it.🤖 Generated with Claude Code