feat(assail): exempt JSON-LD / JSON-Schema identifier URIs from InsecureProtocol #53

Merged
hyperpolymath merged 1 commit into main from fix/cross-lang-jsonld-aware-detector on May 27, 2026
@hyperpolymath

Summary

The cross-language InsecureProtocol detector was flagging JSON-LD @type, @id, @context namespace URIs and JSON-Schema $schema identifiers as if they were configured HTTP endpoints. They aren't: per spec, those URIs are namespace identifiers (often historical http:// even for schemas served over HTTPS or not at all) and are never dereferenced at runtime.

Stacked on #52 (line-comment stripper) — that PR handled doc-comment FPs; this one handles JSON-LD literal FPs that survive into runtime code.

Why a detector enhancement rather than VeriSimDB or the user-classification registry

| Approach | Verdict |
|---|---|
| VeriSimDB | Storage + query, not a classifier. Cannot pre-empt an FP at detection time; it would persist the FP and need a downstream rule. Wrong layer. |
| User-classification registry (`audits/assail-classifications.a2ml`) | Right tool for per-instance audited TPs ("UnsafeCode in zig_bridge.rs §1"). Wrong tool for a categorical FP class shared by every JSON-LD / JSON-Schema consumer in the estate; it would require N entries across N repos. |
| Detector enhancement (this PR) | Removes the recurring tax across every estate repo with a single rule. The FP class is well-defined by spec and low-risk to suppress. |

Fix

New RE_HTTP_JSONLD_IDENTIFIER regex that matches the standard JSON-LD / JSON-Schema identifier keys (scalar or array form) and subtracts those hits from the InsecureProtocol total before reporting.

| Pattern | Subtracted |
|---|---|
| `"@type": "http://..."` | ✓ |
| `"@id": "http://..."` | ✓ |
| `"@context": "http://..."` | ✓ |
| `"types": ["http://..."]` | ✓ |
| `"$schema": "http://..."` | ✓ |
| `"url": "http://..."` | ✗ (not an identifier key) |
| `client.get("http://...")` | ✗ (bare endpoint) |

Exempted keys: @id, @type, @context, @vocab, @graph (JSON-LD); id, type, types (common shorthands); $schema, $id, $ref (JSON Schema).
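The subtraction described above can be sketched without the real regex. What follows is a minimal, std-only approximation and not the code merged in this PR. It assumes the detector counts every `http://` that is followed by an alphanumeric character, and it replaces `RE_HTTP_JSONLD_IDENTIFIER` with a hand-written backwards scan. The names `EXEMPT_KEYS`, `http_hits`, `key_before`, and `count_reportable` are hypothetical; only the key list comes from the description. In this sketch only the first element of an array value is recognised.

```rust
/// Keys whose string values are identifiers, not endpoints.
/// The list is copied from the PR description.
const EXEMPT_KEYS: [&str; 11] = [
    "@id", "@type", "@context", "@vocab", "@graph", // JSON-LD
    "id", "type", "types",                          // common shorthands
    "$schema", "$id", "$ref",                       // JSON Schema
];

/// Byte offsets of every insecure-scheme hit that is followed by an
/// ASCII alphanumeric character (assumed shape of the base detector).
fn http_hits(src: &str) -> Vec<usize> {
    src.match_indices("http://")
        .filter(|&(i, m)| {
            src[i + m.len()..].starts_with(|c: char| c.is_ascii_alphanumeric())
        })
        .map(|(i, _)| i)
        .collect()
}

/// If the hit is the value of a JSON key (`"KEY": "` or `"KEY": ["`),
/// return that key. This backwards scan stands in for the regex.
fn key_before(src: &str, hit: usize) -> Option<&str> {
    let s = src[..hit].strip_suffix('"')?.trim_end();
    let s = s.strip_suffix('[').unwrap_or(s).trim_end();
    let s = s.strip_suffix(':')?.trim_end();
    let s = s.strip_suffix('"')?;
    let start = s.rfind('"')?;
    Some(&s[start + 1..])
}

/// Total hits minus the hits that sit under an exempt identifier key.
fn count_reportable(src: &str) -> usize {
    let hits = http_hits(src);
    let exempt = hits
        .iter()
        .filter(|&&h| {
            key_before(src, h).map_or(false, |k| EXEMPT_KEYS.iter().any(|e| *e == k))
        })
        .count();
    hits.len() - exempt
}
```

Under these assumptions, a URL keyed by `"@type"` or `"types"` contributes nothing to the reported total, while the same URL keyed by `"url"` or passed to `client.get(...)` is still counted once.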

Regression coverage

7 new tests in assail::analyzer::tests (via a shared count_http_findings test helper):

| Test | Asserts |
|---|---|
| `jsonld_at_type_uri_is_exempt` | `{"@type": "http://..."}` → 0 findings |
| `jsonld_at_id_uri_is_exempt` | `{"@id": "http://..."}` → 0 findings |
| `jsonld_at_context_uri_is_exempt` | `{"@context": "http://..."}` → 0 findings |
| `jsonld_types_array_is_exempt` | `{"types": ["http://..."]}` → 0 findings (exact self-scan repro) |
| `json_schema_dollar_schema_is_exempt` | `{"$schema": "http://..."}` → 0 findings |
| `real_endpoint_url_is_still_flagged` | `client.get("http://...")` → >0 findings (invariant) |
| `endpoint_key_named_url_is_still_flagged` | `{"url": "http://..."}` → >0 findings (invariant) |

Test URLs are runtime-composed (`format!("htt{}p://...", "")`) so the source itself contains no literal `http://[alphanum]` substring. This prevents a meta-circular self-scan finding when panic-attack scans its own analyzer.rs.
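As an illustration of that fixture trick, here is a small sketch. The helper names `insecure_url` and `jsonld_fixture` are hypothetical and are not taken from the PR; only the `format!("htt{}p://...", "")` idea is. The scheme is split around an empty placeholder, so the text of this snippet never contains the scheme directly followed by a host character.

```rust
/// Compose an insecure-scheme URL at runtime. The empty argument fills
/// the `{}` inside the scheme, so the literal scheme never appears whole
/// in the source text.
fn insecure_url(host_and_path: &str) -> String {
    format!("htt{}p://{}", "", host_and_path)
}

/// Wrap the composed URL in a one-key JSON object, e.g. for an `@type`
/// or `url` test case. `{{` and `}}` are escaped braces in `format!`.
fn jsonld_fixture(key: &str, host_and_path: &str) -> String {
    format!("{{\"{}\": \"{}\"}}", key, insecure_url(host_and_path))
}
```

A test would then feed `jsonld_fixture("@type", "schema.org/Thing")` to the detector rather than embedding the JSON literal in the test source.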

Verification

All 10 remaining findings are intentional (test unwraps, examples/vulnerable_program.rs unsafe blocks, scaffold flake.nix, etc.).

🤖 Generated with Claude Code

@hyperpolymath hyperpolymath enabled auto-merge (squash) May 26, 2026 11:09
…ureProtocol

The cross-language InsecureProtocol detector was flagging JSON-LD `@type`,
`@id`, `@context` namespace URIs and JSON-Schema `$schema` identifiers
as if they were configured HTTP endpoints. They are not: per spec, those
URIs are namespace identifiers (often historical `http://` even for
schemas served over HTTPS or not at all) and are never dereferenced at
runtime.

Choice rationale (vs verisimdb / user-classification registry):

- VeriSimDB is storage + query, not a classifier — it cannot pre-empt
  an FP at detection time; it would just persist the FP and need a
  downstream rule.
- The user-classification registry (`audits/assail-classifications.a2ml`)
  is the right tool for per-instance audited TPs (`UnsafeCode in
  zig_bridge.rs §1` etc.), but JSON-LD identifier URIs are a
  CATEGORICAL false-positive class shared by every JSON-LD / JSON-Schema
  consumer in the estate. Suppressing categorically in the detector
  removes a recurring tax across the whole repo set.

Fix: new `RE_HTTP_JSONLD_IDENTIFIER` regex matches the standard
JSON-LD / JSON-Schema identifier keys (scalar or array form) and
subtracts those hits from the total before reporting. Both shapes
are covered:

  {"@type":  "http://..."}
  {"types":  ["http://..."]}
  {"$schema": "http://..."}

Exempted keys: @id, @type, @context, @vocab, @graph (JSON-LD);
id, type, types (common shorthands); $schema, $id, $ref (JSON Schema).

Genuine endpoints remain flagged. A field keyed `"url"`, `"endpoint"`,
`"api_url"` etc. is not in the exempt set, so a real config URL like
`{"url": "http://insecure.example.com"}` still produces a finding.

Test fixtures use a runtime-composed URL (`format!("htt{}p://...","")`)
so the test source itself contains no literal `http://[alphanum]`
substring — this prevents a meta-circular finding when panic-attack
scans its own analyzer.rs.

Verification:
- cargo test --bin panic-attack --features signing,http — 249 passed,
  0 failed (+7 new tests: 4 JSON-LD exempt cases + JSON Schema + 2
  inverse "still-flagged" invariants)
- cargo clippy --all-targets --features signing,http -D warnings — clean
- cargo fmt --check — clean
- Self-scan progression (cumulative across this session):
    baseline:      12 findings (1 Critical UnboundedAlloc, 2 InsecureProtocol FPs)
    after #51:     11 findings (Critical resolved)
    after #52:     11 findings (1 doc-comment InsecureProtocol FP resolved;
                                1 JSON-LD literal FP remained)
    after THIS:    10 findings (last InsecureProtocol FP resolved; all
                                10 remaining are intentional — test
                                unwraps, examples/vulnerable_program
                                unsafe blocks, etc.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@hyperpolymath hyperpolymath force-pushed the fix/cross-lang-jsonld-aware-detector branch from 59e62fd to ec868fd Compare May 27, 2026 12:28
@hyperpolymath hyperpolymath merged commit 016791c into main May 27, 2026
@hyperpolymath hyperpolymath deleted the fix/cross-lang-jsonld-aware-detector branch May 27, 2026 12:28
hyperpolymath added a commit that referenced this pull request May 27, 2026
…ic package-extension pattern) (#71)

## Summary

Adds a small predicate at the `analyze_julia` DCE detection site that
subtracts the Julia package-extension idiom: `*Ext.jl` files and the
conventional `ext/<Name>.jl` directory layout. These files use `eval` /
`Meta.parse` legitimately as part of the language's extension mechanism.

Mirrors the shape of #53 (JSON-LD InsecureProtocol exemption) — small
guard inline at the detector, plus regression tests for the exempt and
non-exempt cases.

## Motivation

`julia-ecosystem#6` logged 209 panic-attack findings of which ~202 were
this exact pattern. Without an exemption, every Julia repo with a single
package extension produces a flood of false positives that drowns out
real findings.

## Changes

- `src/assail/analyzer.rs` (analyze_julia): guard adds an `is_julia_package_extension` check matching `*Ext.jl` / `ext/` / `/ext/` paths before pushing the `DynamicCodeExecution` WeakPoint.
- New tests in `#[cfg(test)] mod tests`:
  - `julia_ext_jl_dce_is_exempt`: `FooExt.jl` with `Meta.parse` is exempt
  - `julia_ext_dir_dce_is_exempt`: `ext/MyExtension.jl` with `eval` is exempt
  - `julia_regular_file_still_flags_eval`: non-extension file still flags
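For illustration, a path predicate of the kind described in the list above might look like the sketch below. The function name comes from the commit message, but the signature (a `&str` path) and the body are assumptions; the actual guard in `analyze_julia` may differ.

```rust
/// Sketch: true when the path looks like a Julia package extension,
/// i.e. a file named `*Ext.jl` or anything under an `ext/` directory.
/// Assumes forward-slash separators.
fn is_julia_package_extension(path: &str) -> bool {
    // Last path component; `rsplit` always yields at least one item.
    let file_name = path.rsplit('/').next().unwrap_or(path);
    file_name.ends_with("Ext.jl")
        || path.starts_with("ext/")
        || path.contains("/ext/")
}
```

Requiring the slash on both sides in the `/ext/` check keeps directories such as `text/` from matching by accident.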

## Test plan

- [x] `cargo test --lib julia_`: all 3 new tests pass locally
- [ ] CI green
- [ ] Re-scan julia-ecosystem to confirm ~202 findings drop

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions

🔍 Hypatia Security Scan

Findings: 74 issues detected

| Severity | Count |
|---|---|
| 🔴 Critical | 7 |
| 🟠 High | 16 |
| 🟡 Medium | 51 |

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Issue in boj-build.yml",
    "type": "unknown",
    "file": "boj-build.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in cargo-audit.yml",
    "type": "unknown",
    "file": "cargo-audit.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in casket-pages.yml",
    "type": "unknown",
    "file": "casket-pages.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in casket-pages.yml",
    "type": "unknown",
    "file": "casket-pages.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in codeql.yml",
    "type": "unknown",
    "file": "codeql.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in coverage.yml",
    "type": "unknown",
    "file": "coverage.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dependency-review.yml",
    "type": "unknown",
    "file": "dependency-review.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "unknown",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "unknown",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  },
  {
    "reason": "Issue in dogfood-gate.yml",
    "type": "unknown",
    "file": "dogfood-gate.yml",
    "action": "flag",
    "rule_module": "workflow_audit",
    "severity": "medium"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence
