Skip to content

test(claims): CI coverage for claim 3 rug-pull + claim 4 nonce binding#356

Merged
imran-siddique merged 1 commit into
mainfrom
feat/claim3-4-ci-tests
Jun 26, 2026
Merged

test(claims): CI coverage for claim 3 rug-pull + claim 4 nonce binding#356
imran-siddique merged 1 commit into
mainfrom
feat/claim3-4-ci-tests

Conversation

@imran-siddique

Copy link
Copy Markdown
Contributor

What

Adds the pytest coverage for claim 3 (rug-pull detection) and claim 4 (TRACE Claim nonce binding) that the README CI table and the paper's evaluation section flagged as pending.

Closes #353, closes #354.

Tests

tests/unit/test_claim3_rug_pull_detection.py (6):

  • definition + catalog hash determinism
  • avalanche (>64/256 bits) on a one-sentence description tamper
  • aggregate catalog hash changes on a single-tool tamper
  • CatalogHashMismatch raised fail-closed when a tampered catalog is loaded under the approved pinned hash
  • approved catalog passes its own pinned hash
  • tamper is undetectable without pinning (the malicious sentence loads; only the pin turns it into a block)

tests/unit/test_claim4_trace_claim_nonce.py (6):

  • nonce determinism
  • session binding (nonce changes with session_id)
  • instance binding (nonce changes with TEE key)
  • a session-A claim's nonce does not match session-B's expected nonce
  • session_id tamper breaks the Ed25519 signature
  • audit-entry removal breaks the export signature

Patterned after the existing test_claim1_hash_binding.py / test_claim2_session_gap.py suites and the claim 3/4 experiment scripts.

Verification

In a clean venv against main: ruff check src/ tests/ clean, pytest on claims 3+4 → 12 passed.

🤖 Generated with Claude Code

…binding

Closes the gaps tracked in #353 and #354. 6 tests each, patterned after the
claim 1/2 suites and asserting the same invariants their experiments demonstrate:

- claim 3: definition/catalog hash determinism, avalanche on description tamper,
  catalog hash propagation, fail-closed CatalogHashMismatch under a pinned hash
- claim 4: nonce determinism, session + instance binding, session-id tamper breaks
  the Ed25519 signature, audit-entry removal breaks the export signature

Updates experiments/README.md CI table. Verified: ruff clean, 12 tests pass.
@imran-siddique imran-siddique merged commit 2c3f594 into main Jun 26, 2026
12 checks passed
@imran-siddique imran-siddique deleted the feat/claim3-4-ci-tests branch June 26, 2026 18:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test(claim4): pytest coverage for TRACE Claim nonce binding properties test(claim3): pytest coverage for rug-pull detection properties

1 participant