feat(experiments): land claim 1-4 experiments + claim 1-2 CI tests on main #355

Merged: imran-siddique merged 2 commits into main from feat/land-claim1-4-experiments on Jun 26, 2026

@imran-siddique

Contributor

What

Lands the experiments for claims 1-4 and the CI tests for claims 1-2 onto main.

Why

The experiments/README.md and the cMCP paper both reference reproducible run.py scripts for all six claims, but on main only the claim 5 and claim 6 experiments existed. The claim 1-4 work was merged via #345 into feat/cloud-deploy-tutorials (the wrong base) and never reached main, so 4 of the 7 documented reproduction commands failed with "No such file". This PR restores the reproducibility claim.

Contents

  • experiments/claim1-policy-hash-binding/
  • experiments/claim2-session-vs-call-policy/
  • experiments/claim2-false-positive-rate/
  • experiments/claim3-rug-pull-detection/
  • experiments/claim4-trace-claim-nonce/
  • tests/unit/test_claim1_hash_binding.py (6 tests)
  • tests/unit/test_claim2_session_gap.py (6 tests)

CI tests for claims 3 and 4 remain tracked in #353 and #354 (unchanged; the README CI table already reflects this).

Verification

In a clean venv against main's cmcp_runtime:

  • All 7 experiments/*/run.py scripts exit 0
  • pytest on claims 1, 2, 5, 6: 30 passed
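
The exit-code check above could be scripted roughly as follows. This is a minimal sketch, not the procedure actually used for this PR: the helper name `run_experiments` is hypothetical, and it assumes each run.py takes no arguments and can be launched from its own directory.

```python
import subprocess
import sys
from pathlib import Path


def run_experiments(root: Path) -> dict[str, int]:
    """Run every <root>/*/run.py and return {experiment directory name: exit code}.

    Hypothetical helper; assumes each run.py needs no command-line arguments.
    """
    results: dict[str, int] = {}
    for script in sorted(root.glob("*/run.py")):
        # Launch with the current interpreter so the active venv is used,
        # and run from the script's own directory in case it uses relative paths.
        proc = subprocess.run(
            [sys.executable, script.name],
            cwd=script.parent,
            capture_output=True,
            text=True,
        )
        results[script.parent.name] = proc.returncode
    return results
```

Calling `run_experiments(Path("experiments"))` from the repository root would return one entry per experiment directory; the check passes when every value is 0.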

🤖 Generated with Claude Code

imran-siddique and others added 2 commits June 26, 2026 11:10
…on main

The README and paper already reference experiments for claims 1-4, but the
dirs only reached feat/cloud-deploy-tutorials via #345 (merged into the wrong
base) and never landed on main. This restores reproducibility: all 7 run.py
scripts in experiments/ now exist on main.

- claim1-policy-hash-binding, claim2-session-vs-call-policy,
  claim2-false-positive-rate, claim3-rug-pull-detection, claim4-trace-claim-nonce
- CI tests for claims 1 and 2 (claims 3 and 4 tracked in #353, #354)

Verified: all 7 experiment scripts exit 0 and 30 claim tests pass against
main's cmcp_runtime in a clean venv.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@imran-siddique imran-siddique merged commit e95e0f7 into main Jun 26, 2026
11 checks passed
@imran-siddique imran-siddique deleted the feat/land-claim1-4-experiments branch June 26, 2026 18:19