Skip to content

feat(mcp): Ed25519 op-envelope signing for destructive ops (KLA-411)#25

Open
jklaassenjc wants to merge 2 commits into
mainfrom
juergen/kla-411-ed25519-signing
Open

feat(mcp): Ed25519 op-envelope signing for destructive ops (KLA-411)#25
jklaassenjc wants to merge 2 commits into
mainfrom
juergen/kla-411-ed25519-signing

Conversation

@jklaassenjc
Copy link
Copy Markdown
Collaborator

@jklaassenjc jklaassenjc commented Apr 28, 2026

Summary

Closes KLA-411. Direct response to the user-feedback ask in KLA-408:

For critical tools or actions, such as deletion in a production environment, it would be useful to request JIT authz to have a cryptographic proof of the operation so that responsibility can be traced.

Phase 1 (#23, v1.16.0) shipped TTY step-up. This slice adds the cryptographic proof half: every successful destructive MCP op emits a signed manifest to ~/.config/jc/mcp-audit-signed.log, verifiable post-hoc with jc audit verify.

How it works

Step What happens
First destructive op fires Server lazily generates Ed25519 keypair: priv → keychain at service="jc", account "<profile>:signing_key". Pub → config at profiles.<name>.signing_pubkey.
Each successful destructive op Chokepoint in addTypedTool builds a manifest `{tool, args_redacted, target, timestamp, nonce, operator_pubkey}`, signs the canonical JSON, appends to ~/.config/jc/mcp-audit-signed.log (one line per op, mode 0600).
Failed op No manifest — the op didn't happen, the regular audit log records the failure.
Plan mode (no execute=true) No manifest — non-mutating preview.
Verification `jc audit verify` reads the log, checks each signature against the configured pubkey, exits 0 / non-zero with the offending manifest's nonce.

Why pure Go, no cgo

The user can pair this with their preferred step-up authenticator (TTY today, Touch ID in KLA-412, out-of-band approval in KLA-413) without platform constraints. Signing just adds a forensic record on top.

Crypto: `crypto/ed25519` + `crypto/rand` from the stdlib. No third-party dependency. Manifest canonicalization uses the encoding/json deterministic key order (alphabetical) so verifiers can reconstruct the signed bytes exactly.

What ships

New files

  • `internal/mcp/sign.go` — manifestSigner interface, noopSigner, ed25519Signer, VerifyManifestStream(). Pure Go, mutex-protected lazy key load.
  • `internal/mcp/sign_test.go` — 10 unit + 3 integration tests covering signer round-trip, keypair reuse, nonce uniqueness, tamper detection, wrong-pubkey rejection, sensitive-arg redaction, chokepoint correctness on success/failure/plan paths.
  • `internal/cmd/audit.go` — `jc audit verify` subcommand.
  • `internal/cmd/audit_test.go` — 4 CLI tests.

Modified

  • `internal/keychain/keychain.go` — SetSigningKey/GetSigningKey/DeleteSigningKey helpers.
  • `internal/config/config.go` — `mcp.sign_destructive_ops` config, getter, `ValidConfigKeys`, profile-level `signing_pubkey` accessors.
  • `internal/mcp/server.go` — `Options.SignDestructiveOps` + `SigningProfile` + unexported `signer` test seam, wired into NewServer.
  • `internal/mcp/tools.go` — chokepoint emits signed manifest after a successful destructive op.
  • `internal/cmd/mcp.go` — `--sign-destructive` flag.
  • `docs/AUTH.md` — new "Signed audit log" subsection, updated threat-model table to reflect tamper-evident detection on credential-exfiltration and compromised-agent scenarios.

Demo

# Enable signing
jc config set mcp.sign_destructive_ops true

# Or one-off
jc mcp serve --sign-destructive --transport http

# Run a destructive op (from any MCP client)
# … users_delete with execute: true …

# Verify the chain
$ jc audit verify
OK — verified 1 signed manifest(s) for profile "default".

# After tampering
$ sed -i '' 's/alice/bob/' ~/.config/jc/mcp-audit-signed.log
$ jc audit verify
Verified 0 manifest(s) before failure: signature mismatch on manifest #1 (nonce=…, tool=users_delete)
exit 1

Test plan

  • `go build ./...` clean
  • `go test ./...` — full suite passing
  • 10 sign unit tests — noop dispatch, factory; signer round-trip; keypair reuse across multiple ops; nonce uniqueness; verify happy path; tamper detection; wrong-pubkey rejection; sensitive-arg redaction
  • 3 chokepoint integration tests — signs successful destructive ops, skips failed ops, skips plan mode
  • 4 audit-verify CLI tests — OK path, tamper detection exits non-zero, no-pubkey error, command registration under root
  • Flag-defaults test — `--sign-destructive` defaults to false

Defaults

Off. Backwards-compatible with every existing deployment. Operators opt in via `mcp.sign_destructive_ops: true` or `--sign-destructive`.

Follow-ups (separate tickets)

  • KLA-412 — Touch ID step-up authenticator (Phase 2 part 2)
  • KLA-413 — Out-of-band approval (Phase 3)

🤖 Generated with Claude Code


Note

Medium Risk
Adds cryptographic signing and verification around destructive MCP operations, touching keychain storage, config persistence, and the MCP execution chokepoint; bugs could break auditability or cause unexpected signing failures. Defaults are opt-in and failures don’t block ops, reducing operational risk.

Overview
Adds an opt-in signed audit trail for MCP destructive operations: when enabled via mcp.sign_destructive_ops or jc mcp serve --sign-destructive, each successful execute:true destructive tool call appends a per-profile Ed25519-signed JSON manifest to ~/.config/jc/mcp-audit-signed.log (with redacted args, target, timestamp, and nonce).

Introduces jc audit verify to validate the signed log against the profile’s stored public key (or an override key), and extends config/keychain support to lazily generate/store the signing keypair (private key in OS keychain under "<profile>:signing_key", public key persisted to profiles.<name>.signing_pubkey). Documentation and tests are updated to cover the new signing/verification flow and tamper detection behavior.

Reviewed by Cursor Bugbot for commit fdd5d1c. Bugbot is set up for automated code reviews on this repo. Configure here.

Direct response to the user feedback ask in KLA-408:

> For critical tools or actions, such as deletion in a production
> environment, it would be useful to request JIT authz to have a
> cryptographic proof of the operation so that responsibility can be
> traced.

Phase 1 (PR #23, v1.16.0) shipped TTY step-up. This slice adds the
*cryptographic proof* piece — every successful destructive MCP op
emits a signed manifest to a forensic audit log that's hard to forge
without also stealing the per-profile keychain entry.

What ships:

- internal/mcp/sign.go — manifestSigner interface, noopSigner (zero
  cost when disabled), ed25519Signer, VerifyManifestStream() helper.
  Pure Go, no cgo. Thread-safe via sync.Mutex.
- Lazy per-profile keypair generation on first signed op. Private key
  in OS keychain (service="jc", account="<profile>:signing_key");
  pubkey in config (profiles.<name>.signing_pubkey). No plaintext
  fallback — fail closed if keychain unavailable.
- Manifest shape: { tool, args_redacted, target, timestamp, nonce,
  operator_pubkey, signature }. Sensitive params are redacted via the
  existing audit-log redaction pass before signing.
- internal/keychain/keychain.go — SetSigningKey/GetSigningKey/Delete
  helpers, mirroring the client-secret pattern.
- internal/config/config.go — mcp.sign_destructive_ops bool config,
  MCPSignDestructiveOps() getter, ValidConfigKeys + coerceValue.
  SigningPubkey()/SetSigningPubkey() for profile-level pubkey access.
- internal/mcp/tools.go — chokepoint in addTypedTool's wrappedHandler
  emits a signed manifest *after* a destructive op succeeds. Failures
  don't sign (the op didn't happen). Signing errors warn on stderr but
  do not roll back the op (the API call already fired).
- internal/cmd/mcp.go — --sign-destructive flag wired through to
  Options.SignDestructiveOps + SigningProfile.
- internal/cmd/audit.go (new) — `jc audit verify` subcommand with
  --profile / --log / --pubkey flags. Returns OK + count on success;
  exits non-zero with offending manifest's nonce on tamper.
- docs/AUTH.md — new "Signed audit log" section, updated threat-model
  table to reflect tamper-evident detection on credential exfiltration
  and compromised-agent scenarios.

Tests:

- 10 sign unit tests (noop, factory, signer round-trip, keypair reuse
  across ops, nonce uniqueness, verify happy path, tamper detection,
  wrong-pubkey rejection, sensitive-arg redaction)
- 3 chokepoint integration tests (signs successful destructive ops,
  skips failed ops, skips plan mode)
- 4 audit-verify CLI tests (OK path, tamper detection exits non-zero,
  no-pubkey error, command registration)
- Flag-defaults test for --sign-destructive

This closes the original "cryptographic proof" half of the user
feedback. The remaining slices (Touch ID for stdio-mode prompts,
out-of-band approval for dual-control workflows) are KLA-412 and
KLA-413 respectively.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit 965a444. Configure here.

Comment thread internal/mcp/sign.go
Address Cursor Bugbot finding on internal/mcp/sign.go:142.

Before: ensureKeyLoaded treated *any* error from keychainGetSigningKey
as "key missing, regenerate." A transient non-not-found error (locked
keychain, permission denied, corrupted entry) would have caused jc to
generate a fresh keypair and overwrite the existing entry, permanently
severing the verification chain for every previously signed manifest.

After: switch on the error type. Genuine keychain.ErrNotFound (the
expected first-run state) bootstraps. Anything else propagates and
fail-closes the signing attempt — operator sees a clear error,
existing key + audit chain are preserved.

Mechanics:
- internal/keychain/keychain.go: re-export keyring.ErrNotFound as
  keychain.ErrNotFound + IsNotFound helper, so call sites can
  distinguish without importing zalando/go-keyring directly.
- internal/mcp/sign.go: rewrite the get-or-generate switch with
  explicit cases for "found", "empty value w/ no error" (treat as
  bootstrap), "ErrNotFound" (bootstrap), default (propagate).
- internal/mcp/sign_test.go: regression test asserting that a
  transient keychain error neither regenerates nor writes the audit
  log; counterpart test confirming ErrNotFound still bootstraps and
  that the second op reuses the same keypair.

Test-fixture seam updated to return keychain.ErrNotFound instead of
errors.New("not found") so existing tests still exercise the bootstrap
path correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants