MF-H02: bind session key registration to a single owner per kind#739
Conversation
Session key registration accepted any (user_address, session_key) pair without
proving possession of the session key. Combined with GetAppSessionKeyOwner
resolving ownership by MAX(version) over session_key alone, an attacker could
shadow-register a victim's session_key under their own wallet, outpace the
victim's version, and have the victim's session-key signatures attributed to
the attacker (or rejected outright) during quorum verification on app sessions.
Three layers stack to close the issue:
1. UNIQUE (session_key, kind) on current_session_key_states_v1 binds each
session key to one wallet per kind. LockSessionKeyState seeds the pointer
row, locks (session_key, kind) under SELECT ... FOR UPDATE, and surfaces a
generic ErrSessionKeyNotAllowed when the locked row is owned by another
wallet — no constraint-name parsing needed at write time.
2. session_key_sig column added to both history tables and to the RPC + core
types. Both submit handlers require the session-key holder to co-sign the
same packed payload that already binds user_address (app side) or the new
PackChannelSessionKeyOwnershipV1 payload (channel side, which adds
user_address since the existing UserSig pack omits it). Validated on every
submit.
3. Reject submits where user_address == session_key.
The collision rejection path returns "invalid_session_key_state: session_key
not allowed" and logs details server-side; the API never confirms whether a
given session_key is registered to someone else.
Migration runs a pre-flight RAISE EXCEPTION on existing (session_key, kind)
duplicates so cross-wallet collisions surface for ops review before the
constraint adds.
SDK Go and TS gain Sign{App,Channel}SessionKeyOwnership helpers; cerebro
fetches the session-key private key up-front and refuses to register a key it
does not control. ts-compat and example-app updated symmetrically.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds dual-signature session-key ownership across Go/TS SDKs and Nitronode, paginated key-state APIs, DB pointer/locking with per-user caps, ERC20 transfer hardening, CI protocol-drift checks/scripts, and updates examples/docs/tests. ChangesSession-key ownership + pagination + drift guards
Sequence Diagram(s)sequenceDiagram
participant Client (TS/Go)
participant Nitronode API
participant DB
participant Wallet
participant SessionKey
Wallet->>Client (TS/Go): user_sig over packed state
SessionKey->>Client (TS/Go): session_key_sig over ownership payload
Client (TS/Go)->>Nitronode API: submit_*_session_key_state
Nitronode API->>DB: LockSessionKeyState + (new key) CountSessionKeysForUser
Nitronode API->>DB: Store*State + upsert current pointer
Nitronode API-->>Client (TS/Go): ok with pagination on queries
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
- LockSessionKeyState now returns an error instead of (0, nil) when the SELECT FOR UPDATE that follows the seed insert cannot find a row. Silent fallthrough would have let a submit bypass ownership enforcement in the (impossible-but-not-impossible) case of pointer-row disappearance. - Reorder the session key ownership migration to add session_key_sig columns first, then run the duplicate-row pre-flight and the UNIQUE constraint. Constraint failure no longer leaves the schema partially applied if goose is ever run with per-statement (not per-file) commits. - Lock in transform compatibility for empty session_key_sig (rows written before the column existed serialize as "" through the Go RPC layer). - Drop a stale comment in cerebro that referenced an "activate it as the state signer" step removed by the original commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ihsraham
left a comment
There was a problem hiding this comment.
i'd hold approval for now. the new submit path is closer, but the notes below still affect h-02 closure.
Address review feedback that the prior ownership-proof payload shared the same ABI shape as user_sig, leaving a cross-domain replay window. Fold user_address into the metadata hash and have both signatures sign the same PackChannelKeyStateV1 payload — session_key already binds the packed bytes via PackChannelKeyStateV1, so the combined hash + pack binds the signed payload to a single (wallet, session_key) pair on both sides. Also merge ValidateChannelSessionKeyAuthSigV1 + ValidateChannelSessionKeySigV1 into a single ValidateChannelSessionKeyStateV1 since both validates now share the same packed payload and recoverer. - pkg/core: metadata hash takes user_address (not session_key); single validate covers both sigs; delete PackChannelSessionKeyOwnershipV1 - sdk/go, sdk/ts, cerebro, nitronode: callers pass user_address; both signers use PackChannelKeyStateV1 - handler collapses two validate calls into one - tests: replay regression covers both cross-key and cross-wallet substitution Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two reviewer findings on the H-02 migration path: - The `DO $$ ... END $$;` pre-flight block was missing goose's StatementBegin/StatementEnd markers, so goose would split it on `;` and run the unique-constraint ADD before the duplicate-row check. Wrap the block so it stays a single statement. - The new (session_key, kind) unique constraint was only added by the postgres migration; sqlite goes through AutoMigrate and silently allowed multiple pointer rows for the same key/kind under different wallets. Add a matching uniqueIndex gorm tag (same name as the postgres constraint) so both paths converge on the same invariant. Adds a regression test that verifies the constraint at the DB layer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
nksazonov
left a comment
There was a problem hiding this comment.
Clean approach to a hard problem — the pointer-table + (session_key, kind) unique constraint elegantly closes the global-max-version attack, and the co-signature requirement is a solid possession proof. Two non-blocking observations below.
ihsraham
left a comment
There was a problem hiding this comment.
i left a few closure comments inline. the main upgraded path looks much closer, but i would resolve these before approval.
Align docs/api.yaml channel session_key_sig description with the actual payload (PackChannelKeyStateV1(session_key, metadata_hash), where metadata_hash binds user_address) and update the example-app README register/revoke snippets to sign session_key_sig before submit so they do not hit the new validator rejection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The server recovers SessionKeySig as a raw 65-byte EIP-191 signature, but the ownership helpers accepted a broad signer interface (sign.Signer in Go, StateSigner in TS) that includes wrappers prepending protocol-type bytes (ChannelDefaultSigner, ChannelSessionKeyStateSigner, AppSessionWalletSignerV1, ...). Such a caller could produce a submit that the server rejects. Narrow both Go helpers (SignChannelSessionKeyOwnership, SignAppSessionKeyOwnership) to *sign.EthereumMsgSigner and both TS helpers (signChannelSessionKeyOwnership, signAppSessionKeyOwnership) to EthereumMsgSigner. To keep the Go API ergonomic, widen NewEthereumMsgSigner / NewEthereumMsgSignerFromRaw return types from the interface to *EthereumMsgSigner so existing callers continue to satisfy sign.Signer without type-asserting. Refresh the TS public-API drift snapshot for the narrowed signatures. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A pre-MF-H02 binary running the MF-H01 schema (which already creates current_session_key_states_v1 + writes to it) would still happily insert history rows without session_key_sig after this migration adds the nullable column. The new GetAppSessionKeyOwner / GetChannelSessionKeyOwner lookups would then trust those unproven rows as legitimate owners during the rolling-deploy window. Add NOT VALID check constraints on both history tables enforcing that session_key_sig is non-empty. NOT VALID skips the legacy backfill scan so pre-existing rows pass; only future inserts are validated. Old pods during rollout fail closed at the DB layer when they would otherwise write an unproven row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LockSessionKeyState inserts a version=0 placeholder before any signature or version check; that row is intentionally never deleted on failure paths (sig validation, version mismatch, cap exceeded, mid-tx errors). The seed is the (session_key, kind) ownership reservation, not a transient placeholder — once a wallet stakes a claim, no other wallet can ever take that key for that kind. CountSessionKeysForUser already excludes version=0 rows so quota math is unaffected, but the permanence of the ownership bind was non-obvious. Extend the docstring to call this out so future maintainers understand cleanup-on-failure is intentionally absent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The channel session_key handler delegates both signature checks to core.ValidateChannelSessionKeyStateV1 in two lines; the app session handler open-coded the same logic (decode hex, create recoverer, recover, compare for user_sig + session_key_sig) in ~50 lines. Add app.ValidateAppSessionKeyStateV1 mirroring the channel pattern, covering both signatures + the bare session_key_sig non-empty check, and reduce the handler to a single delegating call. The handler test asserts the new error wording for the user_sig mismatch path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
looks close on h-02. the server path now requires the session-key ownership proof and locks i left one approval-blocking compat note because |
…ippets Mirror the main-SDK narrowing on sdk-compat: signChannelSessionKeyOwnership and signAppSessionKeyOwnership now accept EthereumMsgSigner, not StateSigner. Without this, sdk-compat typecheck fails at the inner-client call site and callers can be pointed at a wrapped signer the node rejects. Import is type-only to keep barrel SSR-safe. Update sdk/ts and sdk/go README snippets (app + channel) to construct the raw EthereumMsgSigner directly with a comment explaining why a wrapped StateSigner / sign.Signer is not acceptable here. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ihsraham
left a comment
There was a problem hiding this comment.
approving for h-02. the node-side fix now binds session-key registration to key possession, the compat signer type and snapshot are aligned, and all review threads are resolved.
i would keep legacy rows without session_key_sig as an ops cleanup note, not a blocker for the new-write fix.
- MF-L01: fix(contracts/ChannelHub): cap ERC20 transfer returndata copy to 32 bytes (#726) - MF-H01: fix(nitronode): paginate get_last_key_states endpoints (#724) - MF-I01-I02: fix(contracts): address security audit findings I-01 and I-02 (#728) - MF-C01: rpc: cap inbound WebSocket frame size and rate-limit per connection (#723) - MF-L02: docs(protocol): qualify enforcement guarantee for intent-specific execution paths (#737) - MF-L02-I03-I04_I05: fix(contracts): add more Node trust assumptions and requirements (#738) - MF-M01: backfill state user_sig from on-chain events (#731) - MF-M02: fix(rpc): release Serve wait group on processSink overflow (#732) - Fix SDK acknowledgement before home channel creation (#734) - MF-I06: fix(nitronode): gate escrow transitions on home channel onchain materialization (#730) - MF-M05: fix(nitronode): enforce TLS by default for Postgres (#733) - MF-M07: Unblock receiver states after finalized escrow operations (#735) - MF-M04: feat: provide tooling for and enhance docs on ValidatorRegistered event (#744) - MF-L04: fix(contracts): reject redundant native value (#741) - MF-H02: bind session key registration to a single owner per kind (#739) - MF-I07: fix(contracts): enforce max challenge duration (#752) - MF-M08: fix(rpc): replace Origin label with application_id on connection gauge (#745) - MF-C02: fix(core): add ChannelStatusClosing to gate post-finalize state transitions (#746) - MF-L06: fix(contracts): clear stale challengeExpireAt on cooperative escrow finalization (#754) - MF-I08: docs: document ChannelClosed event orientation ambiguity during abandoned migration (#755) - MF-M09: fix(nitronode): auto-challenge home channel on withheld escrow finalize (#753) - MF-L09: fix(nitronode): validate parsed app session nonce (#751) - MF-L05: docs(contracts): document informational events not guaranteed to emit (#756) - MF-L08: fix(nitronode/api): default get_last_key_states to active-only with include_inactive opt-in (#749) - MF-L10: fix: emit escrowIds array in EscrowDepositsPurged event and handle it in Nitronode (#757)
Summary
UNIQUE (session_key, kind)tocurrent_session_key_states_v1and rejecting collisions with a genericinvalid_session_key_state: session_key not allowed(no ownership leak).session_key_sigco-signature from the session-key holder on every submit, so nobody can register a key they do not control. App side reuses the existing packed payload (already bindsuser_address); channel side gets a newPackChannelSessionKeyOwnershipV1that addsuser_addressto the ownership payload.user_address == session_key).Why
Without these, an attacker could register the victim's session key under the attacker's wallet, outpace the victim's version, and have
GetAppSessionKeyOwner(which resolved ownership byMAX(version)oversession_keyalone) return the attacker for the victim's signatures. Result: the victim's delegated quorum signatures got attributed to the attacker or rejected as non-participant.An attacker without the session-key private key never reaches the pointer lookup — they fail at
session_key_sig does not match session_keyand learn nothing about existing registrations. The genericsession_key not allowedresponse is reachable only when the submitter can produce a valid possession proof, i.e. when they already control the key in question; surfacing existence to that caller is acceptable.Migration
20260508000000_session_key_ownership_constraints.sqlruns a pre-flightRAISE EXCEPTIONon existing(session_key, kind)duplicates before adding the constraint. Any duplicates surface for ops review since they are exploitation evidence under the old behavior.SDK + tooling
sdk/go:SignAppSessionKeyOwnership/SignChannelSessionKeyOwnershiphelpers.sdk/ts:client.signAppSessionKeyOwnership/client.signChannelSessionKeyOwnership+packChannelSessionKeyOwnershipV1.sdk/ts-compat: facade methods exposed.cerebroCLI: requires the session-key private key up-front and refuses to register a key it does not control.example-app: enable + active-disable populatesession_key_sig; arbitrary-key revoke surfaces a friendly error since the example app does not retain other keys' private keys.Known design implication
session_key_sigis required on every submit, including revokes. A wallet that loses the session-key private key cannot revoke its own delegation early — it has to wait forexpires_at. Documented inline in the SDK READMEs and example app.Test plan
go build ./...go vet ./...go test ./...cd sdk/ts && npm run typecheck && npm test && npm run lintcd sdk/ts-compat && npm test && npm run lint🤖 Generated with Claude Code