Skip to content

fix(identity): keep verification_token stable on domain reclaim#167

Open
jiashuoz wants to merge 2 commits into
mainfrom
fix/register-domain-stable-token
Open

fix(identity): keep verification_token stable on domain reclaim#167
jiashuoz wants to merge 2 commits into
mainfrom
fix/register-domain-stable-token

Conversation

@jiashuoz
Copy link
Copy Markdown
Member

Summary

Calling register_domain twice on the same unverified domain silently rotates the e2a-verify= TXT verification token. A caller that already published the records on DNS — which is the only reason to have called register_domain in the first place — is invalidated with no signal. verify_domain then fails with txt: missing and the user has to chase down why their published TXT no longer matches.

Reproduced against prod MCP today:

First  register_domain("foo.example.com") → verification_token: e2a-verify=f2ad6c0c...
Second register_domain("foo.example.com") → verification_token: e2a-verify=271ccc60...

Same row (created_at unchanged), but a new token. The DKIM key was already stable — the same protection just wasn't extended to verification_token.

Fix

In internal/identity/store.go, drop verification_token = EXCLUDED.verification_token from the ON CONFLICT DO UPDATE clause in ClaimOrCreateDomain. The existing token stays put on re-claim, matching how DKIM columns are already treated and matching what the MCP tool description already promises ("Idempotent against an existing-but-unverified row").

The freshly-minted token in Go is harmlessly discarded by Postgres on the conflict path; no restructuring needed.

Test plan

  • New test TestClaimOrCreateDomain_StableOnReclaim asserts VerificationToken, DKIMPublicKey, and CreatedAt are unchanged across a second reclaim
  • go test ./internal/identity/... — all green (23.9s)
  • go test ./internal/agent/... — all green (43.6s, covers the HTTP handler path)
  • Manual reproduction of pre-fix behavior in prod MCP captured above

Out of scope

  • The MCP-layer tool description ("Idempotent against an existing-but-unverified row") is now accurate; no docstring change needed.
  • The user_id = EXCLUDED.user_id reassignment on conflict is left as-is — separate concern, separate PR if we want to revisit.
  • verify_domain returning 412 as a tool error instead of a structured {verified:false, mx:missing, …} payload is also separate.

🤖 Generated with Claude Code

jiashuoz and others added 2 commits May 26, 2026 21:55
ClaimOrCreateDomain minted a fresh verification_token on every call and
the upsert wrote it on conflict, so a second register on an unverified
domain row silently rotated the e2a-verify= TXT value. A caller who had
already published the original TXT (the whole point of the first call)
would then fail verify_domain with no signal that the token had
changed under them.

Drop verification_token from the ON CONFLICT SET clause so the existing
value sticks, mirroring how DKIM is already handled. The newly-minted
token in the Go layer is harmlessly discarded by Postgres on the
re-claim path. created_at and the DKIM key were already stable.
The previous fix stopped verification_token from rotating on re-claim,
but the conflict branch still ran SET user_id = EXCLUDED.user_id on
unverified rows. That left a squatting window: user A registers
foo.example.com and publishes the e2a-verify TXT, user B then calls
register_domain on the same name. user_id flips to user B, the token
is now stable (good for A's lost case, bad here) — and user B can
verify against the TXT record A already published on DNS that A
controls.

Tighten the ON CONFLICT WHERE to AND domains.user_id = \$2 so only the
same user can re-claim an unverified row. Change the SET to a no-op
(SET user_id = domains.user_id) so RETURNING still surfaces the
existing row for the same-user case. A different-user attempt falls
through to the existing SELECT path and returns "domain not available",
matching the verified+different-user behavior.

The only non-test caller (handleRegisterDomain) already maps that
error to 400. No behavior change for legitimate same-user flows.

Add TestClaimOrCreateDomain_CrossUserReclaimRejected covering: user B's
ClaimOrCreateDomain returns an error, user B's LookupDomain still
can't see the row, and user A's row keeps its original user_id and
verification_token.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant