docs(ongoing): design doc for verifiable email recovery #3836
Open
sea-snake wants to merge 42 commits into
Adds docs/ongoing/email-recovery.md covering the production design that should supersede PoC PR dfinity#3760:
- DKIM hardening: vetted parser, trusted-body retention, byte-exact canonicalization (gateway contract change), tag enforcement.
- DMARC alignment with PSL-based organizational-domain matching.
- DNSSEC-validated DNS records as canister call arguments, replacing DoH HTTP outcalls in the recovery hot path.
- Email recovery as a first-class authn method alongside the existing recovery phrase and recovery device, with setup and recovery flows, Candid surface, storage model, and rollout plan.
sea-snake
commented
May 4, 2026
Pull request overview
Adds a draft design document describing a production architecture for verifiable email-based recovery in Internet Identity, intended to supersede the DKIM Postbox PoC in #3760.
Changes:
- Introduces a comprehensive design for DKIM hardening (library-based verification, raw header bytes contract, `l=` handling, tag enforcement).
- Specifies DMARC alignment and a DNSSEC-proof-bundle approach to avoid HTTP outcalls during verification.
- Proposes an email recovery registration/recovery flow and corresponding (draft) Candid surface.
- Drop all Postbox/inbox content. Email recovery does not store inbound mail; verification is in-flight only. Postbox is out of scope for this design.
- Drop "Recovery device" references: that surface is no longer exposed to end users; only recovery phrase remains today.
- Drop the salted hash + lookup-hint scheme. The lookup index is keyed by lowercased address directly; enumeration is gated by DKIM (§3.1). Recovery flow no longer pre-leaks the address to the canister; the anchor is derived from the verified From:.
- Drop the @dfinity/dnssec-bundle TS library / WASM helper. The FE assembles bundles directly from DoH in plain TypeScript.
- Manage page: rename to "Recovery methods" with phrase + email cards, point at the actual svelte routes; FLOWS.mdx is stale and should be ignored.
- DNSSEC root anchor moves from bundled-in-WASM to a deploy/upgrade arg (II is deployed weekly). Note that the IANA root KSK rolls about once a decade.
- Tighten the DNSSEC verification pseudocode to be explicit that the trust anchor is a DS digest, not a DNSKEY.
- Rename EmailRecoveryRegisterArg -> EmailRecoveryProof since the same shape powers both registration and recovery delegation.
- Add the missing prepare_recovery / get_delegation Candid entries.
- Clarify l= handling: verification-time only, search nonce in the canonicalized signed prefix; no decanonicalization.
- Replace the bare "3137585324" reference with a link to the PoC PR comment thread it pointed at.
- Open questions: aliases don't collapse, lost-mailbox out of scope, multi-anchor-per-address forbidden, IDN settled.
Restructure §8 around the user-visible flow:
1. User enters email
2. FE submits {address + DNSSEC records}, canister returns challenge
3. User emails the magic token
4. Canister verifies and finalizes (setup) or issues delegation (recovery)
Splits the canister API into prepare_* (validates DNS, caches the
verified DKIM pubkey + DMARC policy) and finalize_* (verifies the
email signature against the cached pubkey, no DNS work). Heavy
DNSSEC validation happens once, up front, before the user has to
do anything irreversible. Final call payload is small.
Updated mermaid diagrams in §8.4 (setup) and §8.5 (recovery) to
show the prepare/email/finalize structure with explicit step
numbers.
Added §8.6 with ASCII screen mockups: Recovery methods card on
Manage, the setup wizard's three steps, the recovery sign-in
picker, and the error state.
Renamed Candid methods for symmetry: prepare_register /
finalize_register and prepare_recover / finalize_recover.
Replaced the old EmailRecoveryProof shape with a DNS-only
prepare arg (EmailRecoveryDnsProof) and a small finalize arg.
…ster

Reshape the flow so the SMTP gateway forwards each received email directly to the canister via email_recovery_deliver, and the FE polls the canister (not the gateway) for the verification outcome. The previous draft had the FE pulling the raw email from the gateway and re-uploading it to the canister via finalize_*; besides being a multi-KB upload from the browser for no reason, it didn't match the actual intended flow.

New shape:
  prepare_register(anchor, addr, dns_proof) → challenge
  prepare_recover(addr, dns_proof, session_pk) → challenge
  deliver(challenge_id, raw_email) ← called by gateway
  status(challenge_id) → Pending | Succeeded* | Failed | Expired
  get_delegation(challenge_id, session_key, expiration) → SignedDelegation

The session_pk for recovery is now passed in at prepare time and parked in the pending-challenge entry; the eventual delegation is bound to that key when deliver runs. The FE generates the keypair locally before the prepare call.

Also rewrote §4 high-level architecture mermaid to match the new flow, fixed the §8.5 mermaid parse error (the `NOTE:` keyword inside a self-loop message was being interpreted as a separate mermaid Note statement), and dropped the per-IP rate limit table row: II doesn't track per-IP state and that's out of scope here.
Three concrete shape changes:
1. Rename methods to match OpenID convention; restore smtp_request.
- email_recovery_prepare_register → email_recovery_credential_prepare_add
- email_recovery_prepare_recover → email_recovery_prepare_delegation
- email_recovery_remove → email_recovery_credential_remove
- email_recovery_deliver → smtp_request (PoC's name; the
gateway-protocol method stays as-is, signature unchanged from dfinity#3760)
Status variants renamed from SucceededRegister/SucceededRecover to
RegistrationSucceeded/RecoveryReady.
2. Drop challenge_id; the nonce is the unique identifier.
- The FE polls and the canister-side challenge map are both keyed by
the nonce (a human-typeable token like II-Recovery-A1B2C3D4).
- Recipient mailbox is now a static string per kind: register@id.ai
and recover@id.ai. No per-challenge id in the address.
- Canister identifies the pending challenge by extracting the nonce
from the canonicalized signed body of the inbound email.
3. Drop the multi-selector hedge.
- A DKIM-signed email carries one signature for one selector. The FE
looks up the active selector for the user's email provider in a
small built-in map (gmail.com → 20230601, outlook.com → selector1,
etc.), fetches the DKIM TXT record for that one selector + DMARC +
DNSSEC chain, and ships only that.
- SelectorMismatch error variant covers the rare case where the
provider rotated between prepare and send.
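The nonce lookup in item 2 can be sketched roughly as follows. This is a minimal illustration, not the canister's code: the `II-Recovery-` prefix and the eight-character uppercase/digit suffix are assumptions inferred from the `II-Recovery-A1B2C3D4` example, and `extract_nonce` is a hypothetical helper name. It assumes the caller passes only the canonicalized body bytes that the DKIM signature covers.

```rust
/// Assumed token shape, inferred from the `II-Recovery-A1B2C3D4` example:
/// a fixed prefix followed by eight ASCII uppercase letters or digits.
const NONCE_PREFIX: &str = "II-Recovery-";
const NONCE_SUFFIX_LEN: usize = 8;

/// Returns the first well-formed nonce in the signed body, if any.
/// The caller is expected to pass the canonicalized, signature-covered body.
pub fn extract_nonce(signed_body: &str) -> Option<&str> {
    let bytes = signed_body.as_bytes();
    let mut search_from = 0;
    while let Some(offset) = signed_body[search_from..].find(NONCE_PREFIX) {
        let start = search_from + offset;
        let suffix_start = start + NONCE_PREFIX.len();
        let suffix_end = suffix_start + NONCE_SUFFIX_LEN;
        if suffix_end <= bytes.len() {
            let suffix = &bytes[suffix_start..suffix_end];
            let well_formed = suffix
                .iter()
                .all(|b| b.is_ascii_uppercase() || b.is_ascii_digit());
            // Refuse to cut a longer alphanumeric run down to a matching prefix.
            let terminated = bytes
                .get(suffix_end)
                .map_or(true, |b| !b.is_ascii_alphanumeric());
            if well_formed && terminated {
                return Some(&signed_body[start..suffix_end]);
            }
        }
        // Keep scanning after this (rejected) occurrence of the prefix.
        search_from = suffix_start;
    }
    None
}
```

The returned slice would then serve as the key into the pending-challenge map; a body with no well-formed token yields `None` and the email is simply not matched to any challenge.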
The FE no longer ships a hardcoded `provider → selector` map. Instead it ships a small list of common selector names (selector1, selector2, default, dkim, k1, current-year date strings, etc.) and probes `<candidate>._domainkey.<domain>` via DoH in parallel. Whichever names return a valid DKIM TXT record are the active selectors. This:
- removes the maintenance burden of keeping a per-provider selector map current,
- adapts automatically to selector rotations and new providers as long as their naming follows common patterns,
- requires the same DoH plumbing the FE already uses for DNSSEC bundle assembly.

Updated §4, §8.1, §8.3, §8.4, and §8.5 to describe the probe-then-fetch flow and inserted a new "discover selector" mermaid step ahead of the bundle-assembly step in both setup and recovery.
Two diagram issues:
- Mermaid sequence diagrams treat `;` in message bodies as a statement
separator, which broke the long self-message verification descriptions
in §8.4 and §8.5 ("got NEWLINE, expecting arrow"). Replaced `;` with
`,` (and a couple of `:` characters that were similarly ambiguous).
- §4 high-level architecture rewritten as a sequenceDiagram instead of
flowchart LR. Sequence diagrams convey the temporal order of the call
flow more directly than a left-to-right flowchart, which matches how
the rest of §8 already presents the setup and recovery flows.
The `&lt;candidate&gt;` HTML entities in §8.4 and §8.5 sequence-diagram
messages were being decoded to literal `<` and `>`, which mermaid's
sequence-diagram tokenizer then treated as the start of an arrow
(`->>`, `-->>`, etc.). The parser failed with "expecting arrow, got
NEWLINE" partway through the message.
Replaced the placeholder syntax with plain prose
("candidate._domainkey.gmail.com for candidates selector1, …") and
reworded the response message to avoid the same trap.
Verified locally with `mmdc` (mermaid CLI) — all three diagrams (§4,
§8.4, §8.5) now render to SVG without parse errors.
…adata IdentityInfo.metadata is legacy. The frontend already has a feature-flag infrastructure at src/frontend/src/lib/state/featureFlags.ts using createFeatureFlagStore(...) — same pattern as DISCOVERABLE_PASSKEY_FLOW, GUIDED_UPGRADE, etc. Values persist in localStorage and can be flipped from the browser console via window.__featureFlags.<NAME>.set(true). Phase 1 registers an EMAIL_RECOVERY flag with default `false`; phase 2 flips the default to `true`.
The trust anchor is public IANA data and the live value is already recoverable from the canister's last upgrade arg via the IC management canister. A bespoke /.well-known endpoint adds surface area without adding any verifiable signal.
This was referenced May 6, 2026
Closed
Restructure the doc around the new flow we landed on in review:
- Pre-email: FE submits a *skeleton* DnsProofBundle covering the
DNSSEC chain + (optional) DMARC leaf. No DKIM leaf — the active
selector lives only inside the eventual email's DKIM-Signature
header, so the FE can't fetch it yet.
- Post-email: canister parses `s=`, verifies `bh=`, drops the body,
and stashes a ~500 B partial-verification record (headers digest,
signature blob, selector, signing domain, from_domain, claimed
address). Status flips to `NeedDkimLeaf { selector }`.
- FE polls, sees the selector, walks DNSSEC for that one leaf, and
submits via a new `email_recovery_submit_dkim_leaf` method. Canister
validates the leaf against the cached chain, completes the DKIM
signature check, and finalises.
Eliminates the FE-side `SELECTOR_CANDIDATES` probe entirely — the
selector is authoritative from the email itself. Section 4 architecture
diagram, §7.4 bundle assembly, §7.6 path-comparison table, §8.1 user
flow, §8.2 storage (Vec model + IdentityInfo surfacing), §8.3 candid
surface, §8.4/§8.5 sequence diagrams, error-UX table all updated to
match.
Also documents storage-model widenings already in code: anchor
`email_recovery` is now `Option<Vec<...>>` on the storable form (so
multi-credential lifts to a pure API change), `RecoveryReady` carries
`anchor_number` so the FE seeds its auth store directly, and
`check_authorization` recognises email-recovery delegation principals
as a third authn-method kind alongside device + OpenID.
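The two-phase flow in this commit message reads naturally as a small state machine. The sketch below is illustrative only: `NeedDkimLeaf { selector }` and the partial-verification field list come from the text above, while the type names `Challenge` and `Status`, the 32-byte digest width, and the collapsed `Succeeded`/`Failed` terminal states are assumptions made for brevity.

```rust
/// Partial-verification record kept after the body is dropped.
/// Field names and types are assumptions based on the list above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PartialVerification {
    pub headers_digest: [u8; 32], // assumed SHA-256 width
    pub signature: Vec<u8>,
    pub selector: String,
    pub signing_domain: String,
    pub from_domain: String,
    pub claimed_address: String,
}

/// Canister-side state of one pending challenge (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Challenge {
    Pending,
    NeedDkimLeaf(PartialVerification),
    Succeeded,
    Failed,
}

/// What the polling FE sees; only the selector leaks out of the record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Status {
    Pending,
    NeedDkimLeaf { selector: String },
    Succeeded,
    Failed,
}

impl Challenge {
    /// Post-email step: the gateway delivered a message whose `bh=` checked out.
    pub fn on_email(self, partial: PartialVerification) -> Challenge {
        match self {
            Challenge::Pending => Challenge::NeedDkimLeaf(partial),
            other => other, // ignore deliveries in any other state
        }
    }

    /// Final step: the FE submitted the DKIM leaf and the signature was checked.
    pub fn on_dkim_leaf(self, signature_verifies: bool) -> Challenge {
        match self {
            Challenge::NeedDkimLeaf(_) if signature_verifies => Challenge::Succeeded,
            Challenge::NeedDkimLeaf(_) => Challenge::Failed,
            other => other, // a leaf without a parked email changes nothing
        }
    }

    pub fn status(&self) -> Status {
        match self {
            Challenge::Pending => Status::Pending,
            Challenge::NeedDkimLeaf(p) => Status::NeedDkimLeaf { selector: p.selector.clone() },
            Challenge::Succeeded => Status::Succeeded,
            Challenge::Failed => Status::Failed,
        }
    }
}
```

Keeping the record inside the `NeedDkimLeaf` variant means a leaf submission can only be processed once an email has been parked, which mirrors the ordering the commit message describes.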
Two drift items from the implementation that hadn't yet made it into the design doc:
- §8.9 bounded-state table now lists the per-pending-entry byte caps explicitly (DMARC TXT 1024 B, session_pk 1024 B, address per RFC 5321, partial-verification record ~500 B), alongside the post-email DKIM TXT cap of 4096 B applied at submission. These are what actually keep the heap bounded once an attacker fills the slot count; the slot cap alone isn't enough.
- §8.2 storage section now describes the two payload-free archive Operation variants (AddEmailRecovery / RemoveEmailRecovery), including the rationale for emitting no address bytes: domain or provider would let archive consumers correlate anchors back to mailbox providers, which §3.1 already pushes against.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
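As a rough illustration of the per-entry caps listed for §8.9, a prepare-time check might look like the sketch below. The 1024 B and 4096 B figures come from the commit message; the function names, the error enum, and the choice to express "address per RFC 5321" as the 64-octet local-part and 255-octet domain limits are assumptions.

```rust
pub const MAX_DMARC_TXT_BYTES: usize = 1024;
pub const MAX_SESSION_PK_BYTES: usize = 1024;
pub const MAX_DKIM_TXT_BYTES: usize = 4096;
// RFC 5321 §4.5.3.1 limits (assumed to be what "address per RFC 5321" refers to).
pub const MAX_LOCAL_PART_BYTES: usize = 64;
pub const MAX_DOMAIN_BYTES: usize = 255;

#[derive(Debug, PartialEq, Eq)]
pub enum CapError {
    MalformedAddress,
    AddressTooLong,
    DmarcTxtTooLarge,
    SessionPkTooLarge,
    DkimTxtTooLarge,
}

/// Hypothetical check run before a pending-challenge slot is allocated.
pub fn check_prepare_caps(
    address: &str,
    dmarc_txt: Option<&[u8]>,
    session_pk: &[u8],
) -> Result<(), CapError> {
    let (local, domain) = address.rsplit_once('@').ok_or(CapError::MalformedAddress)?;
    if local.is_empty() || domain.is_empty() {
        return Err(CapError::MalformedAddress);
    }
    if local.len() > MAX_LOCAL_PART_BYTES || domain.len() > MAX_DOMAIN_BYTES {
        return Err(CapError::AddressTooLong);
    }
    if dmarc_txt.map_or(false, |txt| txt.len() > MAX_DMARC_TXT_BYTES) {
        return Err(CapError::DmarcTxtTooLarge);
    }
    if session_pk.len() > MAX_SESSION_PK_BYTES {
        return Err(CapError::SessionPkTooLarge);
    }
    Ok(())
}

/// Hypothetical check applied to the DKIM TXT record at leaf submission.
pub fn check_dkim_txt_cap(dkim_txt: &[u8]) -> Result<(), CapError> {
    if dkim_txt.len() > MAX_DKIM_TXT_BYTES {
        Err(CapError::DkimTxtTooLarge)
    } else {
        Ok(())
    }
}
```

Rejecting oversized inputs before a slot is taken is what lets slot count times per-entry cap act as a worst-case heap bound.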
Two changes:
- Narrow `DnsProofBundle` from `leaves: Vec<SignedRRset>` to `leaf: Option<SignedRRset>`. With the two-phase flow each bundle only ever needs one leaf: at prepare time it's the optional DMARC TXT, at submit_dkim_leaf time it's the DKIM TXT. Single-leaf is also a smaller candid argument; the Vec wrapping was overhead for a max-1 collection. The §7.2 example struct comment is rewritten to match.
- Status table: PR dfinity#3843 (recovery flow) and PR dfinity#3844 (frontend wizards) are open and in review; they were still showing as Planned. Date stamp also bumped to 2026-05-06.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The semicolon in "Gateway forwards email; canister..." was being parsed by mermaid as a statement separator, leaving "canister parses signature, caches partial verification" as a free-standing fragment that the parser tried to read as `Actor message` and then choked on the comma. Replace the semicolons with em-dashes in the two affected `Note over` lines (one in §8.4, one in §8.5). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ation
Real-world DKIM resolution often crosses zone boundaries via CNAME — Proton's
proton.me → proton.ch, Tutanota's tutanota.com → tutanota.de, M365 custom
domains, etc. DNSSEC signatures don't span zones, so each hop in the chain
needs to be authenticated by its own zone's DNSKEY.
§7.2 — DnsProofBundle is now { root_dnskey, chains: Vec<DelegationChain>,
hops: Vec<SignedRRset> }. One chain per signing zone touched, one hop per
RRset in the resolution sequence. RRSIG.signer_name picks which zone's
DNSKEY validates each hop.
§7.3 — Verification builds (zone → DNSKEY) map from chains, validates each
hop under the zone its RRSIG names, then walks the hop sequence end-to-end:
hops[0].name == requested_name, consecutive owner = previous CNAME target,
final hop type matches requested type, no loops, ≤ MAX_CNAME_HOPS.
§7.4 — Caller-side walker follows CNAMEs at submit time and supplies any
new zone chains needed; abandons the bundle if any hop is unsigned.
§7.6 — Adds the live.com case: apex DS exists but DKIM CNAMEs into
unsigned protection.outlook.com, so the chain breaks end-to-end and the
domain belongs on the DoH allowlist despite the apex being signed.
Provider-category table covers the four real-world shapes.
§8 — Cached state at prepare is a (zone → DNSKEY) map that grows at
submit_dkim_leaf time when the resolution crosses into a new signed zone.
submit_dkim_leaf takes (nonce, hops, extra_chains) — empty extra_chains
for the Gmail-style case.
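The structural part of the §7.3 hop walk (start name, CNAME linkage, final type, loop and length limits) can be sketched as below. This is an assumption-laden outline, not the verifier itself: the `Hop` shape, the `WalkError` variants, and the value of `MAX_CNAME_HOPS` are hypothetical, and per-hop RRSIG validation under the signer zone's DNSKEY is deliberately left out.

```rust
use std::collections::HashSet;

/// Assumed limit; the design doc names the constant but this value is a guess.
pub const MAX_CNAME_HOPS: usize = 8;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RrType {
    Txt,
    Cname,
}

/// Simplified stand-in for one already signature-checked RRset in the sequence.
#[derive(Debug, Clone)]
pub struct Hop {
    pub owner: String,
    pub rr_type: RrType,
    pub cname_target: Option<String>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum WalkError {
    Empty,
    TooManyHops,
    WrongStart,
    BrokenLink,
    Loop,
    WrongFinalType,
}

/// Case-insensitive comparison form without the trailing root dot.
fn normalize(name: &str) -> String {
    name.trim_end_matches('.').to_ascii_lowercase()
}

pub fn walk_hops(
    hops: &[Hop],
    requested_name: &str,
    requested_type: RrType,
) -> Result<(), WalkError> {
    if hops.is_empty() {
        return Err(WalkError::Empty);
    }
    let last = hops.len() - 1; // every hop before the last must be a CNAME
    if last > MAX_CNAME_HOPS {
        return Err(WalkError::TooManyHops);
    }
    let mut expected_owner = normalize(requested_name);
    let mut seen: HashSet<String> = HashSet::new();
    for (i, hop) in hops.iter().enumerate() {
        let owner = normalize(&hop.owner);
        if owner != expected_owner {
            return Err(if i == 0 { WalkError::WrongStart } else { WalkError::BrokenLink });
        }
        if !seen.insert(owner) {
            return Err(WalkError::Loop); // a CNAME pointed back at an earlier owner
        }
        if i == last {
            if hop.rr_type != requested_type {
                return Err(WalkError::WrongFinalType);
            }
        } else {
            match (hop.rr_type, hop.cname_target.as_deref()) {
                (RrType::Cname, Some(target)) => expected_owner = normalize(target),
                _ => return Err(WalkError::BrokenLink),
            }
        }
    }
    Ok(())
}
```

A Gmail-style lookup is the single-hop case (one TXT RRset whose owner is the requested name), while a Proton-style lookup adds CNAME hops in front of the final TXT.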
…table The status table listed PRs 1-8 but missed the 9th, which adds the IANA root anchor fetcher and wires `dnssec_config` + `doh_config` into the install args produced by deploy-common.bash and make-upgrade-proposal. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
9 tasks
Aligns the design doc with the recent canister-side change that derives the accepted recipient mailbox from `related_origins`:
- §8.3 `EmailRecoveryChallenge` loses its `mailbox` field. The FE pairs the user-part (`register` / `recover`) with `window.location.hostname` to render the user-facing label, so each tab automatically shows the alias matching the origin the user is on. The canister accepts `register@<h>` / `recover@<h>` for any host `<h>` listed in `related_origins`: equal aliases, no canonical pick.
- §8.3 Candid surface adds `smtp_request_validate` query the off-chain SMTP gateway calls at RCPT TO time to decide whether to accept the connection (Ok for the two recipients we handle, 550 otherwise). Without this the gateway has no way to know which recipients we accept and falls back to whatever default policy it was deployed with.
- §4.1 SMTP gateway notes both surfaces are open / anonymously callable.
- Glossary, §4 narrative, §8.1, §8.4, §8.5, mermaid diagrams, ASCII recovery flow note all updated to reference `register@<host>` / `recover@<host>` instead of hardcoded `@id.ai`. Concrete examples retained where they clarify what the user sees on prod vs beta.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
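The recipient rule described for `smtp_request_validate` could be sketched as follows. Everything here is a hedged illustration: `classify_recipient`, `RecipientKind`, and `origin_host` are hypothetical names, and the sketch assumes `related_origins` entries are `https://` origins from which a bare host can be taken.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RecipientKind {
    Register,
    Recover,
}

/// Extracts the host from an `https://host[:port][/...]` origin string.
/// Assumes related origins are always https; anything else is ignored.
fn origin_host(origin: &str) -> Option<&str> {
    let rest = origin.strip_prefix("https://")?;
    let host = rest.split(&['/', ':'][..]).next()?;
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Returns which flow a RCPT TO address belongs to, or `None` when the
/// gateway should refuse it (the text above maps that to a 550).
pub fn classify_recipient(rcpt: &str, related_origins: &[&str]) -> Option<RecipientKind> {
    let (user, host) = rcpt.rsplit_once('@')?;
    let kind = match user.to_ascii_lowercase().as_str() {
        "register" => RecipientKind::Register,
        "recover" => RecipientKind::Recover,
        _ => return None,
    };
    // Every related origin is an equal alias; there is no canonical host.
    let accepted = related_origins
        .iter()
        .filter_map(|origin| origin_host(origin))
        .any(|allowed| allowed.eq_ignore_ascii_case(host));
    if accepted {
        Some(kind)
    } else {
        None
    }
}
```

With origins such as `https://id.ai` plus a hypothetical beta origin, both `register@id.ai` and the beta-host alias resolve to the same kind, which is what lets the FE build the label from `window.location.hostname`.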
- §4 / §8.4 / §8.5 mermaid: replace `&lt;host&gt;` / `&lt;window.location.hostname&gt;` with `[host]` / `[window.location.hostname]`. The mermaid sequence-diagram parser decodes HTML entities before lexing, so the trailing `;` of `&gt;` was being read as a statement separator and split the message mid-line, the same failure mode the previous fix (commit 0f33477) cleared by removing literal semicolons from `Note over` text. Verified all three diagrams now parse via `mermaid.parse`.
- §7.3 verification algorithm: relax the DNSKEY-RRset self-signature step from "by its KSK" to "by some DNSKEY in the RRset" and add a paragraph explaining why. The DS step still pins a KSK to the parent's DS digest; what changed is which DNSKEY validates the RRSIG over the DNSKEY RRset itself. Operator practice is split (Proton signs DNSKEY with the ZSK; Cloudflare with the KSK), and the previous KSK-only rule rejected otherwise-valid Proton bundles for no security gain.
- §8.2 storage model: correct the memory IDs and storage shapes to match the implementation:
  * Reverse address index is **memory ID 23**, not 24.
  * Pending challenges live in a heap-only `thread_local! HashMap`, not a `StableBTreeMap` at memory ID 25. Document the "ephemeral by design" rationale (30-minute TTL, retry-from-scratch on upgrade).
- §8.7 `DkimLeafMismatch` annotation: name provider-side DKIM key rotation as a typical cause alongside the transient-resolver case, observed twice on Proton during staging. User-facing copy is unchanged (same retry remediation).
Adds a future-work section sketching the hybrid path: FE walks DNSSEC as far as the chain holds, BE validates the signed prefix and DoH-quorums the unsigned tail iff the tail's owner zone is on the existing `DohConfig.allowed_domains` (with label-anchored suffix match).

Captures the design we converged on for handling Workspace and M365 custom domains end-to-end:
- Workspace + apex DNSSEC: pure DNSSEC, no outcall.
- M365 + apex DNSSEC: hybrid; one BE outcall into onmicrosoft.com, which suffix-matches one allowlist entry covering every tenant.
- live.com today: hybrid would shrink the DoH-trusted span to just the unsigned tail.
- Apex unsigned: still rejected (structural, not addressable here).

Documents the cost framing (BE outcall cycles vs FE-side free walk, per-tenant cache fragmentation, 1h TTL lever, §8.9 per-anchor caps) and the implementation surface (FE partial-bundle emission, BE verifier relaxation, allowlist match helper, `onmicrosoft.com` config addition).

Renumber References from §12 to §13.

Not building this now. The current dual-path (full DNSSEC | DoH allowlist) is sufficient for v1; the hybrid lands as a follow-up when custom-domain coverage becomes a priority.
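A label-anchored suffix match of the kind mentioned for `DohConfig.allowed_domains` might look like the sketch below. The helper name `matches_allowlist` and the normalization rules (ASCII lowercase, trailing root dot stripped) are assumptions; only the idea that a match must begin at a label boundary comes from the text.

```rust
/// Case-insensitive comparison form without the trailing root dot.
fn normalize(name: &str) -> String {
    name.trim_end_matches('.').to_ascii_lowercase()
}

/// True when `zone` equals an allowlist entry or is a subdomain of one.
/// The leading dot in the suffix test anchors the match to a label boundary,
/// so `evilonmicrosoft.com` does not match the entry `onmicrosoft.com`.
pub fn matches_allowlist(zone: &str, allowed_domains: &[&str]) -> bool {
    let zone = normalize(zone);
    allowed_domains.iter().any(|entry| {
        let entry = normalize(entry);
        !entry.is_empty() && (zone == entry || zone.ends_with(&format!(".{entry}")))
    })
}
```

Under this rule a single `onmicrosoft.com` entry covers every tenant subdomain, which is the property the M365 bullet above relies on.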
Replace `dfinity.org` and `tackmann.net` with descriptive
placeholders ("an organization on Google Workspace whose registrar
publishes a DS", "a small organization self-hosting mail on a domain
whose registrar/nameserver setup never enabled DNSSEC"). The
operator-zone reference (`onmicrosoft.com`) stays — it's a
Microsoft-published service identifier, not a private user domain.
2 tasks
…y#3857

The §8.6 mockups had drifted from the implemented UX in PR dfinity#3857 (Recovery methods card layout, button labels, dialog patterns, wizard view names, copy). Bring the doc back in sync so reviewers reading the doc see the shipped flow:
- Replace each ASCII mockup with the current rendering: two-up card grid, status text instead of [Active]/[Inactive] badges, 3-dot dropdown for Replace/Remove, stacked danger-then-cancel remove dialog, no in-wizard Cancel buttons, Steps indicator at the top of every wizard view, mailto button + per-row copy buttons in step 2, no orange warning block, no dedicated Done view (toast + manage-page card stand in), method picker with ButtonCards + hover-fade arrow, UnsupportedDomain view with collapsible technical detail.
- Note the cross-page Shield-icon consistency for the recovery phrase option in the picker, and the max-w-5xl card-grid cap.
- §8.1 step 2: "magic email" -> "confirmation email" so the prose lines up with the renamed SendConfirmationEmail.svelte.
- §8.4 sequence diagram: the user clicks "Activate" on the inactive email card (the previous "Add email recovery" label is gone).
- §8.7 status table: remove "Cancel link" / "Done view" mentions that no longer exist; describe the wizard's FailedView and the toast-on-success path.
- §8.10 Frontend changes: drop the stale "Add email" wizard reference, document the (Active|Inactive)EmailRecovery split, the More options dropdown for Verify/Reset on unverified, and the picker refactor on /recovery.
- §10 Phase 2: the Phase-2 enable step lights up both the manage-page card and the picker option simultaneously, since both gate on the same EMAIL_RECOVERY flag.

Also runs the file through prettier (the file wasn't formatted before; most of the line-count churn is reflow).
Two more §8.6 mockup corrections after re-checking the implemented
wizard:
- Remove the `<Steps total={3} current={i} />` references and the
ASCII step-indicator headers from both wizard mockups. The wizard
in PR dfinity#3857 does not render a step indicator on any view.
- Move the shield-check authenticity badge from the heading row of
step 2 to the right edge of the `From:` row (where it actually
lives in `SendConfirmationEmail.svelte`). The `From:` placement
visually anchors the badge to the address it vouches for, instead
of the heading.
Also redrew the To/From/Subject/Body card mockup to match the rendered
two-line stacked-row layout (label uppercase on top, value below)
rather than the earlier inline `key: value` shape.
`SendConfirmationEmail.svelte` ends at the "Open in mail app" button: there is no rendered "Waiting for your email to arrive…" indicator and no "Expires in 29:42" countdown. The FE polls `email_recovery_status` silently in the background and flips the view when a terminal status arrives. Update both the §8.6 mockup and the §8.7 status table accordingly, plus a short prose note explaining why the visible countdown is absent.
This was referenced May 12, 2026
Merged
pull Bot pushed a commit to mikeyhodl/internet-identity that referenced this pull request May 12, 2026
…iring (dfinity#3838)

## Summary

First PR in the email-recovery stack (`docs/ongoing/email-recovery.md` §10 Phase 0). Lands a working RFC-4035-compliant DNSSEC verifier for caller-supplied DNS proof bundles, plus the trust-anchor wiring that drives it. PR 2 (DKIM verifier) and PRs 4–9 (storage + recovery methods) build on this.

## What's in this PR

### Verifier core

- New `dnssec/` module under `src/internet_identity/src/`:
  - `types.rs`: `DnsProofBundle`, `SignedRRset`, `DelegationLink`, `Rrsig`, `DnsName`, `DnssecError`, `VerifiedRecord`.
  - `canonical.rs`: owner-name canonicalization, RR canonical form, RRSIG signed-data construction (RFC 4034 §3.1.8.1, §6.2, §6.3), DS digest input.
  - `signature.rs`: algorithm dispatch + DS digest matching (SHA-256).
  - `verify.rs`: four-step algorithm (root anchor match → chain walk → leaf RRSIG → freshness).
- Algorithm coverage (RFC 8624 MUST set):
  - **alg 8** (RSA-SHA256, RFC 5702): root, com., most legacy zones.
  - **alg 13** (ECDSA-P256-SHA256, RFC 6605): most TLDs, Cloudflare, modern zones.
  - **alg 15** (Ed25519, RFC 8080): rare in production but mandatory.
  - Anything else returns `UnsupportedAlgorithm`.

### Wiring

- New `DnssecConfig` and `DnssecRootAnchor` types in `internet_identity_interface`, exposed at the top of `InternetIdentityInit` as `dnssec_config: opt opt DnssecConfig` (set/clear semantics matching `analytics_config` and `dummy_auth`).
- Trust-anchor list plumbed through `init`/`post_upgrade` into `PersistentState.dnssec_config` (and `StorablePersistentState` for cross-upgrade persistence).
- `internet_identity.did` updated.

### Tests

13 unit tests in `dnssec/` covering:

- Real cloudflare.com chain verifies end-to-end (exercises alg 8 at root and alg 13 at com → cloudflare.com → leaf).
- Empty trust-anchor list rejected with `NoTrustAnchors`.
- Wrong trust anchor rejected with `RootAnchorMismatch`.
- Flipped byte in root DNSKEY → `RootAnchorMismatch` or `BadSignature`.
- Flipped byte in leaf signature → `BadSignature`.
- Stale signature (clock advanced past expiration) → `StaleOrFutureSignature`.
- Unsupported algorithm (alg 5 / RSA-SHA1) → `UnsupportedAlgorithm(5)`.
- Plus canonical-encoding + RFC 3110 RSA key parsing unit tests.

### Test infrastructure

- `test_vectors/dnssec/cloudflare-com-2026-05.json`: real DoH-captured chain (root DNSKEY + 2 delegation links + leaf TXT).
- `test_vectors/dnssec/iana-root-anchors-2026-05.json`: IANA root KSK trust anchors (Klajeyz/2017 + Kmyv6jo/2024).
- `scripts/capture-dnssec-chain.py`: reproducible capture script using dnspython + DoH wire format.

Tests use a frozen now from the capture's metadata so freshness checks stay stable indefinitely.

### New deps

- `domain` (NLnet Labs, pure Rust): referenced in docstrings for canonicalisation primitives; signature verification is hand-rolled on top of RustCrypto.
- `p256`: ECDSA P-256 verification.
- `ed25519-dalek`: Ed25519 verification.

All three build cleanly for wasm32-unknown-unknown.

## What's deferred to later PRs in the stack

- Captures for additional providers (proton.me, protonmail.com, tutanota.com; gmail.com / icloud.com / outlook.com / fastmail.com don't sign with DNSSEC, which is acknowledged in design doc §7.6).
- Synthetic Ed25519 (alg 15) test vector: most production zones are alg 8 or 13; alg 15 is structurally exercised by the dispatch logic but doesn't have a real captured chain in this PR.

## Test plan

- [x] `cargo check -p internet_identity --target wasm32-unknown-unknown`: clean (no warnings).
- [x] `cargo test -p internet_identity --bin internet_identity`: 238 tests pass (was 227 pre-PR).
- [x] `cargo test -p internet_identity_interface --lib`: 42 tests pass.
- [x] `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings`: clean.
- [x] `cargo fmt --check`: clean (modulo a pre-existing diff in attributes.rs unrelated to this PR).
- [x] CI on dfinity#3838: fully green.

## Design doc

https://github.com/sea-snake/internet-identity/blob/design/email-recovery/docs/ongoing/email-recovery.md (PR pending review on dfinity/internet-identity)

## Stack

This is PR 1 of a 12-PR series. Subsequent PRs:

- **PR 2**: mail-auth-backed DKIM verifier, consuming DnsProofBundle from this PR.
- **PR 3**: DMARC alignment.
- **PRs 4–9**: storage + Candid + behavior for email recovery.
- **PRs 10–12**: frontend (DoH walker, Manage UI, recovery wizard).

## PR Stack

| # | PR | Description | Status |
|---|---|---|---|
| 0 | [dfinity#3836](dfinity#3836) | Design doc | Open |
| 1 | [dfinity#3838](dfinity#3838) | DNSSEC verifier scaffold | Open |
| 2 | [dfinity#3839](dfinity#3839) | DKIM verifier (RFC 6376) | Open |
| 3 | [dfinity#3840](dfinity#3840) | DMARC alignment (RFC 7489) | Open |
| 4 | [dfinity#3841](dfinity#3841) | DoH fallback | Open |
| 5+6 | [dfinity#3842](dfinity#3842) | Setup flow (storage + smtp_request) | Open |
| 7 | [dfinity#3843](dfinity#3843) | Recovery flow (delegation) | Open |
| 8 | [dfinity#3844](dfinity#3844) | Frontend + feature flag | Open |
| 9 | [dfinity#3855](dfinity#3855) | Deploy/upgrade scripts: dnssec_config + doh_config | Open |
| 10 | [dfinity#3857](dfinity#3857) | Email-recovery UX overhaul | Open |
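To make the algorithm-coverage and freshness bullets of the PR description above concrete, here is a small sketch of how a dispatch on the DNSSEC algorithm number and a validity-window check might be written. The error variant names `UnsupportedAlgorithm` and `StaleOrFutureSignature` appear in the PR text; the enum `DnssecAlgorithm`, both function names, and the use of widened `u64` UNIX seconds (instead of RFC 4034 32-bit serial arithmetic) are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DnssecAlgorithm {
    RsaSha256,       // alg 8
    EcdsaP256Sha256, // alg 13
    Ed25519,         // alg 15
}

#[derive(Debug, PartialEq, Eq)]
pub enum DnssecError {
    UnsupportedAlgorithm(u8),
    StaleOrFutureSignature,
}

/// Maps the RRSIG/DNSKEY algorithm number to a supported verifier.
pub fn algorithm_from_number(alg: u8) -> Result<DnssecAlgorithm, DnssecError> {
    match alg {
        8 => Ok(DnssecAlgorithm::RsaSha256),
        13 => Ok(DnssecAlgorithm::EcdsaP256Sha256),
        15 => Ok(DnssecAlgorithm::Ed25519),
        other => Err(DnssecError::UnsupportedAlgorithm(other)),
    }
}

/// Accepts a signature only inside its [inception, expiration] window.
/// Simplification: timestamps are plain seconds, with no 32-bit wraparound handling.
pub fn check_freshness(now: u64, inception: u64, expiration: u64) -> Result<(), DnssecError> {
    if inception <= now && now <= expiration {
        Ok(())
    } else {
        Err(DnssecError::StaleOrFutureSignature)
    }
}
```

Passing a frozen `now` into `check_freshness`, as the test infrastructure section describes, keeps captured chains verifiable long after their real signatures have expired.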
sea-snake-translation-bot pushed a commit to sea-snake-translation-bot/internet-identity that referenced this pull request May 12, 2026
…ck) (dfinity#3877)

## Summary

PR 2 of the email-recovery stack (`docs/ongoing/email-recovery.md` §10 Phase 0). Stacks on top of PR 3838 (DNSSEC verifier). Lands a hand-rolled RFC 6376 DKIM verifier that consumes a parsed `SmtpRequest` plus an already-trusted DKIM TXT record and returns a per-step `EmailVerificationStatus`.

**Note:** This PR targets `main` but includes PR 3838's commits (DNSSEC verifier) as its base. Review the DKIM-specific changes by looking at commits after `9bbd8717` (the last PR 3838 commit). Once PR 3838 merges, this PR's diff will shrink to just the DKIM additions.

## Why hand-rolled

The design originally specified `mail-auth` (Stalwart's well-tested DKIM library), but mail-auth pulls a non-optional `hickory-resolver` dep that fails to compile for `wasm32-unknown-unknown` (transitive: tokio + mio). Forking + patching mail-auth would be possible but creates perpetual rebase burden. We hand-roll instead; "the right way, no shortcuts" was the explicit guidance.

## What's in this PR

### `src/internet_identity_interface/src/internet_identity/types/smtp.rs`

Brings forward the SMTP gateway protocol types from PoC PR 3760: `SmtpRequest`/`SmtpResponse`/`SmtpHeader`/`SmtpMessage`/`SmtpAddress`/`SmtpEnvelope`, the size bounds, and the input-bound validation (`format_address` lowercases both halves; `truncate_at_char_boundary` clamps to the previous UTF-8 boundary so a multi-byte subject can't trap the canister). Drops postbox-specific bits (PostboxEmail, ValidatedSmtpRequest, anchor-number parser).

### `src/internet_identity/src/dkim/`

- **`types.rs`**: Algorithm (RsaSha256, Ed25519Sha256), HeaderCanon/BodyCanon (Relaxed, Simple), DkimCheck/DkimCheckName/DkimCheckStatus per-step diagnostics, EmailVerificationStatus / VerificationFailReason result shape.
- **`parse.rs`** (RFC 6376 §3.5): DKIM-Signature header tag-list parser. Splits structurally on `;` first then on the *first* `=` per element, so a literal `b=` substring inside another tag's base64 doesn't get misread as a new tag start (the bug class the PoC PR review specifically flagged). Folded whitespace inside base64 values is stripped before decoding. Tag names case-insensitive; duplicates rejected.
- **`canonicalize.rs`** (§3.4.2 / §3.4.4): relaxed header canon (lowercase name, unfold continuations, collapse WSP+ to single SP, strip trailing WSP, strip WSP around colon) and relaxed body canon (per-line WSP cleanup, drop trailing empty lines, ensure non-empty output ends in exactly one CRLF).
- **`dns_record.rs`** (§3.6.2): DKIM TXT record parser. Tag names case-insensitive (`P=` vs `p=` was a PoC bug), whitespace inside `p=` tolerated (multi-chunk DNS TXT records), `t=y`/`t=s` flags honoured, unknown tags ignored.
- **`signature.rs`**: RSA-SHA256 (RFC 5702 / RFC 8301) and Ed25519-SHA256 (RFC 8463) signature verification on top of `rsa`+`sha2`+`ed25519-dalek` from PR 1's deps. Enforces 1024-bit RSA minimum per design §5.6. Ed25519 path wraps in SHA-256 per RFC 8463. Plus `body_hash_sha256` with optional `l=` truncation per §3.4.5.
- **`verify.rs`**: orchestration. Multi-signature loop per §5.5 (accept on first pass), tag enforcement per design §5.4 (c=relaxed/* only, x= expiration, i= alignment with d=, k= match, t=y testing-mode), bottom-up header selection per §5.4 when h= lists a name multiple times, b=value blanking that's structural-position-aware so it doesn't mis-target an internal substring.
- **`test_vectors.rs`**: `#[cfg(test)]` .eml loader + 8 end-to-end tests against committed fixtures.

### `test_vectors/dkim/`

- 3 synthetic .eml files generated offline with dkimpy + a 2048-bit RSA key (`relaxed/relaxed`, `relaxed/simple`, `simple/simple`).
- The matching DKIM TXT record (public key only).
- README documenting provenance; the throwaway private key is **not** committed.
## Test plan - [x] `cargo check -p internet_identity --target wasm32-unknown-unknown` — clean. - [x] `cargo test -p internet_identity --bin internet_identity dkim` — 75 tests pass (parse 14, canonicalize 18, dns_record 16, signature 7, verify 12, end-to-end 8). - [x] `cargo test -p internet_identity --bin internet_identity` — 313 tests pass total (was 238 before this PR; +75 DKIM, plus a few in smtp types). - [x] `cargo test -p internet_identity_interface --lib` — 52 tests pass (was 42; +10 SMTP type tests). - [x] `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings` — clean. - [x] `cargo fmt --check` — clean (modulo pre-existing diffs unrelated to this PR). ## Stack This is PR 2 of a 12-PR series. Includes PR 3838's commits as its base; once PR 3838 merges, the diff shrinks to just the DKIM additions. Subsequent PRs: - **PR 3** — DMARC alignment. - **PR 4** — DoH outcall fallback for unsigned domains (Gmail / Outlook / iCloud — see the design doc §7.6 and the team Slack writeup). - **PRs 5–9** — storage + Candid + behavior for email recovery. - **PRs 10–12** — frontend. 
## PR Stack
| # | PR | Description | Status |
|---|---|---|---|
| 0 | [dfinity#3836](dfinity#3836) | Design doc | Open |
| 1 | [dfinity#3838](dfinity#3838) | DNSSEC verifier scaffold | Open |
| 2 | [dfinity#3877](dfinity#3877) | DKIM verifier (RFC 6376) | Open |
| 3 | [dfinity#3878](dfinity#3878) | DMARC alignment (RFC 7489) | Open |
| 4 | [dfinity#3879](dfinity#3879) | DoH fallback | Open |
| 5+6 | [dfinity#3880](dfinity#3880) | Setup flow (storage + smtp_request) | Open |
| 7 | [dfinity#3881](dfinity#3881) | Recovery flow (delegation) | Open |
| 8 | [dfinity#3882](dfinity#3882) | Frontend + feature flag | Open |
| 9 | [dfinity#3883](dfinity#3883) | Deploy/upgrade scripts: dnssec_config + doh_config | Open |
| 10 | [dfinity#3884](dfinity#3884) | Email-recovery UX overhaul | Open |

---------
Co-authored-by: Arshavir Ter-Gabrielyan <arshavir.ter.gabrielyan@dfinity.org>
Co-authored-by: Claude <noreply@anthropic.com>
aterga added a commit that referenced this pull request May 13, 2026
…of email-recovery stack) (#3878)

## Summary
PR 3 of the email-recovery stack (`docs/ongoing/email-recovery.md` §6). Stacks on top of #3877 (DKIM verifier). Lands a hand-rolled DMARC alignment check and reshapes the verifier API: `dkim::verify_dkim` becomes a DKIM-only primitive, and the new `dmarc::verify_email` is the public top-level entry point that produces the combined `EmailVerificationStatus`.

**Note:** This PR targets `main` but includes PRs 1+2's commits as its base. Review the DMARC-specific changes by looking at commits on top of `ec371aae3` (PR 2's tip). Once PRs 1+2 merge, this PR's diff shrinks to just the DMARC additions.

## What's in this PR
### `src/internet_identity/src/dmarc/`
- **`types.rs`**: `DmarcOutcome` (Aligned / Misaligned / NoRecord / Malformed), `DmarcPolicy` (None / Quarantine / Reject), `AlignmentMode` (Strict / Relaxed), `DmarcRecord`, plus the combined `EmailVerificationStatus` that carries both DKIM diagnostics and the DMARC outcome on success.
- **`parse.rs`** (RFC 7489 §6.3): DMARC TXT record parser. Enforces that `v=DMARC1` is first, that `p=` is one of {none, quarantine, reject}, and that `pct=` is in 0..=100; rejects duplicate tags; ignores unknown / reporting tags. 12 unit tests.
- **`from_header.rs`** (RFC 5322 / RFC 7489 §3.1.1): single-mailbox From-header parser. Accepts bare addr-spec, name-addr, and quoted-display-name forms; rejects zero/multiple From: headers, address-lists, group syntax. Tolerates comma/colon inside quoted display names. 16 unit tests.
- **`alignment.rs`**: strict (exact match) + relaxed (exact match OR label-aligned subdomain in either direction). Stricter than RFC-compliant relaxed alignment because we deliberately don't consult the PSL; design doc §6.4 documents the trust + asymmetric-failure-mode reasoning. The dot anchor on the subdomain check prevents `evilexample.com` from aliasing `example.com`. 8 unit tests.
- **`verify.rs`**: orchestration. DKIM first; on failure, surface the DKIM reason verbatim. On DKIM pass, parse From and check DMARC alignment. Accepted iff Aligned, OR NoRecord with `dkim_domain == from_domain`. 8 unit tests.
- **`test_vectors.rs`**: 5 end-to-end tests reusing PR 2's synthetic .eml fixtures.

### `src/internet_identity/src/dkim/types.rs` (rename + new variants)
- Renamed `EmailVerificationStatus` → `DkimVerifyResult` (DKIM-only). The combined verdict moved to `dmarc::EmailVerificationStatus` so it can carry the `DmarcOutcome`.
- Added `MalformedFromHeader(String)`, `DmarcMalformed(String)`, `DmarcMisaligned` to `VerificationFailReason`.

### `src/internet_identity/src/dkim/mod.rs`
- Re-exports `verify` as `verify_dkim` so downstream callers (the dmarc layer) don't have to deal with both a `dkim::verify` and `dmarc::verify` in scope at the same time.

## Test plan
- [x] `cargo check -p internet_identity --target wasm32-unknown-unknown`: clean.
- [x] `cargo test -p internet_identity --bin internet_identity dmarc`: 49 tests pass (12 parse + 16 from_header + 8 alignment + 8 verify + 5 e2e).
- [x] `cargo test -p internet_identity --bin internet_identity`: 365 tests pass total (was 313 with PR 2; +49 dmarc + 3 small reshape adjustments).
- [x] `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings`: clean.
- [x] `cargo fmt --check`: clean (modulo pre-existing unrelated diffs).
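The alignment and acceptance rules described for `alignment.rs` and `verify.rs` above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the enum names and variants come from the description of `types.rs`, while `is_subdomain_of`, `domains_aligned`, `is_accepted`, and the normalization choices (ASCII lowercasing, trailing-dot stripping) are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum AlignmentMode {
    Strict,
    Relaxed,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum DmarcOutcome {
    Aligned,
    Misaligned,
    NoRecord,
    Malformed,
}

/// True if `child` is `parent` with at least one extra label in front.
/// The remainder must end in '.', which is the dot anchor that keeps
/// `evilexample.com` from matching `example.com`.
fn is_subdomain_of(child: &str, parent: &str) -> bool {
    !parent.is_empty()
        && child
            .strip_suffix(parent)
            .map_or(false, |rest| rest.len() > 1 && rest.ends_with('.'))
}

/// Strict: exact match. Relaxed: exact match, or one domain is a
/// label-aligned subdomain of the other. No Public Suffix List lookup.
fn domains_aligned(dkim_domain: &str, from_domain: &str, mode: AlignmentMode) -> bool {
    // Normalization here is an assumption of this sketch.
    let d = dkim_domain.trim_end_matches('.').to_ascii_lowercase();
    let f = from_domain.trim_end_matches('.').to_ascii_lowercase();
    if d.is_empty() || f.is_empty() {
        return false;
    }
    if d == f {
        return true;
    }
    match mode {
        AlignmentMode::Strict => false,
        AlignmentMode::Relaxed => is_subdomain_of(&d, &f) || is_subdomain_of(&f, &d),
    }
}

/// Acceptance rule from the text: Aligned, or NoRecord with an exact
/// match between the DKIM `d=` domain and the From: domain.
fn is_accepted(outcome: DmarcOutcome, dkim_domain: &str, from_domain: &str) -> bool {
    match outcome {
        DmarcOutcome::Aligned => true,
        DmarcOutcome::NoRecord => dkim_domain.eq_ignore_ascii_case(from_domain),
        DmarcOutcome::Misaligned | DmarcOutcome::Malformed => false,
    }
}
```

Because no organizational domain is derived, sibling subdomains such as `a.example.com` and `b.example.com` do not align under this rule, which is one way the check is stricter than RFC 7489 relaxed alignment.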
## PR Stack
| # | PR | Description | Status |
|---|---|---|---|
| 0 | [#3836](#3836) | Design doc | Open |
| 1 | [#3838](#3838) | DNSSEC verifier scaffold | Open |
| 2 | [#3877](#3877) | DKIM verifier (RFC 6376) | Open |
| 3 | [#3878](#3878) | DMARC alignment (RFC 7489) | Open |
| 4 | [#3879](#3879) | DoH fallback | Open |
| 5+6 | [#3880](#3880) | Setup flow (storage + smtp_request) | Open |
| 7 | [#3881](#3881) | Recovery flow (delegation) | Open |
| 8 | [#3882](#3882) | Frontend + feature flag | Open |
| 9 | [#3883](#3883) | Deploy/upgrade scripts: dnssec_config + doh_config | Open |
| 10 | [#3884](#3884) | Email-recovery UX overhaul | Open |

---------
Co-authored-by: Arshavir Ter-Gabrielyan <arshavir.ter.gabrielyan@dfinity.org>
Co-authored-by: Claude <noreply@anthropic.com>
📄 Read the design doc directly: `docs/ongoing/email-recovery.md` on the fork

## Summary
Adds `docs/ongoing/email-recovery.md`, a design doc for the production approach that should supersede the DKIM postbox PoC in #3760.

The design covers four threads in one architecture:
- … `mail-auth`), implement trusted-body retention (`l=`), enforce missing tag checks (`i=`, `k=`, future-dated `t=`), and change the gateway contract to deliver raw header bytes so byte-exact canonicalization is possible.
- … `From:` header (not the SMTP envelope), check DKIM `d=` against the From: domain under `adkim=s`/`adkim=r` using a bundled Public Suffix List, and surface the outcome in a renamed `EmailVerificationStatus`.
- … `DnsProofBundle`. The canister verifies the chain against an IANA root anchor delivered as a deploy arg. No DoH outcall during recovery; deterministic verification without consensus tricks.

The doc also includes a threat model, a test-corpus plan, a phased rollout, and the open questions (DNSSEC root key management, alias handling, IDN, enumeration mitigations).

Per request, this PR is the design doc only, with no code changes. The PoC PR (#3760) is expected to close once Phase 0 of the rollout in §10 lands as a fresh PR series.

## Reading order
cc @aterga
🤖 Generated with Claude Code