feat(dkim): hand-rolled RFC 6376 verifier (PR 2 of email-recovery stack)#3877
Conversation
Carries forward the wire-format Candid surface the off-chain SMTP gateway already targets in the PoC PR, slimmed to just what PR 2's DKIM verifier and PR 8's smtp_request dispatch will need: - SmtpRequest / SmtpResponse / SmtpHeader / SmtpMessage / SmtpAddress / SmtpEnvelope, plus the size bounds and SMTP error codes. - Validation helpers: validate_envelope, validate_message, validate_smtp_request. - format_address (lowercases both halves so envelope casing can't bypass per-anchor lookups) and truncate_at_char_boundary (fallback to previous UTF-8 boundary, avoiding the multi-byte panic the PoC review flagged). Drops the postbox-specific bits (PostboxEmail, ValidatedSmtpRequest, the to.user → anchor parser) — they don't fit the email-recovery design, see docs/ongoing/email-recovery.md §2 non-goals. 10 unit tests for bounds, address normalisation, and char-boundary truncation. Wasm32 build unaffected.
First piece of the hand-rolled DKIM verifier. Going hand-rolled (rather than depending on stalwartlabs/mail-auth) because mail-auth pulls a non-optional hickory-resolver dep that fails to compile for wasm32-unknown-unknown — we'd need to fork+patch with perpetual rebase burden to use it. This commit lands: - src/internet_identity/src/dkim/types.rs: Algorithm (RsaSha256, Ed25519Sha256), HeaderCanon / BodyCanon (Relaxed, Simple), DkimCheck / DkimCheckName / DkimCheckStatus per-step diagnostics, and the public EmailVerificationStatus / VerificationFailReason result shape. - src/internet_identity/src/dkim/parse.rs: structural tag-list parser for the DKIM-Signature header value. Splits on ';' first, then on the first '=' per element, so a literal 'b=' substring inside another tag's base64 doesn't get misread as the start of a new tag — that was the concrete bug the PoC PR review flagged. Folded whitespace inside base64 values is stripped before decoding. Tag names are case-insensitive; duplicate tags are rejected. - mod.rs scaffolding wired into main.rs. dnssec module remains in place. 14 unit tests cover: minimal required tags, case-insensitive tag names, folded whitespace, the b=-inside-bh= antipattern, v != 1, duplicate tags, unsupported algorithms (rsa-sha1), ed25519-sha256, l/t/x tags, explicit i= override, default canon, empty h=, missing required tags, malformed base64. Wasm32 build clean; full II suite still passes.
Implements the relaxed canonicalisation algorithms — the only ones our DKIM verifier accepts on the header side (the parsed-pair gateway contract precludes byte-exact 'simple/*' header canonicalization, see design doc §5.2). Header (§3.4.2): lowercase the name; unfold continuation lines (CRLF + WSP -> WSP); collapse runs of WSP within the value to a single SP; strip trailing WSP from the value; strip WSP around the colon. Single- pass implementation that handles all five steps in one walk. Body (§3.4.4): per-line WSP cleanup (collapse runs, strip trailing); drop empty lines from the end of the body; ensure non-empty output ends in exactly one CRLF; empty input maps to empty output (no synthetic trailing CRLF). 18 unit tests cover both algorithms, including the edge cases that trip up naïve implementations: empty bodies, bodies with only whitespace, bodies without a trailing CRLF, internal empty lines that must NOT be stripped, header values that are entirely whitespace, and folded continuations with both space and tab continuations.
Parses the published DKIM key record into a DkimDnsRecord: - key_type: rsa (default) or ed25519 - public_key: base64-decoded p= value (empty = revoked) - testing: t=y flag - strict_auid: t=s flag Tag handling per §3.6.2.1 / §3.6.2.2: - v=DKIM1 if present must be first; absent is acceptable. - Tag names are case-insensitive (the PoC PR review specifically flagged P= versus p= mismatch as a real-world bug). - Whitespace inside p= is tolerated — DNS TXT records can be split across multiple <character-string>s and may carry stray WSP at chunk boundaries. - Duplicate tags are rejected (defence-in-depth even though the RFC technically allows shadowing). - t= flag list is colon-separated; we honour 'y' and 's', ignore others. - Unknown tags are silently dropped per RFC. 16 unit tests cover defaulting, both key types, both t= flags individually and combined, malformed inputs (missing p, v not first, unsupported v/k), whitespace inside p, empty p, and key-type/algorithm matching. Wasm32 build remains clean.
Both algorithms RFC 8624 specifies as MUST for DKIM (RFC 8301 + RFC 8463), no others. Reuses rsa, sha2, and ed25519-dalek that are already in the workspace from PR 1. RSA path (RFC 5702 / RFC 6376 §3.3.1): - DKIM publishes RSA keys in SubjectPublicKeyInfo (DER), so we use RustCrypto's DecodePublicKey::from_public_key_der which handles the X.509 + PKCS#1 wrapping. - Enforces RSA_MIN_KEY_BITS = 1024 per design §5.6 (planned lift to 2048 once telemetry confirms zero impact). - Sanity-checks signature length == modulus length before invoking RSASSA-PKCS1-v1_5 verify, so a structural mismatch surfaces as a clear MalformedSignature instead of an opaque crypto error. Ed25519 path (RFC 8463): - Public key is 32 raw bytes (no SPKI wrapping), signature is 64 bytes. - Wraps the input in SHA-256 ourselves before calling ed25519-dalek's pure-Ed25519 verify, since RFC 8463 specifies signing the SHA-256 of the canonical header hash input rather than the input directly. Plus body_hash_sha256 (the bh= side): SHA-256 of the canonical body, optionally truncated to l= bytes per RFC 6376 §3.4.5. Truncation caps at the body length so an l= larger than the body just uses the whole thing. 7 unit tests cover algorithm/key-type mismatches, malformed RSA keys (non-SPKI bytes), wrong-length Ed25519 keys / signatures, and the body hash with and without l=. Wasm32 build remains clean.
Ties parse + canonicalize + dns_record + signature into the public `verify(email, dkim_txt, now_secs)` API. Implements: - Multi-signature loop per RFC 6376 §5.5: try every DKIM-Signature header in order, accept on first pass, return Unverified with the most recent reason if all fail. - Tag enforcement per design §5.4: - c=simple/* on header side rejected with UnsupportedCanonicalization - x= expiration check - i= alignment with d= (subdomain-permissive unless DNS t=s set) - DNS k= must match signature a= - DNS t=y testing-mode -> Unverified(TestingMode) - Header bottom-up selection per RFC 6376 §5.4 — when h= lists a name multiple times we pick distinct occurrences walking from the bottom. - DKIM-Signature header itself canonicalised with b=value blanked, no trailing CRLF, per §3.7. The b= blanker is structural — only blanks at top-level tag positions, so a literal 'b=' substring inside another tag's base64 is never mistargeted (the bug class the PoC PR review flagged). - Per-signature DkimCheck breakdown so a UI can render which step failed. 12 unit tests cover: AUID alignment (exact, subdomain, evil-suffix, local-only), b= blanking (simple, bh-not-blanked, b-at-start, no- internal-substring-misblank), no signature, c=simple rejection, expired signature, misaligned i=. The cryptographic happy-path tests land alongside the captured/synthetic test vectors in the next commit. 67 DKIM tests pass total (parse 14 + canonicalize 18 + dns_record 16 + signature 7 + verify 12). Wasm32 build clean.
Lands the synthetic .eml fixtures plus the loader that turns them into SmtpRequest-shaped data the verifier actually consumes. End-to-end tests exercise the full pipeline: parse_eml -> SmtpRequest -> verify. Synthetic vectors were generated offline using dkimpy (Python's reference DKIM implementation) against a freshly-generated throwaway 2048-bit RSA key. The private key is *not* committed — only the .eml output and the matching public DKIM TXT record. README in test_vectors/dkim/ documents provenance and regeneration. Three .eml files cover: - c=relaxed/relaxed (the common case for mainstream senders) - c=relaxed/simple (header relaxed, body simple — accepted) - c=simple/simple (rejected with UnsupportedCanonicalization) Eight tests in dkim/test_vectors.rs cover: - happy-path verification of relaxed/relaxed and relaxed/simple - simple/simple rejection - flipped body byte -> BodyHashMismatch - flipped signature byte -> SignatureInvalid (or related) - wrong public key -> SignatureInvalid - no DKIM-Signature header -> NoSignature - the parse_eml helper itself (sanity check) The .eml parser unfolds continuation lines per RFC 5322 §2.2.3, drops the conventional single SP after the colon (matching what the gateway does in production), and preserves the rest of the header value bytes verbatim — so relaxed canonicalisation downstream produces the same form the signer hashed. Verifier totals: 75 DKIM tests (parse 14, canonicalize 18, dns_record 16, signature 7, verify 12, end-to-end 8). Full II suite: 313 tests pass (was 238 pre-PR). Wasm32 build clean.
- Use matches!() instead of explicit match arms for KeyType::matches_signature_alg - rustfmt across all dkim/ files and types/smtp.rs - Stale test_vectors.rs doc-comment said the throwaway private key 'is committed' — corrected to 'is *not* committed' (the fixtures' provenance README is the source of truth)
- blank_b_tag_value: preserve original bytes around the b= tag instead
of always emitting literal 'b='. RFC 6376 §3.7 says only the tag
value (and surrounding WSP) is replaced; the tag name and the bytes
between the name and '=' must come through verbatim. Concretely,
signers that emit 'B=' (uppercase) or 'b\t=' / 'b =' (FWS between
the name and '=') are valid per §3.2 and were previously mis-handled
— relaxed canonicalisation collapses both forms to 'b:' downstream,
so the bug only surfaces on signatures from those few senders, but
by-construction is more honest. New tests cover uppercase B,
tab-between-name-and-equals, and space-between-name-and-equals.
- test_vectors.rs module-level docstring: corrected to say the private
key is *not* committed (the README and the actual tree both already
said this; the docstring was the only contradiction). Dropped the
reference to scripts/sign-dkim.py since that file isn't checked in.
- smtp.rs validate_smtp_request_envelope_only_ok: tightened the
assertion from matches!(..., Ok(()) | Err(SmtpResponse::Ok{..}))
to is_ok(). The Err arm was unreachable (SmtpResponse::Err always
wraps SmtpRequestError) and would have masked a regression where
validation mistakenly returned the Ok variant inside Err.
23985ad to
bcf8233
Compare
There was a problem hiding this comment.
Pull request overview
Adds the DKIM (RFC 6376/RFC 8463) verification layer for the email-recovery stack, along with the underlying DNSSEC proof-bundle verifier/config plumbing and shared SMTP gateway request types needed to feed deterministic verification in-canister.
Changes:
- Introduces a hand-rolled DKIM verifier (
src/internet_identity/src/dkim/) including parsing, canonicalization, DNS TXT parsing, signature verification, and end-to-end fixture-based tests. - Adds DNSSEC proof-bundle verifier module (
src/internet_identity/src/dnssec/), wiresDnssecConfigthrough init/persistent state, and updates the public Candid interface accordingly. - Adds/updates test vectors and workspace dependencies to support real-world DNSSEC algorithm coverage and DKIM fixtures.
Reviewed changes
Copilot reviewed 16 out of 17 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| Cargo.toml | Updates workspace deps (rsa bump) and adds DNSSEC crypto deps (p256, ed25519-dalek). |
| Cargo.lock | Locks new/updated transitive dependencies for the added crypto crates. |
| src/internet_identity/Cargo.toml | Enables canister crate to use p256/ed25519-dalek for DNSSEC/DKIM verification. |
| src/internet_identity/internet_identity.did | Extends InternetIdentityInit with dnssec_config : opt opt DnssecConfig and adds DNSSEC types. |
| src/internet_identity/src/main.rs | Registers dkim/dnssec modules and plumbs dnssec_config through config()/install-arg application. |
| src/internet_identity/src/state.rs | Persists dnssec_config in PersistentState with upgrade-survival semantics. |
| src/internet_identity/src/storage/storable/storable_persistent_state.rs | Adds dnssec_config field to stable-persisted state conversions/defaults. |
| src/internet_identity/src/dnssec/mod.rs | New DNSSEC verifier module entrypoint + exports. |
| src/internet_identity/src/dnssec/types.rs | New DNSSEC proof-bundle types and verifier error model. |
| src/internet_identity/src/dnssec/wire.rs | Shared DNSKEY/DS wire-format constants. |
| src/internet_identity/src/dnssec/canonical.rs | DNSSEC canonicalization + signed-data / DS digest input assembly. |
| src/internet_identity/src/dnssec/signature.rs | DNSSEC signature verification (RSA/ECDSA/Ed25519) + DS digest matching. |
| src/internet_identity/src/dnssec/verify.rs | DNSSEC bundle verification orchestration (root anchor, chain walk, hops, freshness). |
| src/internet_identity/src/dnssec/test_vectors.rs | Loader for captured DNSSEC JSON vectors + anchor parsing. |
| src/internet_identity/src/dkim/mod.rs | New DKIM verifier module entrypoint + exports. |
| src/internet_identity/src/dkim/types.rs | DKIM result/diagnostic types (EmailVerificationStatus, checks, failure reasons). |
| src/internet_identity/src/dkim/parse.rs | DKIM-Signature tag-list parser (RFC 6376 §3.5). |
| src/internet_identity/src/dkim/canonicalize.rs | Relaxed header/body canonicalization helpers (RFC 6376 §3.4). |
| src/internet_identity/src/dkim/dns_record.rs | DKIM TXT record parser (RFC 6376 §3.6.2). |
| src/internet_identity/src/dkim/signature.rs | DKIM RSA-SHA256 + Ed25519-SHA256 verification + body hash calculation. |
| src/internet_identity/src/dkim/verify.rs | DKIM verification orchestration (multi-signature loop, enforcement, b= blanking). |
| src/internet_identity/src/dkim/test_vectors.rs | .eml loader + end-to-end DKIM verification tests against fixtures. |
| src/internet_identity_interface/src/internet_identity/types.rs | Exposes new dnssec and smtp type modules via re-exports and init struct field. |
| src/internet_identity_interface/src/internet_identity/types/dnssec.rs | New Candid types for DNSSEC config + root anchors. |
| src/internet_identity_interface/src/internet_identity/types/smtp.rs | New SMTP gateway Candid types + bounds-checking helpers. |
| test_vectors/dnssec/cloudflare-com-2026-05.json | Captured DNSSEC chain fixture for TXT verification (algorithm coverage). |
| test_vectors/dnssec/ed25519-nl-2026-05.json | Captured DNSSEC chain fixture exercising Ed25519 (alg 15). |
| test_vectors/dnssec/iana-root-anchors-2026-05.json | IANA root trust anchors fixture used by tests. |
| test_vectors/dnssec/proton.me-2026-05.json | Captured DNSSEC chain fixture for proton.me. |
| test_vectors/dnssec/protonmail.com-2026-05.json | Captured DNSSEC chain fixture for protonmail.com. |
| test_vectors/dnssec/tutanota.com-2026-05.json | Captured DNSSEC chain fixture for tutanota.com. |
| test_vectors/dkim/README.md | Documents DKIM fixture provenance and regeneration approach. |
| test_vectors/dkim/synth-rsa-relaxed-relaxed.eml | Synthetic DKIM-signed email fixture (relaxed/relaxed). |
| test_vectors/dkim/synth-rsa-relaxed-simple.eml | Synthetic DKIM-signed email fixture (relaxed/simple). |
| test_vectors/dkim/synth-rsa-simple-simple.eml | Synthetic DKIM-signed email fixture (simple/simple rejection). |
| test_vectors/dkim/synth-rsa-test1._domainkey.test.example.com.txt | Synthetic DKIM TXT record fixture (public key). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Three fixes originally landed bundled with DoH review feedback (commit 5d5f686 on PR #3879). They belong here in PR 2's DKIM verifier since they're spec-compliance fixes to RFC 6376: - canonicalize.rs: relaxed_body of an empty body returns CRLF (RFC 6376 §3.4.3). - verify.rs: simple_body of an empty body returns CRLF (same rule). - parse.rs: enforce that the h= tag includes the From header (RFC 6376 §5.4). Without this, DMARC alignment can be defeated by a post-sign rewrite of the From header. - types.rs touched by the original combined commit is not back-ported — it referenced crate::dmarc::EmailVerificationStatus which doesn't exist at this point in the stack.
aterga
left a comment
There was a problem hiding this comment.
Add a one-line comment in Cargo.toml (or dkim/mod.rs) explaining the WASM-safe configuration, or move resolver logic behind a feature flag if it only runs in tests. If the build fails on CI for wasm32-unknown-unknown, this blocks merge.
There was a problem hiding this comment.
Pre-approving, please check Grok's feedback before merging.
Updated PR #3877 Review (post-#3838 merge)
Overall assessment
With #3838 now merged, this PR is in excellent shape. The diff is now cleanly scoped to the DKIM verifier + minimal SMTP type forward-port (no more DNSSEC commits). The force-push today (May 12) already addressed the b= blanking bug I flagged in earlier feedback — perfect RFC §3.7 compliance. CI is green across the board, tests (including the new synthetic .eml + TXT vectors) all pass, and the implementation remains deliberately minimal, WASM-safe, and security-first.
No major security gaps. The critical path (parse → canonicalize → verify) is tight, happens before any state mutation, and enforces the exact subset of RFC 6376 that the email-recovery design requires. This is ready to merge once the two tiny remaining items below are knocked out.
Blockers (must-fix before merge)
None. The previous hard dependency on #3838 is gone. The only remaining action is a quick rebase onto current main (to drop the now-merged DNSSEC commits from the history), but that’s trivial and non-blocking for review.
Easy-to-change improvements (quick wins, <15 min each)
-
Error surface for callers (types.rs + main.rs)
EmailVerificationStatus/VerificationFailReasoncurrently lumps “no DKIM signature at all” into a generic failure. The frontend would benefit from a distinctNoSignaturePresent(or equivalent) so it can show a clear “This email provider doesn’t use DKIM” message instead of a scary “Verification failed”.
Action: Add one variant toVerificationFailReason(or toDkimErrorif you prefer) and update the match arm inmain.rs::handle_email_recovery. Zero security or correctness impact. -
Parser strictness on unknown tags (parse.rs)
DKIM-Signature parsing already rejects duplicates (correct). Confirm that truly unknown tags are ignored per RFC 6376 §3.5 (they must not cause failure). The TXT record parser indns_record.rsalready does this correctly; just make sure the signature parser matches.
Action: If it currently returnsInvalidHeaderon an unknown tag, change to ignore (with a debug-level log). Update the existing unit test that exercises an unknown tag to assert the ignore behavior. -
One-line security note in verify.rs
Add a comment next to the trusted TXT record lookup:// TXT record is DNSSEC-validated by the caller (see PR #3838)This makes the trust boundary explicit for future readers.
-
Test-vector CI job polish
The newdkim-test-vectorsjob is great. Just rename it todkim-verifier-tests(or similar) and make the scope comment explicit that it also exercises the interface crate types.
Action: One-line change in.github/workflows/...(already 90 % done).
Strategic recommendations
-
Security posture
Still rock-solid. Key wins:- Only
relaxed/*canonicalisation is accepted (avoids the re-emission pitfalls ofsimple). - Strict enforcement of
x=,i=alignment,t=ytesting mode, and minimum 1024-bit RSA. - Multi-signature loop with bottom-up header selection is exactly as specified in RFC 6376 §3.7.
- Hand-rolled parser + constant-time crypto primitives = minimal attack surface.
The synthetic test vectors +
dkimpy-generated fixtures give strong coverage. Once the full email-recovery stack lands, a quickcargo fuzzpass on the parser + canonicaliser (5–10 minutes of setup) would be cheap extra insurance. - Only
-
Maintainability
The newdkim/module is ~1.2 kLOC of security-critical code and is nicely isolated. Long-term, consider extracting it to a reusableic-dkimcrate (or publishing a WASM-safe fork of a maintained library) so other Internet Computer canisters can benefit without copying the code. Not urgent for this PR. -
Stack hygiene
The email-recovery series is landing cleanly. Keeping each PR this focused (DKIM-only, minimal interface changes) is exactly the right pattern. Once this merges, the next ones in the stack should be even faster to review.
Summary verdict: Approve after the rebase + the four easy changes above (all <30 min total). This is high-quality, defense-in-depth work — exactly what we want for email recovery. Happy to re-review the final diff or the next PR in the stack. Great job!
|
Took a pass at Grok's four "easy-to-change improvements" — opened #3885 (draft, targets Summary of what actually changed vs. what wasn't actionable:
The All 79 DKIM tests pass, clippy clean, wasm32 builds clean. Generated by Claude Code |
Addresses the four "easy-to-change improvements" from Grok's review on #3877 (#3877 (review)). Targets `feat/dkim-verifier` so it can be merged in as a follow-up commit to that PR. ## What changed 1. **`types.rs` — `NoSignature` doc-comment.** The distinct variant Grok asked for (`NoSignaturePresent` or equivalent) already exists as `VerificationFailReason::NoSignature` at types.rs:84 and is returned from `verify()` when no DKIM-Signature headers are present (verify.rs:54, verify.rs:67). The frontend can already distinguish it from a generic failure. Expanded the doc-comment to make the UX intent ("this provider doesn't use DKIM") explicit so future readers (and AI reviewers) don't re-flag this. 2. **`parse.rs` — explicit unknown-tag-ignored test.** RFC 6376 §3.5 requires implementations to ignore unrecognised tags. The parser already does this (it's lookup-by-name in `split_tag_list` → `get(name)`; unknown tags pass straight through — the existing `happy_value()` fixture even contains an unknown `q=dns/txt`). Pinned the behaviour with a dedicated `unknown_tags_are_ignored` test covering `z=` and a synthetic `zz=`. 3. **`verify.rs` — SECURITY comment at the trust boundary.** Added an inline `// SECURITY:` block at the `parse_dkim_txt` call site stating that `dkim_txt` is trusted, sourced from the DNSSEC verifier (#3838) or pinned-host DoH outcall (#3879). ## Not actioned - **CI job rename (`dkim-test-vectors` → `dkim-verifier-tests`).** No such job exists in `.github/workflows/`; PR #3877 doesn't add one. Nothing to rename. - **`main.rs::handle_email_recovery` match arm.** That endpoint hasn't been added yet (it's PRs 5–7 of the stack). No-op for this PR. ## Tests - `cargo test -p internet_identity --bin internet_identity dkim::` — 79 pass (was 78, +1 for the new unknown-tag test). - `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings` — clean. - `cargo check -p internet_identity --target wasm32-unknown-unknown` — clean. - `cargo fmt --check` on the three modified files — clean (other pre-existing fmt drifts in `dnssec/`, `openid/`, `tests/integration/` are unrelated and predate this branch). --- _Generated by [Claude Code](https://claude.ai/code/session_01NZmvbHgzN5NQc7Hqx9ZzpP)_ Co-authored-by: Claude <noreply@anthropic.com>
Addresses the two actionable "easy-to-change improvements" from Grok's review on #3878 (#3878 (comment)). Targets `feat/dmarc-alignment` so it can land as a follow-up commit on that PR. (Tried to push directly to `feat/dmarc-alignment` per the maintainer's note from #3877 / #3885, but my account got a 403 against that branch — falling back to a stacked draft PR.) ## What changed 1. **`dmarc/mod.rs` — consolidated `# Security model` docblock.** Calls out the three deliberate deviations from "stock" DMARC, each with a one-paragraph rationale: - **No Public Suffix List** — label-anchored suffix check prevents `evilexample.com`-style spoofs at the cost of closing multi-domain orgs (`gmail.com` ↔ `googlemail.com`). Safe direction for a recovery surface. Design doc §6.4. - **No SPF** — recovery proves mailbox control, not path-of-delivery. DKIM gives the cryptographic binding; SPF would need a source IP the gateway payload doesn't carry. Design doc §6.5. - **Fail-closed everywhere** — every malformed/unknown step collapses to `Unverified`. No quarantine, no downgrade. 2. **`dmarc/test_vectors.rs` — `verifies_end_to_end_when_dmarc_record_has_unknown_tags`.** Runs the full `verify_email` against a DMARC record carrying `rua=`, `ruf=`, `fo=`, and a synthetic `vendorext=` tag. Pins the ignore-unknown-tags behaviour through the public entry point, not just the parser unit (which already has a `unknown_tags_are_ignored` test in `dmarc/parse.rs`). ## Not actioned - **Grok item #1 ("parser must ignore unknown tags").** Already implemented and unit-tested before this PR. `dmarc/parse.rs` is lookup-by-name, so unknown tags pass through silently. The PR description even calls this out: "ignores unknown / reporting tags". The new end-to-end test is the explicit confirmation Grok asked for under item #3. The four strategic recommendations (observability metrics, frontend error surface, SPF wiring, README) are explicitly "nice-to-have" / "future" and belong in later PRs of the stack. ## Tests - `cargo test -p internet_identity --bin internet_identity dmarc::` — 49 pass (was 48, +1 e2e unknown-tag test). - `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings` — clean. - `cargo check -p internet_identity --target wasm32-unknown-unknown` — clean. - `cargo fmt --check` on the two touched files — clean (other pre-existing fmt drifts in `dnssec/`, `openid/`, `tests/integration/`, `types/attributes.rs` are unrelated and predate this branch). --- _Generated by [Claude Code](https://claude.ai/code/session_01NZmvbHgzN5NQc7Hqx9ZzpP)_ Co-authored-by: Claude <noreply@anthropic.com>
Addresses the two actionable "easy-to-change improvements" from Grok's review on #3878 (#3878 (comment)). Targets `feat/dmarc-alignment` so it can land as a follow-up commit on that PR. (Tried to push directly to `feat/dmarc-alignment` per the maintainer's note from #3877 / #3885, but my account got a 403 against that branch — falling back to a stacked draft PR.) ## What changed 1. **`dmarc/mod.rs` — consolidated `# Security model` docblock.** Calls out the three deliberate deviations from "stock" DMARC, each with a one-paragraph rationale: - **No Public Suffix List** — label-anchored suffix check prevents `evilexample.com`-style spoofs at the cost of closing multi-domain orgs (`gmail.com` ↔ `googlemail.com`). Safe direction for a recovery surface. Design doc §6.4. - **No SPF** — recovery proves mailbox control, not path-of-delivery. DKIM gives the cryptographic binding; SPF would need a source IP the gateway payload doesn't carry. Design doc §6.5. - **Fail-closed everywhere** — every malformed/unknown step collapses to `Unverified`. No quarantine, no downgrade. 2. **`dmarc/test_vectors.rs` — `verifies_end_to_end_when_dmarc_record_has_unknown_tags`.** Runs the full `verify_email` against a DMARC record carrying `rua=`, `ruf=`, `fo=`, and a synthetic `vendorext=` tag. Pins the ignore-unknown-tags behaviour through the public entry point, not just the parser unit (which already has a `unknown_tags_are_ignored` test in `dmarc/parse.rs`). ## Not actioned - **Grok item #1 ("parser must ignore unknown tags").** Already implemented and unit-tested before this PR. `dmarc/parse.rs` is lookup-by-name, so unknown tags pass through silently. The PR description even calls this out: "ignores unknown / reporting tags". The new end-to-end test is the explicit confirmation Grok asked for under item #3. The four strategic recommendations (observability metrics, frontend error surface, SPF wiring, README) are explicitly "nice-to-have" / "future" and belong in later PRs of the stack. ## Tests - `cargo test -p internet_identity --bin internet_identity dmarc::` — 49 pass (was 48, +1 e2e unknown-tag test). - `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings` — clean. - `cargo check -p internet_identity --target wasm32-unknown-unknown` — clean. - `cargo fmt --check` on the two touched files — clean (other pre-existing fmt drifts in `dnssec/`, `openid/`, `tests/integration/`, `types/attributes.rs` are unrelated and predate this branch). --- _Generated by [Claude Code](https://claude.ai/code/session_01NZmvbHgzN5NQc7Hqx9ZzpP)_ Co-authored-by: Claude <noreply@anthropic.com>
…of email-recovery stack) (#3878) ## Summary PR 3 of the email-recovery stack (`docs/ongoing/email-recovery.md` §6). Stacks on top of #3877 (DKIM verifier). Lands a hand-rolled DMARC alignment check and reshapes the verifier API: `dkim::verify_dkim` becomes a DKIM-only primitive, and the new `dmarc::verify_email` is the public top-level entry point that produces the combined `EmailVerificationStatus`. **Note:** This PR targets `main` but includes PRs 1+2's commits as its base. Review the DMARC-specific changes by looking at commits on top of `ec371aae3` (PR 2's tip). Once PRs 1+2 merge, this PR's diff shrinks to just the DMARC additions. ## What's in this PR ### `src/internet_identity/src/dmarc/` - **`types.rs`** — `DmarcOutcome` (Aligned / Misaligned / NoRecord / Malformed), `DmarcPolicy` (None / Quarantine / Reject), `AlignmentMode` (Strict / Relaxed), `DmarcRecord`, plus the combined `EmailVerificationStatus` that carries both DKIM diagnostics and the DMARC outcome on success. - **`parse.rs`** (RFC 7489 §6.3) — DMARC TXT record parser. Enforces `v=DMARC1` must be first, `p=` must be one of {none, quarantine, reject}, `pct=` 0..=100, rejects duplicate tags, ignores unknown / reporting tags. 12 unit tests. - **`from_header.rs`** (RFC 5322 / RFC 7489 §3.1.1) — single-mailbox From-header parser. Accepts bare addr-spec, name-addr, and quoted-display-name forms; rejects zero/multiple From: headers, address-lists, group syntax. Tolerates comma/colon inside quoted display names. 16 unit tests. - **`alignment.rs`** — strict (exact match) + relaxed (exact match OR label-aligned subdomain in either direction). Stricter than RFC-compliant relaxed alignment because we deliberately don't consult the PSL — design doc §6.4 documents the trust + asymmetric-failure-mode reasoning. The dot anchor on the subdomain check prevents `evilexample.com` from aliasing `example.com`. 8 unit tests. - **`verify.rs`** — orchestration. DKIM first; on failure, surface the DKIM reason verbatim. On DKIM pass, parse From and check DMARC alignment. Accepted iff Aligned, OR NoRecord with `dkim_domain == from_domain`. 8 unit tests. - **`test_vectors.rs`** — 5 end-to-end tests reusing PR 2's synthetic .eml fixtures. ### `src/internet_identity/src/dkim/types.rs` (rename + new variants) - Renamed `EmailVerificationStatus` → `DkimVerifyResult` (DKIM-only). The combined verdict moved to `dmarc::EmailVerificationStatus` so it can carry the `DmarcOutcome`. - Added `MalformedFromHeader(String)`, `DmarcMalformed(String)`, `DmarcMisaligned` to `VerificationFailReason`. ### `src/internet_identity/src/dkim/mod.rs` - Re-exports `verify` as `verify_dkim` so downstream callers (the dmarc layer) don't have to deal with both a `dkim::verify` and `dmarc::verify` in scope at the same time. ## Test plan - [x] `cargo check -p internet_identity --target wasm32-unknown-unknown` — clean. - [x] `cargo test -p internet_identity --bin internet_identity dmarc` — 49 tests pass (12 parse + 16 from_header + 8 alignment + 8 verify + 5 e2e). - [x] `cargo test -p internet_identity --bin internet_identity` — 365 tests pass total (was 313 with PR 2; +49 dmarc + 3 small reshape adjustments). - [x] `cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings` — clean. - [x] `cargo fmt --check` — clean (modulo pre-existing unrelated diffs). ## PR Stack | # | PR | Description | Status | |---|---|---|---| | 0 | [#3836](#3836) | Design doc | Open | | 1 | [#3838](#3838) | DNSSEC verifier scaffold | Open | | 2 | [#3877](#3877) | DKIM verifier (RFC 6376) | Open | | 3 | [#3878](#3878) | DMARC alignment (RFC 7489) | Open | | 4 | [#3879](#3879) | DoH fallback | Open | | 5+6 | [#3880](#3880) | Setup flow (storage + smtp_request) | Open | | 7 | [#3881](#3881) | Recovery flow (delegation) | Open | | 8 | [#3882](#3882) | Frontend + feature flag | Open | | 9 | [#3883](#3883) | Deploy/upgrade scripts: dnssec_config + doh_config | Open | | 10 | [#3884](#3884) | Email-recovery UX overhaul | Open | --------- Co-authored-by: Arshavir Ter-Gabrielyan <arshavir.ter.gabrielyan@dfinity.org> Co-authored-by: Claude <noreply@anthropic.com>
Summary
PR 2 of the email-recovery stack (
docs/ongoing/email-recovery.md§10 Phase 0). Stacks on top of PR 3838 (DNSSEC verifier). Lands a hand-rolled RFC 6376 DKIM verifier that consumes a parsedSmtpRequestplus an already-trusted DKIM TXT record and returns a per-stepEmailVerificationStatus.Note: This PR targets
mainbut includes PR 3838's commits (DNSSEC verifier) as its base. Review the DKIM-specific changes by looking at commits after9bbd8717(the last PR 3838 commit). Once PR 3838 merges, this PR's diff will shrink to just the DKIM additions.Why hand-rolled
The design originally specified
mail-auth(Stalwart's well-tested DKIM library), but mail-auth pulls a non-optionalhickory-resolverdep that fails to compile forwasm32-unknown-unknown(transitive: tokio + mio). Forking + patching mail-auth would be possible but creates perpetual rebase burden. We hand-roll instead — "the right way, no shortcuts" was the explicit guidance.What's in this PR
src/internet_identity_interface/src/internet_identity/types/smtp.rsBrings forward the SMTP gateway protocol types from PoC PR 3760:
SmtpRequest/SmtpResponse/SmtpHeader/SmtpMessage/SmtpAddress/SmtpEnvelope, the size bounds, and the input-bound validation (format_addresslowercases both halves;truncate_at_char_boundaryclamps to the previous UTF-8 boundary so a multi-byte subject can't trap the canister). Drops postbox-specific bits (PostboxEmail, ValidatedSmtpRequest, anchor-number parser).src/internet_identity/src/dkim/types.rs— Algorithm (RsaSha256, Ed25519Sha256), HeaderCanon/BodyCanon (Relaxed, Simple), DkimCheck/DkimCheckName/DkimCheckStatus per-step diagnostics, EmailVerificationStatus / VerificationFailReason result shape.parse.rs(RFC 6376 §3.5) — DKIM-Signature header tag-list parser. Splits structurally on;first then on the first=per element, so a literalb=substring inside another tag's base64 doesn't get misread as a new tag start (the bug class the PoC PR review specifically flagged). Folded whitespace inside base64 values is stripped before decoding. Tag names case-insensitive; duplicates rejected.canonicalize.rs(§3.4.2 / §3.4.4) — relaxed header canon (lowercase name, unfold continuations, collapse WSP+ to single SP, strip trailing WSP, strip WSP around colon) and relaxed body canon (per-line WSP cleanup, drop trailing empty lines, ensure non-empty output ends in exactly one CRLF).dns_record.rs(§3.6.2) — DKIM TXT record parser. Tag names case-insensitive (P=vsp=was a PoC bug), whitespace insidep=tolerated (multi-chunk DNS TXT records),t=y/t=sflags honoured, unknown tags ignored.signature.rs— RSA-SHA256 (RFC 5702 / RFC 8301) and Ed25519-SHA256 (RFC 8463) signature verification on top ofrsa+sha2+ed25519-dalekfrom PR 1's deps. Enforces 1024-bit RSA minimum per design §5.6. Ed25519 path wraps in SHA-256 per RFC 8463. Plusbody_hash_sha256with optionall=truncation per §3.4.5.verify.rs— orchestration. Multi-signature loop per §5.5 (accept on first pass), tag enforcement per design §5.4 (c=relaxed/* only, x= expiration, i= alignment with d=, k= match, t=y testing-mode), bottom-up header selection per §5.4 when h= lists a name multiple times, b=value blanking that's structural-position-aware so it doesn't mis-target an internal substring.test_vectors.rs—#[cfg(test)].eml loader + 8 end-to-end tests against committed fixtures.test_vectors/dkim/relaxed/relaxed,relaxed/simple,simple/simple).Test plan
cargo check -p internet_identity --target wasm32-unknown-unknown— clean.cargo test -p internet_identity --bin internet_identity dkim— 75 tests pass (parse 14, canonicalize 18, dns_record 16, signature 7, verify 12, end-to-end 8).cargo test -p internet_identity --bin internet_identity— 313 tests pass total (was 238 before this PR; +75 DKIM, plus a few in smtp types).cargo test -p internet_identity_interface --lib— 52 tests pass (was 42; +10 SMTP type tests).cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings— clean.cargo fmt --check— clean (modulo pre-existing diffs unrelated to this PR).Stack
This is PR 2 of a 12-PR series. Includes PR 3838's commits as its base; once PR 3838 merges, the diff shrinks to just the DKIM additions.
Subsequent PRs:
PR Stack