feat(dmarc): RFC 7489 alignment + combined DKIM+DMARC verifier (PR 3 of email-recovery stack)#3840

Closed
sea-snake wants to merge 2 commits into dfinity:feat/dkim-verifier from sea-snake:feat/dmarc-alignment

Conversation

Contributor

@sea-snake sea-snake commented May 5, 2026

Summary

PR 3 of the email-recovery stack (docs/ongoing/email-recovery.md §6). Stacks on top of #3839 (DKIM verifier). Lands a hand-rolled DMARC alignment check and reshapes the verifier API: dkim::verify_dkim becomes a DKIM-only primitive, and the new dmarc::verify_email is the public top-level entry point that produces the combined EmailVerificationStatus.

Note: This PR targets main but includes PRs 1+2's commits as its base. Review the DMARC-specific changes by looking at commits on top of ec371aae3 (PR 2's tip). Once PRs 1+2 merge, this PR's diff shrinks to just the DMARC additions.

What's in this PR

src/internet_identity/src/dmarc/

  • types.rs: DmarcOutcome (Aligned / Misaligned / NoRecord / Malformed), DmarcPolicy (None / Quarantine / Reject), AlignmentMode (Strict / Relaxed), DmarcRecord, plus the combined EmailVerificationStatus that carries both DKIM diagnostics and the DMARC outcome on success.
  • parse.rs (RFC 7489 §6.3) — DMARC TXT record parser. Enforces v=DMARC1 must be first, p= must be one of {none, quarantine, reject}, pct= 0..=100, rejects duplicate tags, ignores unknown / reporting tags. 12 unit tests.
  • from_header.rs (RFC 5322 / RFC 7489 §3.1.1) — single-mailbox From-header parser. Accepts bare addr-spec, name-addr, and quoted-display-name forms; rejects zero/multiple From: headers, address-lists, group syntax. Tolerates comma/colon inside quoted display names. 16 unit tests.
  • alignment.rs — strict (exact match) + relaxed (exact match OR label-aligned subdomain in either direction). Stricter than RFC-compliant relaxed alignment because we deliberately don't consult the PSL — design doc §6.4 documents the trust + asymmetric-failure-mode reasoning. The dot anchor on the subdomain check prevents evilexample.com from aliasing example.com. 8 unit tests.
  • verify.rs — orchestration. DKIM first; on failure, surface the DKIM reason verbatim. On DKIM pass, parse From and check DMARC alignment. Accepted iff Aligned, OR NoRecord with dkim_domain == from_domain. 8 unit tests.
  • test_vectors.rs — 5 end-to-end tests reusing PR 2's synthetic .eml fixtures.
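
For reviewers, a minimal sketch of the parse.rs rules listed above (v=DMARC1 first, p= restricted to three values, pct= in 0..=100, duplicate tags rejected, unknown tags ignored). The function name `parse_dmarc`, the trimmed-down `DmarcRecord` fields, and the `String` error type are assumptions made for illustration; the real parser in this PR may be shaped differently, and the later "p= immediately after v=DMARC1" rule is not modeled here.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DmarcPolicy { None, Quarantine, Reject }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AlignmentMode { Strict, Relaxed }

/// Hypothetical, trimmed-down record; the PR's `DmarcRecord` may carry more fields.
#[derive(Debug, PartialEq, Eq)]
pub struct DmarcRecord {
    pub policy: DmarcPolicy,
    pub adkim: AlignmentMode,
    pub pct: u8,
}

pub fn parse_dmarc(txt: &str) -> Result<DmarcRecord, String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut policy = None;
    let mut adkim = AlignmentMode::Relaxed; // relaxed is the RFC 7489 default
    let mut pct: u8 = 100; // default when pct= is absent

    let pairs = txt.split(';').map(str::trim).filter(|p| !p.is_empty());
    for (index, pair) in pairs.enumerate() {
        let (tag, value) = pair
            .split_once('=')
            .ok_or_else(|| format!("missing '=' in {:?}", pair))?;
        let (tag, value) = (tag.trim(), value.trim());

        if index == 0 && (tag != "v" || value != "DMARC1") {
            return Err("v=DMARC1 must be the first tag".to_string());
        }
        if !seen.insert(tag) {
            return Err(format!("duplicate tag {:?}", tag));
        }
        match tag {
            "v" => {} // already validated as the first tag
            "p" => {
                policy = Some(match value {
                    "none" => DmarcPolicy::None,
                    "quarantine" => DmarcPolicy::Quarantine,
                    "reject" => DmarcPolicy::Reject,
                    other => return Err(format!("invalid p= value {:?}", other)),
                })
            }
            "adkim" => {
                adkim = match value {
                    "s" => AlignmentMode::Strict,
                    "r" => AlignmentMode::Relaxed,
                    other => return Err(format!("invalid adkim= value {:?}", other)),
                }
            }
            "pct" => {
                pct = value
                    .parse::<u8>()
                    .ok()
                    .filter(|n| *n <= 100)
                    .ok_or_else(|| format!("pct= out of range: {:?}", value))?
            }
            _ => {} // unknown and reporting tags (rua=, ruf=, ...) are ignored
        }
    }

    let policy = policy.ok_or_else(|| "missing p= tag".to_string())?;
    Ok(DmarcRecord { policy, adkim, pct })
}
```

Under these assumptions, `parse_dmarc("v=DMARC1; p=reject; adkim=s; pct=50")` yields a strict, reject-policy record, while `"p=none; v=DMARC1"` and `"v=DMARC1; p=none; p=reject"` are both rejected.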

src/internet_identity/src/dkim/types.rs (rename + new variants)

  • Renamed EmailVerificationStatus → DkimVerifyResult (DKIM-only). The combined verdict moved to dmarc::EmailVerificationStatus so it can carry the DmarcOutcome.
  • Added MalformedFromHeader(String), DmarcMalformed(String), DmarcMisaligned to VerificationFailReason.

src/internet_identity/src/dkim/mod.rs

  • Re-exports verify as verify_dkim so downstream callers (the dmarc layer) don't have to deal with both a dkim::verify and dmarc::verify in scope at the same time.

Test plan

  • cargo check -p internet_identity --target wasm32-unknown-unknown — clean.
  • cargo test -p internet_identity --bin internet_identity dmarc — 49 tests pass (12 parse + 16 from_header + 8 alignment + 8 verify + 5 e2e).
  • cargo test -p internet_identity --bin internet_identity — 365 tests pass total (was 313 with PR 2; +49 dmarc + 3 small reshape adjustments).
  • cargo clippy -p internet_identity --bin internet_identity --tests -- -D warnings — clean.
  • cargo fmt --check — clean (modulo pre-existing unrelated diffs).

PR Stack

| # | PR | Description | Status |
|---|---|---|---|
| 0 | #3836 | Design doc | Open |
| 1 | #3838 | DNSSEC verifier scaffold | Open |
| 2 | #3839 | DKIM verifier (RFC 6376) | Open |
| 3 | #3840 | DMARC alignment (RFC 7489) | Open |
| 4 | #3841 | DoH fallback | Open |
| 5+6 | #3842 | Setup flow (storage + smtp_request) | Open |
| 7 | #3843 | Recovery flow (delegation) | Open |
| 8 | #3844 | Frontend + feature flag | Open |
| 9 | #3855 | Deploy/upgrade scripts: dnssec_config + doh_config | Open |
| 10 | #3857 | Email-recovery UX overhaul | Open |

Copilot AI review requested due to automatic review settings May 5, 2026 12:35
@sea-snake sea-snake requested a review from a team as a code owner May 5, 2026 12:35
Contributor

Copilot AI left a comment

Pull request overview

This stacked PR advances the email-recovery pipeline by adding the low-level DNSSEC, DKIM, and DMARC verification pieces that later recovery endpoints will rely on. In this diff, the public verification surface shifts toward a combined DKIM+DMARC verdict while also introducing the DNSSEC trust-anchor/config plumbing and supporting test fixtures.

Changes:

  • Adds a hand-rolled DNSSEC verifier, DKIM verifier, and DMARC parser/alignment/verifier flow for email recovery.
  • Extends canister/interface config with DNSSEC trust anchors and adds SMTP gateway wire types.
  • Adds captured DNSSEC fixtures, synthetic DKIM fixtures, end-to-end/unit tests, and a DNSSEC capture helper script.

Reviewed changes

Copilot reviewed 38 out of 39 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
test_vectors/dnssec/iana-root-anchors-2026-05.json Root trust-anchor fixture metadata/config sample.
test_vectors/dnssec/cloudflare-com-2026-05.json Captured DNSSEC proof bundle fixture for verifier tests.
test_vectors/dkim/synth-rsa-test1._domainkey.test.example.com.txt Synthetic DKIM TXT record fixture.
test_vectors/dkim/synth-rsa-simple-simple.eml Synthetic DKIM message fixture (simple/simple).
test_vectors/dkim/synth-rsa-relaxed-simple.eml Synthetic DKIM message fixture (relaxed/simple).
test_vectors/dkim/synth-rsa-relaxed-relaxed.eml Synthetic DKIM message fixture (relaxed/relaxed).
test_vectors/dkim/README.md Fixture provenance and regeneration notes.
src/internet_identity/src/storage/storable/storable_persistent_state.rs Persists DNSSEC config through stable storage.
src/internet_identity/src/state.rs Adds DNSSEC config to persistent canister state.
src/internet_identity/src/main.rs Wires DNSSEC config into config()/install-arg handling and registers new modules.
src/internet_identity/src/dnssec/verify.rs DNSSEC verification entry point and chain-walk logic.
src/internet_identity/src/dnssec/types.rs DNSSEC proof/result/error data types.
src/internet_identity/src/dnssec/test_vectors.rs DNSSEC fixture loader for tests.
src/internet_identity/src/dnssec/signature.rs DNSSEC signature and DS-digest verification routines.
src/internet_identity/src/dnssec/mod.rs DNSSEC module exports.
src/internet_identity/src/dnssec/canonical.rs DNSSEC canonical wire-format helpers.
src/internet_identity/src/dmarc/verify.rs Combined DKIM+DMARC verification orchestration.
src/internet_identity/src/dmarc/types.rs DMARC result/policy/alignment types.
src/internet_identity/src/dmarc/test_vectors.rs End-to-end DMARC tests using DKIM fixtures.
src/internet_identity/src/dmarc/parse.rs DMARC TXT parser.
src/internet_identity/src/dmarc/mod.rs DMARC module exports.
src/internet_identity/src/dmarc/from_header.rs From: header parser for DMARC alignment.
src/internet_identity/src/dmarc/alignment.rs DMARC strict/relaxed alignment checks.
src/internet_identity/src/dkim/verify.rs DKIM verification orchestration and header/body hashing.
src/internet_identity/src/dkim/types.rs DKIM result/check/failure types.
src/internet_identity/src/dkim/test_vectors.rs DKIM fixture loader and end-to-end tests.
src/internet_identity/src/dkim/signature.rs DKIM RSA/Ed25519 signature verification and body hashing.
src/internet_identity/src/dkim/parse.rs DKIM-Signature parser.
src/internet_identity/src/dkim/mod.rs DKIM module exports.
src/internet_identity/src/dkim/dns_record.rs DKIM TXT record parser.
src/internet_identity/src/dkim/canonicalize.rs DKIM relaxed canonicalization helpers.
src/internet_identity/internet_identity.did Exposes DNSSEC config types in Candid.
src/internet_identity/Cargo.toml Adds verifier crypto dependencies to canister crate.
src/internet_identity_interface/src/internet_identity/types/smtp.rs Adds SMTP gateway wire/validation types.
src/internet_identity_interface/src/internet_identity/types/dnssec.rs Adds interface-level DNSSEC config types.
src/internet_identity_interface/src/internet_identity/types.rs Re-exports SMTP/DNSSEC types and extends init args.
scripts/capture-dnssec-chain.py Helper script to capture DNSSEC proof bundles.
Cargo.toml Adds workspace crypto dependencies for DNSSEC/DKIM.
Cargo.lock Locks newly added/transitive dependencies.

Comment thread src/internet_identity/src/dkim/canonicalize.rs
Comment thread src/internet_identity/src/dkim/verify.rs
Comment thread src/internet_identity/src/dkim/parse.rs
Comment thread src/internet_identity/src/dnssec/verify.rs Outdated
Comment thread src/internet_identity/src/dmarc/parse.rs
Comment thread src/internet_identity/src/dmarc/from_header.rs
Comment on lines +32 to +33
is_subdomain_of(dkim_domain, from_domain) || is_subdomain_of(from_domain, dkim_domain)
}
Contributor Author

This one is intentional and called out in the design doc (§6.4). We deliberately don't ship a Public Suffix List in the canister:

  • Size: the PSL is ~250 KB compressed, ~1 MB uncompressed, and grows monotonically.
  • Freshness: it changes weekly, so we'd need a continuous upgrade-arg refresh path or a separate fetch+verify mechanism.
  • Trust root: the PSL is community-maintained without a strong cryptographic provenance story; pulling it through the deploy arg means trusting whoever curated this deploy's blob.

The mitigation we do enforce is the label-anchored dot check (so evilexample.com cannot align with example.com), and the co.uk-style asymmetric case is fail-closed for the From-domain owner (DMARC misaligned), not fail-open. The threat model is "spoofer claims to be user@co.uk" — co.uk doesn't publish DMARC, so receiver semantics fall back to "no DMARC, accept iff DKIM d= matches From". I think that's the safer default for a verifier targeting major mailbox providers (Gmail, Outlook, etc.) which all publish proper DMARC.

Open to revisiting if we'd rather ship a PSL — happy to discuss in a separate thread. Leaving as-is for now.
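
To make the dot-anchor argument concrete, here is a minimal sketch of a PSL-free alignment check in the spirit of the two lines under review. `is_subdomain_of` mirrors the name in the snippet above; `is_aligned`, `normalize`, and the exact signatures are assumptions for illustration and not necessarily what alignment.rs contains.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AlignmentMode { Strict, Relaxed }

/// Lowercase and drop a trailing root dot so comparisons are byte-wise.
fn normalize(domain: &str) -> String {
    domain.trim_end_matches('.').to_ascii_lowercase()
}

/// True when `child` is `<labels>.<parent>`. The prefix left after stripping
/// `parent` must end in '.', which is the label anchor: "evilexample.com"
/// leaves the prefix "evil" and therefore does not match "example.com".
fn is_subdomain_of(child: &str, parent: &str) -> bool {
    !parent.is_empty()
        && child
            .strip_suffix(parent)
            .map_or(false, |prefix| prefix.len() > 1 && prefix.ends_with('.'))
}

/// Strict: exact match only. Relaxed: exact match, or one domain is a
/// label-aligned subdomain of the other. No Public Suffix List is consulted.
pub fn is_aligned(dkim_domain: &str, from_domain: &str, mode: AlignmentMode) -> bool {
    let d = normalize(dkim_domain);
    let f = normalize(from_domain);
    if d.is_empty() || f.is_empty() {
        return false;
    }
    if d == f {
        return true;
    }
    match mode {
        AlignmentMode::Strict => false,
        AlignmentMode::Relaxed => is_subdomain_of(&d, &f) || is_subdomain_of(&f, &d),
    }
}
```

In this sketch `mail.example.com` aligns with `example.com` in relaxed mode, `evilexample.com` does not, and sibling subdomains such as `a.example.com` and `b.example.com` do not align either, which is where it is stricter than PSL-based relaxed alignment. The sketch has no notion of public suffixes, so `victim.co.uk` and `co.uk` count as parent and child like any other pair.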

Comment thread src/internet_identity/src/dkim/types.rs
Comment thread test_vectors/dnssec/iana-root-anchors-2026-05.json Outdated
Comment thread src/internet_identity/src/dmarc/verify.rs
sea-snake added 2 commits May 11, 2026 16:00
PR 3 of the email-recovery stack (docs/ongoing/email-recovery.md §6).
Stacks on top of PR 2 (DKIM verifier) and reshapes the verifier API:
dkim::verify_dkim now returns a DKIM-only DkimVerifyResult, and the new
dmarc::verify_email is the public top-level entry point that produces
the combined EmailVerificationStatus.

New module src/internet_identity/src/dmarc/:
- types.rs: DmarcOutcome (Aligned / Misaligned / NoRecord / Malformed),
  DmarcPolicy, AlignmentMode, DmarcRecord, plus the combined
  EmailVerificationStatus that carries both the DKIM diagnostic and
  the DMARC outcome on success.
- parse.rs: DMARC TXT record parser per RFC 7489 §6.3. Enforces
  v=DMARC1 must be first, p= must be one of {none, quarantine, reject},
  pct= 0..=100, rejects duplicate tags, ignores unknown tags. 12 unit
  tests.
- from_header.rs: RFC 5322 single-mailbox From: parser. Accepts
  bare addr-spec, name-addr, and quoted-display-name forms; rejects
  zero/multiple From: headers, address-lists, group syntax. Tolerates
  comma/colon inside quoted display names. 16 unit tests.
- alignment.rs: strict (exact match) + relaxed (exact match OR
  label-aligned subdomain in either direction). Stricter than RFC-
  compliant relaxed alignment because we deliberately don't consult
  the PSL — see design doc §6.4. The dot anchor on the subdomain check
  prevents 'evilexample.com' from aliasing 'example.com'. 8 unit tests.
- verify.rs: orchestration. DKIM first; on failure, surface the DKIM
  reason verbatim. On DKIM pass, parse From and check DMARC alignment.
  Accepted iff Aligned, OR NoRecord with dkim_domain == from_domain.
  8 unit tests.

dkim/types.rs:
- Renamed EmailVerificationStatus -> DkimVerifyResult (DKIM-only).
  The combined verdict moved to dmarc::EmailVerificationStatus so it
  can carry the DmarcOutcome.
- Added MalformedFromHeader, DmarcMalformed, DmarcMisaligned to
  VerificationFailReason.

44 DMARC tests pass (12 parse + 16 from_header + 8 alignment + 8 verify),
on top of the existing 78 DKIM tests. Wasm32 build clean.
5 end-to-end tests in dmarc/test_vectors.rs reusing the synthetic .eml
fixtures from PR 2 (alice@test.example.com signed with d=test.example.com,
exact match → trivially aligned regardless of mode). Covers:
- no DMARC record + dkim==from → Verified(NoRecord)
- aligned DMARC strict → Verified(Aligned, Strict)
- aligned DMARC relaxed (default) → Verified(Aligned, Relaxed)
- malformed DMARC TXT → Unverified(DmarcMalformed)
- From: with address-list → Unverified
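
The outcomes listed above follow from the acceptance rule "Aligned, OR NoRecord with dkim_domain == from_domain". A minimal sketch of that decision, assuming a hypothetical `decide` function and a simplified `Verdict` type with string reasons (the PR's `EmailVerificationStatus` and `VerificationFailReason` carry richer data):

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DmarcOutcome { Aligned, Misaligned, NoRecord, Malformed }

/// Simplified stand-in for the combined verdict; reasons are plain strings here.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Verified(DmarcOutcome),
    Unverified(&'static str),
}

pub fn decide(
    dkim_passed: bool,
    dkim_domain: &str,
    from_domain: &str,
    outcome: DmarcOutcome,
) -> Verdict {
    // DKIM runs first; a DKIM failure short-circuits the DMARC step.
    if !dkim_passed {
        return Verdict::Unverified("DKIM failed");
    }
    match outcome {
        DmarcOutcome::Aligned => Verdict::Verified(DmarcOutcome::Aligned),
        // Without a DMARC record, accept only an exact d= / From domain match.
        DmarcOutcome::NoRecord if dkim_domain.eq_ignore_ascii_case(from_domain) => {
            Verdict::Verified(DmarcOutcome::NoRecord)
        }
        // The reason reported for this case is an assumption of the sketch.
        DmarcOutcome::NoRecord => Verdict::Unverified("no DMARC record and d= differs from From"),
        DmarcOutcome::Misaligned => Verdict::Unverified("DmarcMisaligned"),
        DmarcOutcome::Malformed => Verdict::Unverified("DmarcMalformed"),
    }
}
```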

49 dmarc tests total (44 unit + 5 e2e). 365 total in the II suite. The
parse_eml helper is duplicated from dkim/test_vectors.rs because the
original is #[cfg(test)] and not pub-visible across modules; the
duplication is contained to test code.
@sea-snake sea-snake force-pushed the feat/dmarc-alignment branch from 6f9cf9b to 42ee57d Compare May 11, 2026 16:01
@sea-snake sea-snake changed the base branch from main to feat/dkim-verifier May 12, 2026 11:49
@sea-snake
Contributor Author

Replaced with a new PR from an upstream branch (enables direct collaboration). Same content, new PR number.

sea-snake added a commit that referenced this pull request May 12, 2026


DKIM (PR 2's code):
- canonicalize.rs: relaxed_body empty body → CRLF per RFC 6376 §3.4.3
- verify.rs: simple_body empty body → CRLF (same rule)
- parse.rs: enforce that h= includes the From header per RFC 6376 §5.4
  (without this, DMARC alignment can be defeated by a post-sign rewrite)
- types.rs: doc comment now points at dmarc::EmailVerificationStatus
  rather than the misleading DkimVerifyResult
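
A minimal sketch of the simple body rule from RFC 6376 §3.4.3, for reference: trailing empty lines are dropped and the result always ends in exactly one CRLF, so an empty body canonicalizes to a bare CRLF. The signature is an assumption, not the actual verify.rs code. Note that RFC 6376 §3.4.4 specifies a different result for the relaxed algorithm (an empty body canonicalizes to empty input), so the sketch covers the simple case only.

```rust
/// Simple body canonicalization (RFC 6376 section 3.4.3), assuming `body`
/// already uses CRLF line endings.
pub fn simple_body(body: &[u8]) -> Vec<u8> {
    let mut out = body.to_vec();
    // Drop every trailing empty line (each is one CRLF)...
    while out.ends_with(b"\r\n") {
        out.truncate(out.len() - 2);
    }
    // ...then terminate with exactly one CRLF. For an empty or missing body
    // this leaves the two octets "\r\n".
    out.extend_from_slice(b"\r\n");
    out
}
```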

DNSSEC (PR 1's code):
- verify.rs: try ALL matching trust-anchor candidates before giving up.
  During KSK rollovers operators configure both the rolling-out and
  rolling-in keys; first-match early-exit could fail under the inactive
  one and never try the active one.
- mod.rs: doc comment no longer claims the email-recovery stack has no
  HTTP outcalls (the DoH fallback module does).
- iana-root-anchors-2026-05.json: corrected stale "_comment" that
  claimed the historical 19036 KSK was included.
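
The trust-anchor change can be pictured as replacing "verify against the first candidate and stop" with "return the first candidate that verifies". A minimal sketch, assuming a hypothetical `TrustAnchor` struct and a caller-supplied `verifies` closure standing in for the real signature check:

```rust
/// Hypothetical anchor shape; the PR's `DnssecRootAnchor` may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct TrustAnchor {
    pub key_tag: u16,
    pub digest: Vec<u8>,
}

/// Try every candidate until one verifies, instead of giving up after the
/// first candidate fails. Returns the anchor that succeeded, if any.
pub fn first_verifying_anchor<'a, F>(
    candidates: &'a [TrustAnchor],
    mut verifies: F,
) -> Option<&'a TrustAnchor>
where
    F: FnMut(&TrustAnchor) -> bool,
{
    candidates.iter().find(|anchor| verifies(*anchor))
}
```

With two configured anchors where only the second matches the active key, the closure is invoked twice and the second anchor is returned; with early exit the result would have been a failure.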

DMARC (PR 3's code):
- parse.rs: enforce p= immediately after v=DMARC1 per RFC 7489 §6.3.
  Real-world records all do this; OpenDMARC and other reference
  parsers reject the alternative.
- from_header.rs: honour backslash escapes inside quoted-string
  display names so `"Alice \"Ops, Inc\"" <a@e.com>` parses correctly.
- verify.rs: removed empty placeholder test that asserted nothing;
  the e2e path is exercised by test_vectors::*.
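
A minimal sketch of how backslash escapes inside a quoted display name can be honoured while still rejecting address-lists and group syntax. The function name `from_domain` and the `String` error type are assumptions; the sketch also does not support quoted local parts or comments, which the real from_header.rs may handle.

```rust
/// Extract the lowercased domain of a single-mailbox From: header value.
pub fn from_domain(value: &str) -> Result<String, String> {
    let mut in_quotes = false;
    let mut escaped = false;
    let mut unquoted = String::new(); // header value with quoted strings removed

    for c in value.chars() {
        if escaped {
            escaped = false; // the escaped character is quoted content; skip it
            continue;
        }
        match c {
            '\\' if in_quotes => escaped = true,
            '"' => in_quotes = !in_quotes,
            _ if in_quotes => {} // commas and colons inside quotes are harmless
            ',' | ';' | ':' => return Err("address-list or group syntax".to_string()),
            _ => unquoted.push(c),
        }
    }
    if in_quotes {
        return Err("unterminated quoted string".to_string());
    }

    // name-addr ("Name <a@b>") or bare addr-spec ("a@b").
    let addr = match (unquoted.find('<'), unquoted.rfind('>')) {
        (Some(open), Some(close)) if open < close => &unquoted[open + 1..close],
        (None, None) => unquoted.as_str(),
        _ => return Err("unbalanced angle brackets".to_string()),
    };
    let addr = addr.trim();
    if addr.contains(|c: char| c == '<' || c == '>' || c.is_whitespace()) {
        return Err("expected exactly one addr-spec".to_string());
    }
    match addr.rsplit_once('@') {
        Some((local, domain)) if !local.is_empty() && !domain.is_empty() => {
            Ok(domain.to_ascii_lowercase())
        }
        _ => Err("missing addr-spec".to_string()),
    }
}
```

With this sketch the example from the bullet above, `"Alice \"Ops, Inc\"" <a@e.com>`, yields `e.com`, because the comma sits inside the quoted string and the escaped quotes do not terminate it.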

DoH (PR 4's code):
- parser.rs: encode_name now rejects invalid names (label > 63,
  total > 255, empty labels) rather than silently truncating —
  truncation would change which name we ask for, which is a real
  correctness issue.
- mod.rs: fetch_txt now enforces a label-anchored suffix match
  between `name` and `registered_domain` (defence-in-depth: a caller
  bug otherwise lets an allowlisted registered_domain authorise an
  outcall for an unrelated FQDN).
- mod.rs: transform_doh now signals parse failure via HTTP status
  422 + empty body instead of a sentinel string in the body. The
  prior sentinel could collide with a (legal-but-unusual) TXT
  payload and silently turn valid records into "malformed".
- types.rs / quorum.rs / interface doc: fixed stale comments that
  said "three providers" or implied parallelism in the sync test
  helper (the production async path uses futures::future::join_all;
  run_quorum is sequential).
- New error variants: DohError::InvalidName, NameOutsideRegisteredDomain.
- New unit tests cover the new validation paths.
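
For illustration, a minimal sketch of name encoding that rejects invalid input instead of truncating it. `encode_name` is the name used above; the `NameError` enum and the exact signature are assumptions and do not reproduce the `DohError` variants in the PR.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum NameError {
    EmptyLabel,
    LabelTooLong,
    NameTooLong,
}

/// Encode a domain name in DNS wire format: each label is prefixed with its
/// length and the name ends with a zero byte. Invalid names are rejected,
/// never shortened, so the query always asks for the name the caller gave.
pub fn encode_name(name: &str) -> Result<Vec<u8>, NameError> {
    // A single trailing root dot is tolerated.
    let name = name.strip_suffix('.').unwrap_or(name);
    let mut out = Vec::new();
    for label in name.split('.') {
        if label.is_empty() {
            return Err(NameError::EmptyLabel);
        }
        if label.len() > 63 {
            return Err(NameError::LabelTooLong);
        }
        out.push(label.len() as u8); // safe: length is at most 63
        out.extend_from_slice(label.as_bytes());
    }
    out.push(0); // root label terminator
    if out.len() > 255 {
        return Err(NameError::NameTooLong);
    }
    Ok(out)
}
```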

Test counts: 408 → 414 (added 6).

Note on the alignment.rs / no-PSL flag: the Copilot comment that
relaxed alignment is "fail-open without a PSL" is correct in
isolation but reflects a documented design decision. We deliberately
don't ship a PSL in the canister (size, freshness, and trust-root
concerns — see design doc §6.4). The mitigation is the dot-anchored
suffix check that prevents `evilexample.com` from matching
`example.com`. Left unchanged; will reply with the design rationale.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
pull Bot pushed a commit to mikeyhodl/internet-identity that referenced this pull request May 12, 2026
…iring (dfinity#3838)

## Summary

First PR in the email-recovery stack (`docs/ongoing/email-recovery.md`
§10 Phase 0). Lands a working RFC-4035-compliant DNSSEC verifier for
caller-supplied DNS proof bundles, plus the trust-anchor wiring that
drives it. PR 2 (DKIM verifier) and PRs 4–9 (storage + recovery methods)
build on this.

## What's in this PR

### Verifier core
- New `dnssec/` module under `src/internet_identity/src/`:
  - `types.rs` — `DnsProofBundle`, `SignedRRset`, `DelegationLink`,
    `Rrsig`, `DnsName`, `DnssecError`, `VerifiedRecord`.
  - `canonical.rs` — owner-name canonicalization, RR canonical form, RRSIG
    signed-data construction (RFC 4034 §3.1.8.1, §6.2, §6.3), DS digest
    input.
  - `signature.rs` — algorithm dispatch + DS digest matching (SHA-256).
  - `verify.rs` — four-step algorithm (root anchor match → chain walk →
    leaf RRSIG → freshness).
- Algorithm coverage (RFC 8624 MUST set):
  - **alg 8** — RSA-SHA256 (RFC 5702): root, com., most legacy zones.
  - **alg 13** — ECDSA-P256-SHA256 (RFC 6605): most TLDs, Cloudflare,
    modern zones.
  - **alg 15** — Ed25519 (RFC 8080): rare in production but mandatory.
  - Anything else returns `UnsupportedAlgorithm`.
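
A minimal sketch of the algorithm dispatch described above, assuming a hypothetical `Algorithm` enum and `algorithm_from_number` helper; only the `UnsupportedAlgorithm` variant name comes from this PR, and the real `DnssecError` has more variants.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Algorithm {
    RsaSha256,       // alg 8, RFC 5702
    EcdsaP256Sha256, // alg 13, RFC 6605
    Ed25519,         // alg 15, RFC 8080
}

#[derive(Debug, PartialEq, Eq)]
pub enum DnssecError {
    UnsupportedAlgorithm(u8),
}

/// Map a DNSSEC algorithm number to a supported verifier, rejecting the rest.
pub fn algorithm_from_number(number: u8) -> Result<Algorithm, DnssecError> {
    match number {
        8 => Ok(Algorithm::RsaSha256),
        13 => Ok(Algorithm::EcdsaP256Sha256),
        15 => Ok(Algorithm::Ed25519),
        other => Err(DnssecError::UnsupportedAlgorithm(other)),
    }
}
```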

### Wiring
- New `DnssecConfig` and `DnssecRootAnchor` types in
`internet_identity_interface`, exposed at the top of
`InternetIdentityInit` as `dnssec_config: opt opt DnssecConfig`
(set/clear semantics matching `analytics_config` and `dummy_auth`).
- Trust-anchor list plumbed through `init`/`post_upgrade` into
`PersistentState.dnssec_config` (and `StorablePersistentState` for
cross-upgrade persistence).
- `internet_identity.did` updated.

### Tests
13 unit tests in `dnssec/` covering:
- Real cloudflare.com chain verifies end-to-end (exercises alg 8 at root
and alg 13 at com → cloudflare.com → leaf).
- Empty trust-anchor list rejected with `NoTrustAnchors`.
- Wrong trust anchor rejected with `RootAnchorMismatch`.
- Flipped byte in root DNSKEY → `RootAnchorMismatch` or `BadSignature`.
- Flipped byte in leaf signature → `BadSignature`.
- Stale signature (clock advanced past expiration) →
`StaleOrFutureSignature`.
- Unsupported algorithm (alg 5 / RSA-SHA1) → `UnsupportedAlgorithm(5)`.
- Plus canonical-encoding + RFC 3110 RSA key parsing unit tests.

### Test infrastructure
- `test_vectors/dnssec/cloudflare-com-2026-05.json` — real DoH-captured
chain (root DNSKEY + 2 delegation links + leaf TXT).
- `test_vectors/dnssec/iana-root-anchors-2026-05.json` — IANA root KSK
trust anchors (Klajeyz/2017 + Kmyv6jo/2024).
- `scripts/capture-dnssec-chain.py` — reproducible capture script using
dnspython + DoH wire format. Tests use a frozen now from the capture's
metadata so freshness checks stay stable indefinitely.
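
Passing `now` in as a parameter is what lets the tests pin the clock to the capture time. A minimal sketch of such a freshness check, assuming a hypothetical `check_freshness` function over plain Unix-second timestamps; RFC 4034 serial-number arithmetic for the 32-bit RRSIG fields is deliberately not modeled.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum FreshnessError {
    StaleOrFutureSignature,
}

/// Accept a signature only while `now` lies inside its validity window.
/// `now` is injected by the caller so tests can use a frozen timestamp.
pub fn check_freshness(inception: u64, expiration: u64, now: u64) -> Result<(), FreshnessError> {
    if (inception..=expiration).contains(&now) {
        Ok(())
    } else {
        Err(FreshnessError::StaleOrFutureSignature)
    }
}
```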

### New deps
- `domain` (NLnet Labs, pure Rust) — referenced in docstrings for
canonicalisation primitives; signature verification is hand-rolled on
top of RustCrypto.
- `p256` — ECDSA P-256 verification.
- `ed25519-dalek` — Ed25519 verification.

All three build cleanly for wasm32-unknown-unknown.

## What's deferred to later PRs in the stack

- Captures for additional providers (proton.me, protonmail.com,
tutanota.com — gmail.com / icloud.com / outlook.com / fastmail.com don't
sign with DNSSEC; this is acknowledged in design doc §7.6).
- Synthetic Ed25519 (alg 15) test vector — most production zones are alg
8 or 13; alg 15 is structurally exercised by the dispatch logic but
doesn't have a real captured chain in this PR.

## Test plan

- [x] `cargo check -p internet_identity --target wasm32-unknown-unknown`
— clean (no warnings).
- [x] `cargo test -p internet_identity --bin internet_identity` — 238
tests pass (was 227 pre-PR).
- [x] `cargo test -p internet_identity_interface --lib` — 42 tests pass.
- [x] `cargo clippy -p internet_identity --bin internet_identity --tests
-- -D warnings` — clean.
- [x] `cargo fmt --check` — clean (modulo a pre-existing diff in
attributes.rs unrelated to this PR).
- [x] CI on dfinity#3838 — fully green.

## Design doc


https://github.com/sea-snake/internet-identity/blob/design/email-recovery/docs/ongoing/email-recovery.md
(PR pending review on dfinity/internet-identity)

## Stack

This is PR 1 of a 12-PR series. Subsequent PRs:
- **PR 2** — mail-auth-backed DKIM verifier, consuming DnsProofBundle
from this PR.
- **PR 3** — DMARC alignment.
- **PRs 4–9** — storage + Candid + behavior for email recovery.
- **PRs 10–12** — frontend (DoH walker, Manage UI, recovery wizard).

## PR Stack
| # | PR | Description | Status |
|---|---|---|---|
| 0 | [dfinity#3836](dfinity#3836) | Design doc | Open |
| 1 | [dfinity#3838](dfinity#3838) | DNSSEC verifier scaffold | Open |
| 2 | [dfinity#3839](dfinity#3839) | DKIM verifier (RFC 6376) | Open |
| 3 | [dfinity#3840](dfinity#3840) | DMARC alignment (RFC 7489) | Open |
| 4 | [dfinity#3841](dfinity#3841) | DoH fallback | Open |
| 5+6 | [dfinity#3842](dfinity#3842) | Setup flow (storage + smtp_request) | Open |
| 7 | [dfinity#3843](dfinity#3843) | Recovery flow (delegation) | Open |
| 8 | [dfinity#3844](dfinity#3844) | Frontend + feature flag | Open |
| 9 | [dfinity#3855](dfinity#3855) | Deploy/upgrade scripts: dnssec_config + doh_config | Open |
| 10 | [dfinity#3857](dfinity#3857) | Email-recovery UX overhaul | Open |