Skip to content

Password sign-in form reachable on multi-binding device with mixed staleness #149

@aspiers

Description

@aspiers

Password sign-in form reachable on multi-binding device with mixed staleness (case 11)

Reproduction

On dev.certified.app deployment (commit 15da62c, deployed 2026-05-01 21:03 UTC). Browser had a previously-established device session containing two bound accounts: a fresh one (e.g. adam@hypercerts.org) and an older one (e.g. adam10, last authenticated >7 days earlier). Demo client https://demo.dev.certified.app/client-metadata.json initiated a fresh OAuth flow with no login_hint. User clicked the existing-session affordance for adam10. Browser landed on:

https://dev.certified.app/oauth/authorize?client_id=https%3A%2F%2Fdemo.dev.certified.app%2Fclient-metadata.json&request_uri=urn%3Aietf%3Aparams%3Aoauth%3Arequest_uri%3Areq-72ef99dac9bee0be3559be3f6f5840f1&epds_handle_mode=random

…which rendered upstream's stock sign-in-view (handle + password form). ePDS accounts are passwordless — that form is unsubmittable and is precisely the UI we have invested heavily in suppressing across PRs #128, #129, #134, and #141.

Confirmed via Railway logs (@certified-app/auth-service deployment 1b2973f8-b51d-4b44-b933-c4a119448289):

2026-05-01T21:05:13.653Z INFO HYPER-268 session reuse: device session detected, redirecting to pds-core
  requestUri="urn:ietf:params:oauth:request_uri:req-72ef99dac9bee0be3559be3f6f5840f1"
  target="https://dev.certified.app/oauth/authorize?client_id=...&request_uri=req-72ef99...&epds_handle_mode=random"

The PAR carried no login_hint. Auth-service decided to reuse the device session (cookie pair valid, no hint, shouldReuseSession returns true) and redirected to upstream /oauth/authorize. Upstream rendered the chooser. User clicked the stale row. The SPA performed a client-side state transition to sign-in-view in-place — no new HTTP request, no URL change, the auth-ui-guard never re-ran.

Root cause

Two distinct freshness systems exist; we have only been thoroughly testing one.

System 1 — cookie/device validity (well covered):
dev-id and ses-id cookies present, parse correctly, match the device row's stored sessionId, device row exists, device has at least one binding. The pre-route guard in packages/pds-core/src/auth-ui-guard.ts covers every failure of System 1 and bounces to auth-service. Scenarios in features/session-reuse-bugs.feature exercise this exhaustively (lines 30, 40, 46, 52, 58, 90).

System 2 — per-binding authentication age (the gap):
Even with System 1 fully valid, each binding has its own account_device.updated_at timestamp. Upstream's provider.checkLoginRequired(deviceAccount) returns true for any binding whose now - updated_at > authenticationMaxAge (default 7 days), independently per binding. The chooser ships loginRequired:true on those rows. Clicking a stale row makes the SPA swap to sign-in-view based on that flag.

The auth-ui-guard's existing System-2 check (auth-ui-guard.ts:304) bounces only when candidates.every(b => provider.checkLoginRequired(b)) — i.e. all candidates stale. With at least one fresh binding present, every is false and the guard passes. Upstream renders the chooser. The user can still pick the stale row. The SPA transition to sign-in-view runs purely client-side — pre-route middleware structurally cannot see it.

The class of bug we shipped fixes for (PRs #128, #129, #134, #141) all assumed the password form is reached via a server-rendered page response. That assumption is wrong for case 11: the password form is SPA state inside the same /oauth/authorize HTML response that also renders the chooser. The server-side HTML is identical; the divergence is loginRequired:true on a session in window.__sessions, which the SPA reads at click time.

Why prior testing didn't catch this

features/session-reuse-bugs.feature has scenarios for:

  • "Every binding's auth age exceeds 7 days" (line 218) — single binding stale, or all bindings stale → guard's every predicate fires.
  • "login_hint narrows to a stale binding on a multi-account device" (line 235) — filterCandidateBindings narrows candidates to the hinted (stale) one, every of [stale] = true → guard bounces.

The case that ships the bug: multi-binding, mixed staleness, no login_hint, user freely picks the stale row. Test hook (expire-device-account) exists; nobody wrote the scenario. The diff vs. line-235 is just dropping the hint, which sounds trivial — but without the hint, filterCandidateBindings doesn't narrow, the fresh binding masks the stale one in every, and the guard concludes the request is safe.

Decision matrix (full enumeration)

Variables: cookies state × bindings count × PAR prompt × login_hint resolution × per-binding loginRequired. Bullets enumerate every combination the guard reasons about, what it decides, and which scenario in features/session-reuse-bugs.feature covers it.

  • Cookies absent → guard bounces (parse fail). Covered (line 30, 40).
  • Half-pair → guard bounces (parse fail). Covered (line 30, 40).
  • Stale pair (server-unknown) → guard bounces (loadDeviceBindings returns null). Covered (line 46, 52, 58).
  • Valid cookies, zero bindings (purged) → guard bounces (bindings empty). Unit-tested; e2e pending (line 90).
  • Valid cookies, ≥1 binding, PAR prompt=login → guard bounces (promptHasLogin). Covered (line 183).
  • Valid cookies, 1 binding, no hint, that binding stale → guard bounces (every of [stale]). Covered (line 218).
  • Valid cookies, 1 binding, hint matches, binding stale → guard bounces. Covered indirectly.
  • Valid cookies, ≥2 bindings, no hint, hint matches a STALE binding (others fresh) → guard bounces (filterCandidateBindings narrows to [stale]). Covered (line 235).
  • Valid cookies, 1 fresh binding, no hint → guard passes; chooser → consent. Covered (line 26).
  • Valid cookies, ≥2 bindings, no hint, all fresh → guard passes; chooser → pick → consent. Covered (lines 137, 144).
  • Valid cookies, ≥2 bindings, no hint, MIXED staleness → guard passes (every is false because at least one fresh masks the stale ones); chooser renders all rows; user picks STALE → SPA swaps to sign-in-view in-place. NOT covered. This is case 11.
  • Valid cookies, ≥2 bindings, hint matches a fresh binding, others stale → guard passes; chooser pre-selects fresh → consent. Benign.

Other unreachable-form surfaces uncovered during this investigation

Upstream @atproto/oauth-provider@0.15.10 exposes more password-related surfaces than just the /oauth/authorize SPA. Each is a separate hardening concern that should be addressed alongside the case 11 fix.

/account* standalone account-management SPA
node_modules/.../@atproto/oauth-provider/dist/router/create-account-page-middleware.js lines 23–44 ship the same SPA bundle with a per-binding loginRequired flag. The case 11 click-into-stale-row mechanism applies identically here — clicking a stale row on /account* produces the same in-place sign-in-view swap. The pre-route guard's isGuardedPath already covers /account* for welcome-page and zero-bindings cases (auth-ui-guard.ts:76-79), but the case-11-equivalent System-2 hole exists on this route too.

POST /oauth/sign-in API endpoint
create-api-middleware.js:73-113. Backend that accepts {handle, password, remember} and authenticates against upstream's account manager. UI suppression doesn't disable this — it can be hit directly with curl. ePDS accounts have no password set so authentication will always fail, but the route still executes upstream's password-checking code, returns shaped errors, and exists as an attack surface on a system whose contract is "passwordless."

POST /oauth/reset-password-request and POST /oauth/reset-password-confirm
create-api-middleware.js:131-158. Password reset flow (email → token → set new password). Wrong control surface for ePDS entirely; we use email OTP for authentication itself, not for resetting an unused password.

GET /.well-known/change-password
create-account-page-middleware.js:18-20. Browser-discoverable password-change hint that redirects to /account/reset-password. Advertises that a password-reset flow exists.

Proposed fix

The investigation suggests three layered changes; (1) is the case 11 fix, (2) is the parallel /account* coverage, (3) is hardening of the API endpoints.

1. Stale-row click intercept in chooser-enrichment script. Extend the existing script in packages/pds-core/src/chooser-enrichment.ts (already injected into /oauth/authorize and /account* HTML responses; already captures __sessions and __deviceSessions globals). On each row that maps to a session with loginRequired:true:

  • Tag the row (data-epds-stale="1", plus data-epds-sub with the account sub).
  • Optionally render a small "Session expired — sign in again" hint inline.
  • Install a single capture-phase document click listener that closest('[data-epds-stale="1"]')-walks the event target. On match: event.stopImmediatePropagation() + event.preventDefault(), then window.location.assign(authOrigin + '/oauth/authorize?<original query>&prompt=login&login_hint=<sub or handle>').

This mirrors the existing "Another account" rebind exactly (chooser-enrichment.ts:106-132) — same auth origin, same query preservation, same prompt=login semantics — just keyed off data-epds-stale instead of upstream's "Login to account that is not listed" button.

The user keeps seeing the stale row (UX value: it reminds them which email addresses they previously used). Clicking it just sends them through a fresh OTP cycle pre-filled with the right email, instead of dead-ending on a password form.

This is a DOM-layer fix that operates on the loginRequired:true flag itself, not on any specific cause that produces the flag. Any future upstream trigger that maps to loginRequired:true (additional prompt modes, acr_values, id_token_hint mapping, etc.) folds into the same intercept — same general property the rejected "filter __sessions server-side" approach would have given, just preserving the row's visibility.

2. Same intercept covers /account* automatically. The chooser-enrichment script is already mounted on both routes (isChooserRequest at chooser-enrichment.ts:379-386). The new click intercept inherits that coverage with no additional mount work.

3. Hard-block password API routes in pds-core middleware. Mount a small Express middleware that returns 404 (or 403) on:

  • POST /oauth/sign-in
  • POST /oauth/reset-password-request
  • POST /oauth/reset-password-confirm
  • GET /.well-known/change-password

Independent of any SPA state. Cannot drift with upstream because it doesn't depend on upstream behaviour — it just removes the routes from the surface area entirely.

Test coverage to add

Two failing scenarios that demonstrate the bug must be added before the fix is merged.

  • Multi-binding mixed staleness on /oauth/authorize (features/session-reuse-bugs.feature):
    Given the device has two bound accounts; given one binding is fresh and the other has been backdated past 7 days via the existing expire-device-account test hook; when the demo client starts a new OAuth flow with no login_hint; then the chooser renders both rows; when the user clicks the stale row; then the browser lands on the auth-service email/OTP form (NOT a password form), with the stale account's email pre-filled.
  • Same scenario on /account*:
    Same setup; when the user navigates to /account; when the user clicks the stale row; same expected outcome.

Optionally also exercise the API hardening:

  • POST /oauth/sign-in returns 404 regardless of payload.
  • GET /.well-known/change-password returns 404.

Related issues / PRs

Why this kept slipping

Each prior fix patched a specific trigger (PAR expiry, welcome-page render, prompt=login handling, "Another account" button, login_hint narrowing). Each addressed a real hole. None addressed the class: a per-binding loginRequired flag combined with client-side SPA branching that the pre-route guard structurally cannot observe. Until now, the assumption was that all paths to the password form ran through a server-side render and could therefore be intercepted by middleware. Case 11 demonstrates that assumption is wrong, and that any future per-binding-staleness trigger upstream adds will exhibit the same shape unless we move detection to the DOM layer (or, equivalently, to wherever else the loginRequired flag is consumed).

Acceptance criterion: after the fix, the matrix above must have no remaining "guard passes; chooser → user picks STALE → sign-in-view" outcome. Equivalent statement: there must be no clickable affordance, anywhere in the ePDS UI surface, that produces upstream's sign-in-view.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions