Skip to content

fix(oauth): bound OAuth access-token lifetime across all backends (redis/sqlite/memory)#169

Merged
badjer merged 5 commits into
mainfrom
fix/oauth-token-default-ttl
May 27, 2026
Merged

fix(oauth): bound OAuth access-token lifetime across all backends (redis/sqlite/memory)#169
badjer merged 5 commits into
mainfrom
fix/oauth-token-default-ttl

Conversation

@badjer
Copy link
Copy Markdown
Contributor

@badjer badjer commented May 26, 2026

Problem

RedisOAuthDb.saveAccessToken stored access tokens with no expiry when the token had no expiresAt and no ttl. The ATXP server middleware (withATXPContext) saves the inbound token on every authenticated request without an expiresAt, so MCP servers accumulated one immortal key per account — growing the shared redis-mcp instance until its AOF‑rewrite fork OOM‑killed it (~13‑min crash loop, 502s across every ATXP MCP server). Review surfaced this as one instance of a broader bug: no OAuthDb backend bounded server‑cached tokens, and the token's real expiry was being discarded.

Root cause (spans all backends)

  1. introspectToken dropped exp — returned only {active, scope, sub, aud} (TokenData had no exp).
  2. withATXPContext saved no expiresAt → backends stored unbounded.
  3. sqlite never evictedgetAccessToken selected expires_at but never compared/deleted.
  4. seconds‑vs‑milliseconds inconsistencyAccessToken.expiresAt was epoch seconds in redis but ms in atxp-client and MemoryOAuthDb.

Fix

Layer Change
Root Carry introspection exp (RFC 7662, epoch seconds) through TokenDataintrospectTokenwithATXPContext sets dbData.expiresAt. Cached tokens now expire at the token's real lifetime across redis, sqlite, and memory.
Units Standardize AccessToken.expiresAt on epoch seconds (fix atxp-client writes + MemoryOAuthDb; redis already correct; documented on the type).
sqlite getAccessToken compares expires_at and deletes + returns null when expired (mirrors redis read‑path).
redis Backstop: DEFAULT_ACCESS_TOKEN_TTL_SECONDS (24h, configurable defaultTtl) for any token lacking an expiry.
lockfile Dedup atxp-client/atxp-base/atxp-tempo's nested registry @atxp/common+@atxp/mpp (stale 0.11.7) to the workspace, so npm ci/CI test against local code.

Scope of the fixed‑TTL backstop (the no‑exp edge case)

exp‑based expiry bounds all three backends. The 24h fixed backstop is redis‑only by design: redis is the shared server‑side cache that OOM'd (safe to bound; re‑populated on every request), whereas sqlite/memory are client token stores where a token with no expiry may be intentionally long‑lived — a blanket TTL there could evict valid tokens. In practice the ATXP AS returns exp, so the no‑exp path is an edge case; if it were ever omitted, sqlite/memory would retain such a token (sqlite until overwritten, memory until restart) — far narrower than the original unbounded bug, and redis (the store that OOM'd) is covered both ways.

Compatibility note (no CHANGELOG in repo — flagging here for release)

AccessToken.expiresAt changes from epoch milliseconds to epoch seconds. Safe within the SDK (packages are pinned/released in lockstep). An external consumer mixing a pre‑change @atxp/common with a post‑change @atxp/client (or vice‑versa) across this boundary would see expiry misbehavior — worth a release‑note callout / coordinated bump.

CI fidelity

package-lock.json pinned atxp-client to a nested registry @atxp/common@0.11.7 (old ms code), so npm ci shadowed the workspace symlink and CI tested published code, not this change. Regenerated the lock (deps already match workspace 0.11.8; clean +1/−31 diff). Verified: after rm -rf node_modules && npm ci, no nested @atxp copy is created (require.resolvepackages/atxp-common/dist/index.cjs) and the atxp‑client suite is 208/208 with nothing deleted manually.

Tests

common 105 · server 191 · client 208 · sqlite 27 · redis 20 — all passing; typecheck clean. New tests: server exp‑propagation, memory seconds‑expiry (memory had no tests before), sqlite read‑path eviction, redis sliding‑window + renamed unconditional‑ttl test. Only RedisOAuthDb/SqliteOAuthDb/MemoryOAuthDb implement OAuthDb (no Postgres OAuthDb in the SDK).

Merge coordination

Touches introspectToken's return in oAuthResource.ts, which also overlaps fix/oauth-dcr-only-on-invalid-client. Not a guaranteed conflict, but coordinate merge order; I'll rebase whichever lands second.

🤖 Generated with Claude Code

badjer and others added 4 commits May 26, 2026 14:27
RedisOAuthDb.saveAccessToken stored access tokens with no expiry (plain SET) whenever the token had no expiresAt and no ttl was configured. The ATXP server middleware (atxpContext.withATXPContext) saves the inbound token on every authenticated request without an expiresAt, so these keys accumulated without bound and grew shared Redis instances until they OOM'd (a 256MB shared instance crash-looped in production).

Apply a bounded default TTL (DEFAULT_ACCESS_TOKEN_TTL_SECONDS = 24h, configurable via the new defaultTtl option) in the no-expiry branch instead of storing indefinitely. Tokens with an explicit expiresAt or a configured ttl are unchanged. Because the middleware re-saves on every request, active sessions stay cached while idle entries age out. Consumers adopt the fix by bumping @atxp/redis; no per-service config required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…iresAt to epoch seconds

The server middleware (withATXPContext) cached the inbound access token with no expiresAt, so OAuthDb backends stored it without bound and could later return an already-expired token. Carry the introspection exp (RFC 7662, epoch seconds) through TokenData and introspectToken, and set dbData.expiresAt from it, so the cache entry is bounded by the token's real lifetime across redis, sqlite, and memory. Also standardize AccessToken.expiresAt on epoch seconds (atxp-client wrote milliseconds, MemoryOAuthDb compared milliseconds, atxp-redis already used seconds) and document the unit. Adds memory-expiry and server exp-propagation tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SqliteOAuthDb.getAccessToken selected expires_at but never compared it to now or deleted the row, so it returned expired tokens as valid and oauth_access_tokens grew unbounded. Compare expires_at (epoch seconds) on read; if expired, delete the row and return null, mirroring the redis read-path. Without this, propagating exp would silently no-op for sqlite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… note staleness window

Rename the misleadingly-named default-TTL test: the ttl option is an unconditional TTL applied to all tokens (overriding even an explicit expiresAt), distinct from the defaultTtl fallback. Add a test documenting that the default TTL is re-armed on every re-save (the sliding-window behavior the fix relies on). Note in code that no-expiry entries store expires_at null, so the read-path expiry check is skipped and the default TTL is their only bound.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@badjer badjer changed the title fix(redis): bound default TTL for access tokens without expiry fix(oauth): bound OAuth access-token lifetime across all backends (redis/sqlite/memory) May 26, 2026
@badjer
Copy link
Copy Markdown
Contributor Author

badjer commented May 26, 2026

Expanded this PR to fix the bug at the root across all three OAuthDb backends, per your feedback — full details in the updated description. Summary:

  • exp propagation (root cause): introspectToken now carries expwithATXPContext sets expiresAt, so redis/sqlite/memory all bound the cached token by its real lifetime.
  • sqlite: added the missing read‑path expiry + eviction — you were right that propagating exp alone silently no‑ops there (it stored expires_at but never read it).
  • units: standardized AccessToken.expiresAt on epoch seconds (atxp-client and MemoryOAuthDb were milliseconds; redis already seconds).
  • redis nits: renamed the misleading unconditional‑ttl test, added the sliding‑window re‑save test, and noted the staleness window in code.

On the token‑exchange question: with exp propagated the entry expires at the token's real lifetime, so the 24h default is only a backstop for when the AS omits exp.

All five packages green (common 105 / server 191 / client 208 / sqlite 27 / redis 20), typecheck clean.

package-lock.json pinned atxp-client (and atxp-base/atxp-tempo) to nested registry copies of @atxp/common and @atxp/mpp at 0.11.7, left over from before the 0.11.8 bump. npm ci re-materialized the stale 0.11.7 MemoryOAuthDb inside atxp-client, shadowing the workspace symlink, so atxp-client tests and CI ran against published code instead of the local workspace - masking cross-package changes. Exact deps (0.11.8) now match the workspace, so regenerate the lock to dedup them to the workspace. Verified: clean npm ci no longer creates nested @atxp copies and the atxp-client suite passes 208/208 against the workspace common. Per CLAUDE.md guidance (line 73).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@badjer
Copy link
Copy Markdown
Contributor Author

badjer commented May 26, 2026

You were right on the nested‑common issue and I was wrong to wave it off — thanks for proving it with the lockfile line + CI path. Fixed here:

CI‑fidelity blocker (fixed). package-lock.json pinned atxp-client (and atxp-base/atxp-tempo) to nested registry copies of @atxp/common/@atxp/mpp at 0.11.7. Since the exact deps (0.11.8) now match the workspace, I regenerated the lock so they dedup to the workspace (per CLAUDE.md:73) — clean +1/−31 diff. Verified under the CI path: after rm -rf node_modules && npm ci, there's no nested @atxp copy (require.resolve("@atxp/common", {paths:["packages/atxp-client"]})packages/atxp-common/dist/index.cjs), and the atxp‑client suite is 208/208 with nothing deleted manually. Commit: chore(deps): dedup @atxp/common and @atxp/mpp to workspace in lockfile.

Residual notes:

  1. "All backends" backstop framing — corrected in the description. exp‑based bounding covers all three backends; the 24h fixed backstop is redis‑only by design. redis is the shared server‑side cache that OOM'd (safe to bound, re‑populated per request); sqlite/memory are client token stores where a no‑exp token may be intentionally long‑lived, so a blanket TTL there could evict valid tokens. I chose document‑the‑assumption over adding a sqlite TTL for exactly that reason. (The AS returns exp in practice; redis is covered both ways regardless.)
  2. Unit‑contract change — added a Compatibility note to the description (AccessToken.expiresAt ms→seconds; safe within the SDK's lockstep pins, flagged for release notes). No CHANGELOG file exists in the repo, so the description is the callout surface.
  3. Merge interaction with fix/oauth-dcr-only-on-invalid-client — noted in the description; both touch introspectToken. I'll rebase whichever merges second.

@badjer badjer merged commit 89e9a63 into main May 27, 2026
1 check passed
@badjer badjer deleted the fix/oauth-token-default-ttl branch May 27, 2026 00:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant