fix: refresh upstream tokens transparently instead of forcing re-auth#4036
Open
aron-muon wants to merge 4 commits intostacklok:mainfrom
Open
fix: refresh upstream tokens transparently instead of forcing re-auth#4036aron-muon wants to merge 4 commits intostacklok:mainfrom
aron-muon wants to merge 4 commits intostacklok:mainfrom
Conversation
efdbe42 to
cd44bbf
Compare
cd44bbf to
0dd2c1c
Compare
The upstreamswap middleware returned 401 when upstream access tokens expired, forcing users through full re-authentication even though valid refresh tokens existed in storage. This happened because: 1. Redis/memory storage TTL was set to access token expiry, deleting the entry (and refresh token) when the access token expired 2. Storage returned nil on ErrExpired, discarding the refresh token 3. The middleware had no refresh path — only 401 Fix all three layers: - Add DefaultRefreshTokenTTL (30 days) to storage entry TTL so refresh tokens survive past access token expiry - Return token data alongside ErrExpired from storage so callers can use the refresh token - Add UpstreamTokenRefresher interface and implementation that wraps the upstream OAuth2Provider and storage - Plumb the refresher through Server → EmbeddedAuthServer → Runner → MiddlewareRunner - Update upstreamswap middleware to attempt refresh before returning 401, only requiring re-auth when the refresh token itself fails Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
0dd2c1c to
ab9807b
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #4036 +/- ##
==========================================
- Coverage 68.70% 68.69% -0.01%
==========================================
Files 445 446 +1
Lines 45374 45451 +77
==========================================
+ Hits 31173 31224 +51
- Misses 11796 11818 +22
- Partials 2405 2409 +4 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Add comprehensive tests for RefreshAndStore (6 cases) and middleware refresh paths (4 cases: successful refresh, failed refresh, no refresh token, defense-in-depth expired-without-error). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Users are forced to fully re-authenticate with upstream OAuth providers every time the upstream access token expires (controlled by
accessTokenLifespan), even though valid refresh tokens exist in storage.The root cause is that upstream access tokens and refresh tokens are stored together in a single storage entry, with the entry's TTL/expiry set to the access token's expiry. When the access token expires, the entry is deleted (Redis) or marked expired (memory) — losing the refresh token, which is typically valid for 30-90 days or longer depending on the provider. The
upstreamswapmiddleware has no refresh path and returns 401, forcing full re-authentication.This affects both Redis and in-memory storage backends — Redis deletes the key via TTL, and memory storage's cleanup goroutine removes the expired entry.
Type of change
Root cause
The storage model bundles access + refresh tokens in a single entry (
UpstreamTokensstruct). The entry TTL is derived from the access token'sExpiresAt, so when the access token expires:GetUpstreamTokensreturnsnil, ErrExpired— discarding the token dataIdeally, access and refresh tokens would be stored separately with independent TTLs matching their actual lifetimes. This fix extends the bundled entry's TTL as a pragmatic solution; a future refactor could separate them for cleaner lifecycle management.
Changes
Storage layer (
pkg/authserver/storage/)DefaultRefreshTokenTTL(30 days) in both Redis and memory storage, so refresh tokens survive past access token expiryGetUpstreamTokensto return token data alongsideErrExpired(instead ofnil) so callers can use the refresh tokenExpiresAt(access token expiry) rather than the entry'sexpiresAt(storage TTL) for the expired checkToken refresher (
pkg/authserver/refresher.go)UpstreamTokenRefresherinterface instorage/types.goupstream.OAuth2Provider.RefreshTokens()+UpstreamTokenStorage.StoreUpstreamTokens()Plumbing
Server→EmbeddedAuthServer→Runner→MiddlewareRunnerusing the same lazy accessor pattern asGetUpstreamTokenStorageMiddleware (
pkg/auth/upstreamswap/)getOrRefreshUpstreamTokenshelper to keep cyclomatic complexity under lint thresholdProduction validation
Deployed to a production cluster with Redis (AWS Valkey) storage (we use a sentinel emulator which basically just returns the Valkey URL in each case. One of the little clever ways we could use Valkey). All four upstream providers successfully refreshed tokens transparently — no user re-authentication required:
cf.mcp.atlassian.com/v1/tokenapp.asana.com/-/oauth_tokenslack-gov.com/api/oauth.v2.accessoauth2.googleapis.com/tokenRedis TTLs confirmed updated to ~30 days (previously ~1 hour). GitHub has not yet expired its 8-hour access token but uses the same code path.
Test plan
ErrExpiredDoes this introduce a user-facing change?
Yes — upstream OAuth sessions now persist beyond the access token lifetime. Users will no longer be forced to re-authenticate as long as their upstream refresh token is valid (typically 30 days to indefinite depending on the provider).
Generated with Claude Code