fix: use fresh access token for SSE reconnection attempts#78
devin-ai-integration[bot] wants to merge 1 commit into
The requestFn closure passed to SubscribeWithParams captured the access token by value at SSE setup time. When the connection dropped and the reconnect loop fired, it reused the original (potentially expired) token, causing repeated auth failures logged via the onError callback. Now the closure calls infisicalClient.Auth().GetAccessToken() on each connection/reconnection attempt, so the Go SDK's auto-refresh provides a valid token.

Co-Authored-By: arsh <arshsb1998@gmail.com>
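The difference between capturing the token once and reading it on every attempt can be shown with a minimal Go sketch. It is not the operator's code: tokenSource, staleRequestFn, and freshRequestFn are hypothetical names, and tokenSource only stands in for the SDK's auth module.

```go
package main

import "fmt"

// tokenSource is a hypothetical stand-in for the SDK's auth module;
// it is not the real Infisical Go SDK type.
type tokenSource struct {
	current string
}

// GetAccessToken returns whatever token is current at call time.
func (s *tokenSource) GetAccessToken() string { return s.current }

// staleRequestFn mirrors the old behavior: the token is read once,
// when the closure is built, and reused on every later call.
func staleRequestFn(s *tokenSource) func() string {
	token := s.GetAccessToken()
	return func() string { return "Bearer " + token }
}

// freshRequestFn mirrors the fix: the token is read inside the closure,
// so every connection attempt sees the latest value.
func freshRequestFn(s *tokenSource) func() string {
	return func() string { return "Bearer " + s.GetAccessToken() }
}

func main() {
	src := &tokenSource{current: "token-v1"}
	stale := staleRequestFn(src)
	fresh := freshRequestFn(src)

	// Simulate the SDK refreshing the token before a reconnect.
	src.current = "token-v2"

	fmt.Println(stale()) // Bearer token-v1
	fmt.Println(fresh()) // Bearer token-v2
}
```

The sketch is single-threaded on purpose; it isolates the closure semantics and says nothing about concurrent access.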
| Filename | Overview |
|---|---|
| internal/services/infisicalsecret/reconciler.go | Removes a pre-captured token variable so the SSE reconnect closure fetches a fresh token via GetAccessToken() on every attempt; thread-safety of that call from background goroutines is noted as unverified in the PR checklist. |
      // even when reconnecting long after the original token expired.
      httpClient, err := util.CreateRestyClient(model.CreateRestyClientOptions{
    -     AccessToken: token,
    +     AccessToken: infisicalClient.Auth().GetAccessToken(),
Unverified thread-safety of GetAccessToken() from reconnect goroutine
The closure is invoked from reconnectLoop and checkConnectionHealth, both of which run in background goroutines (see sse.go lines 370 and 551). If the Go SDK's auth module is not goroutine-safe — i.e., if a concurrent token refresh races with the read in GetAccessToken() — this could return a partially-written or empty token, producing the same 401-auth errors the PR aims to fix. The PR's own review checklist marks this as unverified.
Good catch from the bot — I investigated the Go SDK source (go-sdk@v0.7.1).
TL;DR: The SDK's token refresh is goroutine-safe; GetAccessToken() has a pre-existing minor race but this change is still strictly better than the stale-token capture.
Here's what the SDK does internally:
- Writes are mutex-protected: setAccessToken() (called by both doTokenRenewal and doReAuthentication) holds c.mu.Lock() when writing tokenDetails, credential, and authMethod. A separate refreshMu serializes concurrent refresh attempts via TryLock.
- GetAccessToken() does NOT hold c.mu.RLock(); it reads c.client.tokenDetails.AccessToken directly. This is a pre-existing race in the SDK (not introduced by this PR). Strictly speaking, an unsynchronized read of a Go string is still a data race: the string header is a pointer plus a length, the two words are not written atomically, and the race detector would flag it. In practice, the most likely outcome is reading a briefly stale (but previously valid) token, which would succeed or trigger a retry.
- Net effect of this PR: Before, every reconnect reused the token captured at setup, which is guaranteed to be expired once its lifetime has passed. After, the closure reads the latest token (possibly racing with a concurrent refresh, but the value will most likely be either the current valid token or the previous one that was valid moments ago). This is a strict improvement.
The proper fix for the read race belongs in the SDK's GetAccessToken() method (adding c.client.mu.RLock()), not in this operator code.
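A minimal sketch of the locking pattern described above, with the getter taking a read lock as suggested. It assumes a simplified token holder: tokenStore, its fields, and Refresh are hypothetical names for illustration and are not the SDK's actual code.

```go
package main

import (
	"fmt"
	"sync"
)

// tokenStore is a hypothetical type illustrating the pattern;
// the real SDK's fields and method names may differ.
type tokenStore struct {
	mu          sync.RWMutex // guards accessToken
	refreshMu   sync.Mutex   // serializes refresh attempts
	accessToken string
}

// GetAccessToken takes the read lock, so it never races with a writer.
func (t *tokenStore) GetAccessToken() string {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return t.accessToken
}

// setAccessToken takes the write lock while replacing the token.
func (t *tokenStore) setAccessToken(tok string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.accessToken = tok
}

// Refresh runs fetch unless another refresh is already in flight.
// It reports whether this call performed the refresh.
func (t *tokenStore) Refresh(fetch func() string) bool {
	if !t.refreshMu.TryLock() {
		return false // someone else is refreshing; skip
	}
	defer t.refreshMu.Unlock()
	t.setAccessToken(fetch())
	return true
}

func main() {
	store := &tokenStore{accessToken: "token-v1"}

	// Background readers, standing in for reconnect goroutines.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = store.GetAccessToken()
		}()
	}

	store.Refresh(func() string { return "token-v2" })
	wg.Wait()
	fmt.Println(store.GetAccessToken()) // token-v2
}
```

With the read lock in place, concurrent readers see either the old or the new token in full, never a partially written one. Running a variant without the RLock under `go run -race` is one way to observe the race the comment describes.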
Summary
Fixes repeated SSE errors reported by customers during secrets reconciliation (OpenInstantUpdatesStream.func2).

Root cause: The requestFn closure passed to SubscribeWithParams captured the access token by value once at SSE setup time (line 861). When the SSE connection later dropped (network blip, server restart, or the backend's 1-hour CONNECTION_TIMEOUT_MS), the reconnect loop reused the original, now expired, token, causing all 5 retry attempts to fail with auth errors. Each failure was logged via the onError callback (func2), producing the error spike customers observed.

Fix: Replace the captured token variable with a live call to infisicalClient.Auth().GetAccessToken() inside the closure. The Go SDK manages token auto-refresh internally, so calling it at reconnection time returns a valid token.

Review & Testing Checklist for Human
- infisicalClient.Auth().GetAccessToken() is safe to call from the reconnection goroutine (thread-safety of the Go SDK auth module)
- [...] instantUpdates: true, wait for the SSE connection to drop (or force-kill it), and confirm reconnection succeeds with a fresh token instead of producing auth errors
- [...] token variable)

Notes
The removed token variable on line 861 was only consumed by the closure; the slug resolution on line 866 already called GetAccessToken() independently.

Link to Devin session: https://app.devin.ai/sessions/e138b75048354fd2aa1f7f9f72cfd8ce
Requested by: @0xArshdeep
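The reconnection scenario from the summary (five retries all failing with auth errors versus an immediate success) can be approximated locally without a cluster. The sketch below is only a simulation under stated assumptions: fakeBackend, reconnect, and simulate are hypothetical helpers, the server is a plain HTTP test server instead of an SSE endpoint, and token rotation is done by hand instead of by the SDK.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// fakeBackend accepts only the currently valid token, standing in for a
// server that rejects expired credentials with 401.
type fakeBackend struct {
	mu    sync.RWMutex
	valid string
}

func (b *fakeBackend) rotate(tok string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.valid = tok
}

func (b *fakeBackend) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	if r.Header.Get("Authorization") != "Bearer "+b.valid {
		w.WriteHeader(http.StatusUnauthorized)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// reconnect makes up to maxAttempts requests, asking tokenFn for a token
// on each attempt, and returns how many attempts failed before success.
func reconnect(url string, maxAttempts int, tokenFn func() string) (failures int, ok bool) {
	for i := 0; i < maxAttempts; i++ {
		req, err := http.NewRequest(http.MethodGet, url, nil)
		if err != nil {
			return failures, false
		}
		req.Header.Set("Authorization", "Bearer "+tokenFn())
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			failures++
			continue
		}
		resp.Body.Close()
		if resp.StatusCode == http.StatusOK {
			return failures, true
		}
		failures++
	}
	return failures, false
}

// simulate rotates the token while "disconnected", then reconnects with
// either the captured (stale) token or a freshly read one.
func simulate(fresh bool) (failures int, ok bool) {
	backend := &fakeBackend{valid: "token-v1"}
	srv := httptest.NewServer(backend)
	defer srv.Close()

	current := "token-v1"
	captured := current // old behavior: value copied at setup time

	// The token expires and is refreshed while the stream is down.
	current = "token-v2"
	backend.rotate(current)

	tokenFn := func() string { return captured }
	if fresh {
		tokenFn = func() string { return current }
	}
	return reconnect(srv.URL, 5, tokenFn)
}

func main() {
	failures, ok := simulate(false)
	fmt.Println(failures, ok) // 5 false
	failures, ok = simulate(true)
	fmt.Println(failures, ok) // 0 true
}
```

This does not replace the manual check against a real deployment; it only reproduces the failure shape (every retry rejected) and the expected behavior after the fix.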