fix(auth): do not remove session when getUser returns session_not_found#2166
Conversation
@supabase/auth-js
@supabase/functions-js
@supabase/postgrest-js
@supabase/realtime-js
@supabase/storage-js
@supabase/supabase-js
commit: |
mandarini
left a comment
There was a problem hiding this comment.
Hi @mbaltrusitis, thank you so much for contributing to Supabase! 💚
This is a thoughtful fix and the test coverage you added is genuinely appreciated. I do have a few things I'd like us to work through before merging.
On the "transient" premise
I went and looked at the server-side code for this. The /user endpoint calls FindSessionByID with forUpdate=false, which is a plain WHERE id = ? query -- either the row is there or it isn't. For session_not_found to be genuinely transient, you'd need something like read replica replication lag (if the auth server routes /user reads to a replica). Is that the specific scenario that triggered this? If so, it would really help to say so explicitly in the PR description, because right now "server-side race conditions, brief network partitions, or replication lag" reads as speculative. Knowing the actual failure mode will help reviewers and future readers understand why this tradeoff is worth making.
The SSR / autoRefreshToken: false case
This is the part I'm most concerned about. In server-side rendering contexts (Next.js with @supabase/ssr, SvelteKit, Nuxt, etc.), createServerClient explicitly sets autoRefreshToken: false. More critically, the cookie cleanup in @supabase/ssr works by listening for SIGNED_OUT via onAuthStateChange and calling applyServerStorage to clear the cookies:
// createServerClient.ts
client.auth.onAuthStateChange(async (event: AuthChangeEvent) => {
const hasStorageChanges = ...
if (hasStorageChanges && (event === "SIGNED_OUT" || ...)) {
await applyServerStorage(...)
}
})With this change, SIGNED_OUT no longer fires on session_not_found, so applyServerStorage never runs and the session cookie is never cleared. The next request loads the same stale cookie, calls getUser(), gets the same error, and nothing cleans up. This repeats until the JWT expires (up to an hour). The user is stuck: getUser() keeps returning null, no event fires, and the app has no clean signal to redirect or recover unless it explicitly handles AuthSessionMissingError and calls signOut() itself.
The cleanup path is untested
The PR description says cleanup is deferred to _callRefreshToken(), which is correct, that path does work. But all of your tests set autoRefreshToken: false, so the refresh cycle never actually runs in any test. Could you add a test that covers the full path: session is gone, auto-refresh fires, refresh token call returns session_not_found, session is removed, and SIGNED_OUT fires? That's the core of the fix's correctness guarantee and right now it's not verified.
It would also be great to have a test for the actual recovery scenario, getUser() fails transiently, session is preserved, and a subsequent getUser() succeeds. That's the whole point of deferring cleanup, and showing it work end-to-end would make this much easier to approve.
Thank you again for taking the time to dig into this and for the thorough test suite you already wrote. Looking forward to seeing where this lands.
🔍 Description
Do not destroy the local session when
getUser()receives asession_not_founderror from the auth server, as this error can be transient. Instead, preserve the session and let the next token refresh cycle handle cleanup if the session is truly invalid.What changed?
_removeSession()and code-verifier cleanup from_getUser()when the server returns asession_not_founderror. Previously, asession_not_foundresponse would immediately destroy the local session and fire aSIGNED_OUTevent._callRefreshToken()will fail and handle removal at that point.session_not_founderror comment infetch.tsto acknowledge that this error can be transient.gotrue-client-getUser.test.ts) covering session preservation, event behavior, error shape, code-verifier retention, and asignOut()regression check.fetch.test.tsfor thesession_not_found->AuthSessionMissingErrormapping.Why was this change needed?
When
getUser()encounters asession_not_founderror from the auth server, the current behavior aggressively destroys the local session. However, this error can be transient (e.g. server-side race conditions, brief network partitions, or replication lag). Destroying the session immediately causes users to be unexpectedly signed out even though their session may still be valid. By deferring cleanup to the refresh cycle, transient failures are handled gracefully while truly expired sessions are still cleaned up.📸 Screenshots/Examples
N/A
🔄 Breaking changes
📋 Checklist
npx nx formatto ensure consistent code formatting📝 Additional notes
signOut()flow is and still removes the session as expected (covered by regression test)._callRefreshToken()will fail and clean up the session at that point, so there is no riskof stale sessions persisting indefinitely.