Add force-refresh path for cached U2M tokens#1552
Add force-refresh path for cached U2M tokens#1552mihaimitrea-db wants to merge 6 commits intomainfrom
Conversation
Split token loading from refresh decisions, add dedicated force-refresh coverage, and document the remaining cross-process refresh race for a follow-up fix. Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
Document the new PersistentAuth.ForceRefreshToken method in NEXT_CHANGELOG so the release notes cover the new explicit cached-token refresh behavior. Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
Move the missing refresh token check into the shared refresh path, expose it as ErrMissingRefreshToken, and cover the wrapped error behavior from both Token and ForceRefreshToken. Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
simonfaltum
left a comment
There was a problem hiding this comment.
Review (automated, 2 agents)
Verdict: Approved
0 Critical | 0 Major | 1 Gap | 2 Nit | 1 Suggestion
See inline comments for details.
…rMissingRefreshToken to changelog Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
mihaimitrea-db
left a comment
There was a problem hiding this comment.
Automated Review by 2 Agents
Verdict: Approve with comments
- 🔴 0 Critical
- 🟠 0 Major
- 🟡 3 Minor
- 🔵 3 Nit
The core logic changes look sound; the main concerns are a misclassified changelog entry, a missing test for the new ErrMissingRefreshToken graceful-degradation path, and a misleading doc comment on ForceRefreshToken().
See inline comments for details.
- NEXT_CHANGELOG: Move ErrMissingRefreshToken to API Changes, remove proactive-refresh and ParentPath (rebase artifacts / unrelated) - error.go: Fix InvalidRefreshTokenError doc (Load -> Token/ForceRefreshToken) - persistent_auth.go: Clarify ForceRefreshToken dual-write vs Token(); unify error prefix to 'token refresh:'; simplify loadToken doc - persistent_auth_test.go: Add TestToken_ReturnsExistingTokenWhenNearExpiryAndNoRefreshToken; add defensive store t.Fatalf to ForceRefreshToken tests Signed-off-by: Mihai Mitrea <mihai.mitrea@databricks.com>
|
If integration tests don't run automatically, an authorized user can run them manually by following the instructions below: Trigger: Inputs:
Checks will be approved automatically on success. |
simonfaltum
left a comment
There was a problem hiding this comment.
Review (automated, 3 agents: Claude Code + Isaac + Cursor, 2-round deep swarm)
Verdict: Approved (with suggestions)
0 Critical | 1 Major | 0 Gap | 2 Nit | 0 Suggestion
See inline comments for details. Overall the PR is well-designed: the loadToken() extraction is clean, ForceRefreshToken() has the right strict-error semantics, ErrMissingRefreshToken is idiomatic Go, and test coverage is thorough across all key paths.
| // When the token was loaded via host-key fallback (profile key not found), the | ||
| // refreshed token is dual-written to both the profile key and host key, | ||
| // effectively migrating the token to the profile. Token() migrates only when | ||
| // it needs to refresh (expired or nearly expired); ForceRefreshToken always | ||
| // refreshes, so it always migrates on success. | ||
| func (a *PersistentAuth) ForceRefreshToken() (*oauth2.Token, error) { | ||
| t, err := a.loadToken() | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| t, err = a.refresh(t) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("forced token refresh: %w", err) | ||
| } | ||
| t.RefreshToken = "" |
There was a problem hiding this comment.
[Major] ForceRefreshToken migrates host-key fallback tokens to the profile key, widening a pre-existing risk
loadToken() deliberately avoids migrating host-key fallback tokens because they "may contain a token from a different profile with different scopes." However, ForceRefreshToken() always calls refresh() which calls dualWrite(), writing the refreshed token to the profile key even when it was loaded via fallback. This can make a wrong-scoped token sticky for future lookups.
Nuance: The base branch's Token() already migrates fallback tokens when they need refresh (expired or near-expiry). ForceRefreshToken() widens the migration to include still-valid tokens, but doesn't introduce a new category of bug. Practical risk is limited (requires two profiles with different scopes pointing to the same host).
Suggestion: Consider threading provenance out of loadToken() in a follow-up, so callers know whether the token came from the primary key or host-key fallback. Then ForceRefreshToken() could refuse to force-refresh a fallback-loaded token (directing the user to databricks auth login --profile <name>), or refresh without writing to the profile key.
Found by: Cursor + Claude Code (R1), nuanced by Isaac (R2)
| // InvalidRefreshTokenError is returned from PersistentAuth's Token() and | ||
| // ForceRefreshToken() methods if the access token has expired and the refresh | ||
| // token in the token cache is invalid. |
There was a problem hiding this comment.
[Nit] InvalidRefreshTokenError doc comment overstates expiry precondition
The comment says this error is returned "if the access token has expired", but ForceRefreshToken() calls refresh() even for still-valid cached access tokens. If that forced refresh gets invalid_grant, callers still receive InvalidRefreshTokenError, so the doc comment overstates the expiry precondition.
Suggestion: Reword in terms of a refresh attempt, e.g.: "returned when a token refresh is attempted and the cached refresh token is invalid."
Found by: Cursor (R2, cross-review)
|
[Nit] PR description claims "no default behavior changed" for The PR description states "No default behavior changed for existing callers of Suggestion: Update the PR description, e.g.: "For Found by: Isaac (R1) + Claude Code (R1), confirmed by Cursor (R2) |
Summary
This PR adds an explicit
PersistentAuth.ForceRefreshToken()path for cached U2M OAuth tokens so the CLI can implementdatabricks auth token --force-refreshon top of the 5-minute proactive refresh buffer introduced in #1535.For maintainability reasons it also splits token loading from refresh decisions and documents a cross-process refresh race for a follow-up fix.
Why
#1535 changed the default U2M token path to proactively refresh tokens that are within 5 minutes of expiry so callers of
databricks auth tokenare less likely to receive a token that expires before it can be used. That addresses the "give me a useful token" case, which came out of theauth tokendiscussion in cli issue #4564.That proactive refresh is intentionally best-effort, though.
PersistentAuth.Token()still returns the existing access token when it is valid and a proactive refresh fails, because callers did not explicitly ask to block on a fresh token. That behavior is appropriate for the default path, but it is not strong enough for integrations that want to treat the CLI as a token minter or that want to manage their own cache/TTL policy. In those cases, "return a still-usable token" is different from "refresh now and give me a newly minted token or fail."This PR adds that stricter SDK primitive. It gives the CLI a clean way to implement
--force-refreshwithout changing the semantics of the existingToken()flow.The raw oauth2 error text for forced refresh failures can be misleading. In particular, when a forced refresh is requested for a cached access token that is still valid but does not include a refresh token, the default oauth2 error suggests that the token is expired even though the real problem is that the cached token is not refresh-capable.
This PR surfaces explicit refresh-token errors instead of relying on the default oauth2 wording, so callers can distinguish "the current access token is expired" from "a new token could not be minted because the refresh token is missing or was rejected by the IdP".
What changed
Interface changes
(*PersistentAuth).ForceRefreshToken() (*oauth2.Token, error).This always attempts to refresh the cached U2M token, regardless of its current expiry, and redacts the refresh token from the returned value just like
Token()does.Behavioral changes
PersistentAuth.Token().PersistentAuth.ForceRefreshToken()always tries to refresh, even when the cached access token is still valid.ForceRefreshToken()returns an error instead of falling back to the existing token.ForceRefreshToken()fails explicitly.Internal changes
loadToken()so token loading and refresh policy are separated.refresh()with a follow-up TODO rather than expanding the scope of this PR to add inter-process coordination.ErrMissingRefreshTokenfor the "cached token has no refresh token" case.refresh()path so bothToken()andForceRefreshToken()wrap the same semantic error.How is this tested?
go test ./credentials/u2mForceRefreshToken()ForceRefreshToken()when the cached token has no refresh tokenInvalidRefreshTokenErrorbehavior