Add identity extractor for OAuth2 token responses #5200

Open
jhrozek wants to merge 1 commit into main from token_claim_mapping-1-identity-extractor

@jhrozek jhrozek commented May 5, 2026

Summary

Some OAuth2 upstream providers do not expose a usable userinfo endpoint and instead place user identity directly in the token endpoint response. Two shapes appear in practice:

  • Identity as side-attributes alongside the tokens — Snowflake's username, Slack's authed_user.id, Shopify's associated_user.{id,email,first_name}.
  • Identity claims embedded inside a JWT-shaped access token — Auth0, Azure AD, Keycloak, Cognito.

This PR adds a pure helper extractIdentityFromTokenResponse(body, cfg) that reads operator-supplied gjson dot-notation paths from the raw token-response body to extract subject, name, and email. It also registers a gjson modifier @upstreamjwt so paths can pipe through a JWT payload decode step (e.g. access_token|@upstreamjwt|sub). The modifier does not verify the JWT signature — trust comes from the TLS-authenticated channel to the AS, the same trust model as the existing userinfo path. Signed-token flows remain handled by the existing OIDC provider type.
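
To make the path-based lookup concrete, here is a minimal sketch of extracting identity from the three side-attribute shapes. It is not the PR's implementation: it uses only `encoding/json`, the helpers `lookupPath` and `extract` are hypothetical stand-ins for a gjson dot-notation query, and the sample bodies are invented for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// lookupPath is a hypothetical stand-in for a gjson dot-notation lookup.
// It walks nested JSON objects one key at a time and reports whether the
// full path resolved to a value.
func lookupPath(body []byte, path string) (any, bool) {
	dec := json.NewDecoder(bytes.NewReader(body))
	dec.UseNumber() // keep numbers as their original text, not float64
	var cur any
	if err := dec.Decode(&cur); err != nil {
		return nil, false
	}
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		if cur, ok = obj[key]; !ok {
			return nil, false
		}
	}
	return cur, true
}

// extract returns the scalar at path as text, or "" when the path is
// missing or points at a non-scalar (object, array, bool, null).
func extract(body, path string) string {
	v, ok := lookupPath([]byte(body), path)
	if !ok {
		return ""
	}
	switch t := v.(type) {
	case string:
		return t
	case json.Number:
		return t.String()
	default:
		return ""
	}
}

func main() {
	// Invented sample bodies; real provider responses carry more fields.
	snowflake := `{"access_token":"tok","username":"JDOE"}`
	slack := `{"access_token":"tok","authed_user":{"id":"U123"}}`
	shopify := `{"access_token":"tok","associated_user":{"id":902541635,"email":"a@example.com"}}`

	fmt.Println(extract(snowflake, "username"))
	fmt.Println(extract(slack, "authed_user.id"))
	fmt.Println(extract(shopify, "associated_user.id"))
	fmt.Println(extract(shopify, "associated_user.email"))
}
```

The same operator-supplied path string covers both flat and nested shapes, which is why a single helper can serve all three providers.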

The helper is consumed by the embedded auth server's OAuth2 upstream provider in a later commit; nothing in this PR calls it yet.

Closes #5152

Type of change

  • New feature / enhancement

Test plan

  • Unit tests for happy paths (Snowflake-flat, Slack-nested, Shopify-nested)
  • Unit tests for subject type-guard rejection (object, null, missing path, empty body, malformed JSON, nil cfg, empty subject value)
  • Unit tests for optional name/email handling (path not configured, missing in body, warn-and-skip on object value)
  • Unit tests for JWT-embedded identity (happy path + malformed JWT)
  • Numeric subject precision test (9007199254740993, beyond 2^53)
  • Property-style test asserting error messages never contain any portion of the response body
  • task lint-fix reports 0 issues
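
The body-leakage property test in the plan above could look roughly like the following sketch. Everything here is an assumption made for illustration: `subjectFrom` is a simplified, hypothetical extractor (top-level keys and string subjects only), `errIdentityResolutionFailed` merely mirrors the name of the PR's sentinel, and the check is written as a plain function rather than a `testing` test so it runs standalone.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Hypothetical sentinel, named after the one the PR describes.
var errIdentityResolutionFailed = errors.New("identity resolution failed")

// subjectFrom is a simplified, hypothetical extractor. Its errors name the
// configured key but never include any bytes from the body.
func subjectFrom(body []byte, key string) (string, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(body, &obj); err != nil {
		// Deliberately not wrapping err: decoder errors can quote input bytes.
		return "", fmt.Errorf("%w: body is not a JSON object", errIdentityResolutionFailed)
	}
	raw, ok := obj[key]
	if !ok {
		return "", fmt.Errorf("%w: path %q not found", errIdentityResolutionFailed, key)
	}
	var s string
	if err := json.Unmarshal(raw, &s); err != nil || s == "" {
		return "", fmt.Errorf("%w: path %q is not a non-empty string", errIdentityResolutionFailed, key)
	}
	return s, nil
}

// checkNoLeak feeds failing bodies that all embed marker and verifies that
// every resulting error message is free of it.
func checkNoLeak(marker string) error {
	bodies := []string{
		`{"user":{"secret":"` + marker + `"}}`, // object instead of scalar
		`{"user":null,"note":"` + marker + `"}`, // null subject
		`{"other":"` + marker + `"}`,           // missing path
		`{"user":"` + marker,                   // malformed JSON
		marker,                                 // not JSON at all
		`{"user":"","note":"` + marker + `"}`,  // empty subject
	}
	for i, b := range bodies {
		_, err := subjectFrom([]byte(b), "user")
		if err == nil {
			return fmt.Errorf("case %d: expected an error", i)
		}
		if strings.Contains(err.Error(), marker) {
			return fmt.Errorf("case %d: error message leaks body content", i)
		}
	}
	return nil
}

func main() {
	if err := checkNoLeak("SECRET-MARKER-7f3a91"); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("ok: no error message contained the marker")
}
```

A unique marker makes the assertion robust: any code path that echoes part of the body into an error is caught regardless of which failure mode triggered it.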

Implementation plan

Approved plan from the design session
  • Pure helper extractIdentityFromTokenResponse(body, cfg) with type-guard subject validation (string or number only; objects, arrays, null, booleans, missing paths rejected with ErrIdentityResolutionFailed).
  • Optional name/email paths warn-and-skip on type mismatch via slog.Warn that names the path but never the value or any body content.
  • Numeric precision preserved via result.Raw rather than gjson.Result.String() (which can route through float64 formatting and lose integer precision above 2^53).
  • Errors carry the configured path name but never any portion of the body — pinned by a dedicated property test embedding a unique secret marker.
  • gjson custom modifier @upstreamjwt for JWT-embedded identity claims. No signature verification (trust via TLS-to-AS, same as userinfo path). Returns "" on any failure mode (non-string input, wrong dot-count, base64 error), letting downstream validateIdentityField produce a uniform ErrIdentityResolutionFailed.
  • Modifier registration via explicit RegisterModifiers() rather than init() — caller-controlled, testable, no gochecknoinits suppression. Idempotent (gjson.AddModifier overwrites).
  • Test surface kept minimal: 17 table cases covering distinct code branches plus one body-leakage property test.
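
As an illustration of the warn-and-skip behaviour for optional fields, the sketch below logs the field and configured path on a type mismatch but never the value. It is written under stated assumptions: `optionalString` and `demo` are hypothetical names, only top-level keys are handled, and whether the real helper also stays silent for a missing key is not stated in this PR.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// optionalString returns the string at a top-level key. A type mismatch is
// logged with the field and path names only, then skipped.
func optionalString(logger *slog.Logger, obj map[string]json.RawMessage, field, path string) string {
	if path == "" {
		return "" // path not configured
	}
	raw, ok := obj[path]
	if !ok {
		return "" // assumption: a missing optional value is skipped silently
	}
	var s string
	if err := json.Unmarshal(raw, &s); err != nil {
		// Log the path, never raw or err (both may carry body content).
		logger.Warn("optional identity field has unexpected type; skipping",
			"field", field, "path", path)
		return ""
	}
	return s
}

// demo extracts name and email from body and returns the captured log text.
func demo(body string) (name, email, logs string) {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, nil))
	var obj map[string]json.RawMessage
	if err := json.Unmarshal([]byte(body), &obj); err != nil {
		return "", "", ""
	}
	name = optionalString(logger, obj, "name", "display_name")
	email = optionalString(logger, obj, "email", "email")
	return name, email, buf.String()
}

func main() {
	// Invented body: email is an object, so it is skipped with a warning.
	name, email, logs := demo(`{"display_name":"Ada","email":{"primary":"ada@example.com"}}`)
	fmt.Printf("name=%q email=%q\n", name, email)
	fmt.Print(logs)
}
```

Keeping the value out of the log line means a misconfigured path cannot turn the warning into a channel for leaking token-response content.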

Does this introduce a user-facing change?

No. This is an internal helper with no consumer in this PR. The OAuth2 upstream provider integration that exposes the feature to operators — and the corresponding CRD type — lands in follow-up PRs.

Special notes for reviewers

  • Beyond the linked issue's stated criteria: the acceptance criteria in issue #5152 enumerate only the side-attribute shapes (Snowflake/Slack/Shopify). The @upstreamjwt modifier is an additive extension for the JWT-embedded shape (Auth0/Azure/Keycloak/Cognito). All eight original acceptance criteria from the issue are met; the JWT modifier is incremental on top.
  • Wire-up obligation for the integrating PR: RegisterModifiers() must be called once during application or test wire-up before any consumer issues a path containing @upstreamjwt. The integrating commit will add this call to the OAuth2 upstream provider's constructor.
  • Downstream JWT serialization: numeric-source subjects are returned as Go strings (scalarToString uses result.Raw). Any downstream JWT issuer must serialize them as JSON strings per RFC 7519 §4.1.2 (StringOrURI). A test for that contract belongs in the integrating PR.
  • Trust-model docstring: the trust caveats (no signature verification, TLS-to-AS as integrity boundary) live on upstreamJWTModifier's godoc. The corresponding CRD type (out of scope here) should also carry operator-facing warnings about which fields are appropriate for subjectPath (don't use access_token, token_type, etc.).
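
The downstream serialization contract noted above can be sketched as follows: a subject that originated as a JSON number is carried as a Go string and must come out as a JSON string in the issued claims. The `claims` type and both helpers are hypothetical and only illustrate the shape of a test the integrating PR might add.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// claims is a hypothetical, minimal claim set. Declaring Subject as a Go
// string makes encoding/json emit "sub" as a JSON string.
type claims struct {
	Subject string `json:"sub"`
}

// marshalClaims serializes a claim set for the given subject.
func marshalClaims(subject string) (string, error) {
	b, err := json.Marshal(claims{Subject: subject})
	return string(b), err
}

// subIsJSONString reports whether the "sub" member of a claims document is
// encoded as a JSON string rather than a number or other type.
func subIsJSONString(doc string) bool {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal([]byte(doc), &obj); err != nil {
		return false
	}
	raw := obj["sub"]
	return len(raw) >= 2 && raw[0] == '"'
}

func main() {
	doc, err := marshalClaims("9007199254740993") // numeric-source subject
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(doc, subIsJSONString(doc))
}
```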

🤖 Generated with Claude Code

Some OAuth2 upstream providers do not expose a usable userinfo
endpoint and instead place user identity directly in the token
endpoint response. Two response shapes appear in practice:

  - Identity as side-attributes alongside the tokens, e.g.
    Snowflake's `username`, Slack's `authed_user.id`, Shopify's
    `associated_user.{id,email,first_name}`.
  - Identity claims embedded inside a JWT-shaped access token, e.g.
    Auth0, Azure AD, Keycloak, Cognito.

Introduce a pure helper that reads operator-supplied gjson
dot-notation paths from the raw token-response body to extract
subject, name, and email. Register a custom gjson modifier
`@upstreamjwt` so paths can pipe through a JWT payload decode step
(e.g. "access_token|@upstreamjwt|sub"). The modifier base64url-decodes
the JWT payload without verifying the signature; trust comes from the
TLS-authenticated channel to the AS, the same trust model as the
existing userinfo path. Signed-token flows remain handled by the
existing OIDC provider type. Modifier registration is exported and
explicit (RegisterModifiers) so callers control when the
process-global gjson state mutates.
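
The decode step described above can be sketched with the standard library alone. This is an illustration, not the commit's code: `decodeJWTPayload`, `subjectFromAccessToken`, and `fakeToken` are hypothetical names, the gjson modifier plumbing is omitted, and, as in the commit, no signature verification takes place.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"strings"
)

// decodeJWTPayload returns the JSON payload of a compact JWT, or "" on any
// failure (wrong segment count, base64url error). It does NOT verify the
// signature; the caller must already trust the channel the token came from.
func decodeJWTPayload(token string) string {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return ""
	}
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		return ""
	}
	return string(payload)
}

// subjectFromAccessToken mirrors the path "access_token|@upstreamjwt|sub":
// read access_token, decode its payload, then read the sub claim.
func subjectFromAccessToken(body []byte) string {
	var resp struct {
		AccessToken string `json:"access_token"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return ""
	}
	var c struct {
		Sub string `json:"sub"`
	}
	if err := json.Unmarshal([]byte(decodeJWTPayload(resp.AccessToken)), &c); err != nil {
		return ""
	}
	return c.Sub
}

// fakeToken builds an unsigned, JWT-shaped string for demonstration only.
func fakeToken(payloadJSON string) string {
	enc := base64.RawURLEncoding.EncodeToString
	return enc([]byte(`{"alg":"RS256","typ":"JWT"}`)) + "." + enc([]byte(payloadJSON)) + ".sig"
}

func main() {
	body := `{"access_token":"` + fakeToken(`{"sub":"user-123"}`) + `","token_type":"Bearer"}`
	fmt.Println(subjectFromAccessToken([]byte(body)))
	fmt.Printf("%q\n", decodeJWTPayload("not-a-jwt"))
}
```

Returning an empty string on every failure keeps the decode step free of error plumbing and lets one downstream validation produce a single, uniform error.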

The helper is consumed by the embedded auth server's OAuth2 upstream
provider in a later commit; nothing in this commit calls it yet.

Type guard restricts the subject to scalar string or number values to
avoid silently returning a JSON blob as the user's identity. Numeric
subjects are returned via the raw JSON token rather than gjson's
float64 formatting to preserve integer precision beyond 2^53. Error
messages never include any portion of the body.
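
The type guard and the precision concern can be illustrated with a short standard-library sketch. The names `scalarToText`, `viaFloat64`, and `errNotScalar` are hypothetical, and the raw-token route shown here is only an analogue of what the commit describes doing with gjson's raw result.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"strconv"
)

var errNotScalar = errors.New("value is not a non-empty string or a number")

// scalarToText accepts one raw JSON value. Strings are unquoted, numbers are
// returned as their source text, and everything else is rejected.
func scalarToText(raw []byte) (string, error) {
	raw = bytes.TrimSpace(raw)
	if len(raw) == 0 || !json.Valid(raw) {
		return "", errNotScalar
	}
	switch c := raw[0]; {
	case c == '"':
		var s string
		if err := json.Unmarshal(raw, &s); err != nil || s == "" {
			return "", errNotScalar
		}
		return s, nil
	case c == '-' || (c >= '0' && c <= '9'):
		return string(raw), nil // raw token: no float64 round trip
	default: // object, array, true, false, null
		return "", errNotScalar
	}
}

// viaFloat64 shows the lossy route: parse to float64, then format again.
func viaFloat64(raw string) string {
	f, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return ""
	}
	return strconv.FormatFloat(f, 'f', -1, 64)
}

func main() {
	const id = "9007199254740993" // 2^53 + 1, not representable as float64
	kept, _ := scalarToText([]byte(id))
	fmt.Println("raw token:", kept)
	fmt.Println("via float64:", viaFloat64(id))

	_, err := scalarToText([]byte(`{"id":1}`))
	fmt.Println("object subject:", err)
}
```

Rejecting non-scalars up front avoids the failure mode the commit message names, where a whole JSON object would otherwise be stringified and used as the user's identity.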

Closes #5152

@github-actions bot added the size/M (Medium PR: 300-599 lines changed) label May 5, 2026

@codecov bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 89.83051% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.70%. Comparing base (9572f6a) to head (31d6deb).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/authserver/upstream/identity_from_token.go | 89.83% | 4 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5200      +/-   ##
==========================================
+ Coverage   67.65%   67.70%   +0.04%     
==========================================
  Files         607      608       +1     
  Lines       61982    62043      +61     
==========================================
+ Hits        41937    42006      +69     
+ Misses      16883    16876       -7     
+ Partials     3162     3161       -1     
