Skip to content

Add Redis backend for DCRCredentialStore#5195

Open
tgrunnagle wants to merge 2 commits intomainfrom
dcr-3b_issue_5184
Open

Add Redis backend for DCRCredentialStore#5195
tgrunnagle wants to merge 2 commits intomainfrom
dcr-3b_issue_5184

Conversation

@tgrunnagle
Copy link
Copy Markdown
Contributor

@tgrunnagle tgrunnagle commented May 5, 2026

Summary

An authserver replica that registers itself as a DCR client against an upstream
authorization server currently keeps the resulting (client_id, client_secret)
in the in-process MemoryStorage from sub-issue 1. Restarts and horizontal
scale-outs lose the registration, forcing every replica to re-register on cold
start and breaking RFC 7592 management URLs. This PR adds the persistent half
of DCRCredentialStore so a Redis-Sentinel-backed authserver shares DCR
credentials across replicas and survives restarts.

  • Add KeyTypeDCR and a length-prefixed redisDCRKey(prefix, DCRKey) helper
    that handles colons in RedirectURI and Issuer without ambiguity, mirroring
    the existing redisProviderKey shape.
  • Add RedisStorage.StoreDCRCredentials / GetDCRCredentials with JSON
    serialisation (acting as a defensive copy on read) and TTL semantics derived
    from client_secret_expires_at: zero means no Redis TTL (RFC 7591 "never"),
    a future expiry uses time.Until(expiry), and an already-past expiry is
    bounded to a 1s eviction window so already-dead secrets do not linger.
  • Wire a compile-time _ DCRCredentialStore = (*RedisStorage)(nil) assertion
    alongside the existing interface checks at the bottom of redis.go.
  • Add miniredis-backed unit tests covering key distinctness/determinism, full
    round-trip, overwrite, validation, defensive copy via decode, all three TTL
    branches, ErrNotFound semantics, and concurrent Store / Get.
  • Add Redis Sentinel integration tests (testcontainers, //go:build integration)
    pinning the wire-level TTL contract against real Redis (-1 for never-expires)
    plus round-trip, distinct-keys, overwrite, and concurrent access — extending
    redis_integration_test.go rather than introducing a second harness.

Closes #5184

Type of change

  • New feature

Test plan

  • Unit tests (task test)
  • Linting (task lint-fix)
  • Manual testing (describe below)

Ran the integration suite locally against a Docker-backed Redis Sentinel
cluster: go test -tags=integration ./pkg/authserver/storage/... (TTL,
round-trip, distinct-keys, overwrite, and concurrent-access cases all pass,
including the wire-level TTL == -1 assertion for never-expires rows).

API Compatibility

  • This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

Changes

File Change
pkg/authserver/storage/redis_keys.go Add KeyTypeDCR const and redisDCRKey length-prefixed key helper.
pkg/authserver/storage/redis.go Add storedDCRCredentials wire type, StoreDCRCredentials / GetDCRCredentials, pastExpiryDCRTTL bound, and DCRCredentialStore interface assertion.
pkg/authserver/storage/redis_test.go Add miniredis unit coverage: key encoding, round-trip, overwrite, validation, defensive copy, TTL (never / future / past), ErrNotFound, concurrent access.
pkg/authserver/storage/redis_integration_test.go Add Redis Sentinel integration coverage extending withIntegrationStorage: round-trip, distinct keys, overwrite, real-Redis TTL, concurrent access.

Does this introduce a user-facing change?

No. This is internal storage plumbing behind the DCRCredentialStore interface
introduced in sub-issue 1. Sub-issue 3 will wire EmbeddedAuthServer to select
the Redis backend via the existing storage_type: redis config toggle.

Special notes for reviewers

  • This branch went through two review iterations. The final state has zero
    CRITICAL / HIGH / MEDIUM findings. One LOW finding remains: a stale docstring
    on a unit-test helper in redis_test.go that still references the
    pre-pastExpiryDCRTTL behaviour. Happy to fix in this PR or as a follow-up
    if reviewers prefer to keep this PR focused on the storage primitive.
  • Past-expiry TTL behaviour is deliberately 1s rather than rejecting the
    write or storing long-lived. Rationale is in the pastExpiryDCRTTL constant
    doc comment and the StoreDCRCredentials docstring: caller's expiry
    timestamp still round-trips so a downstream reader can observe it and trigger
    re-registration, while the row self-evicts almost immediately. Worth a look
    during review to confirm the policy matches how sub-issue 3's resolver will
    consume it.
  • Unit tests use miniredis (already in go.mod from the surrounding
    redis_test.go suite). Integration tests use the existing testcontainers
    Redis Sentinel harness — both layers are required: the unit layer pins
    in-process semantics for task test, and the integration layer pins the
    wire contract (TTL returns -1 for "no TTL") that miniredis cannot
    faithfully reproduce.
  • No new dependencies were introduced.

@tgrunnagle tgrunnagle changed the base branch from main to dcr-3a_issue_5183 May 5, 2026 14:35
@github-actions github-actions Bot added the size/L Large PR: 600-999 lines changed label May 5, 2026
@tgrunnagle tgrunnagle force-pushed the dcr-3a_issue_5183 branch from c0fed52 to cc92472 Compare May 5, 2026 15:50
@tgrunnagle tgrunnagle force-pushed the dcr-3a_issue_5183 branch from cc92472 to 7aa7366 Compare May 6, 2026 14:33
@tgrunnagle tgrunnagle force-pushed the dcr-3b_issue_5184 branch from 43ead51 to b0bf320 Compare May 6, 2026 17:17
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels May 6, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 6, 2026

Codecov Report

❌ Patch coverage is 92.22222% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 67.81%. Comparing base (6b39256) to head (df0ad8f).

Files with missing lines Patch % Lines
pkg/authserver/storage/redis.go 89.39% 4 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5195      +/-   ##
==========================================
- Coverage   67.83%   67.81%   -0.02%     
==========================================
  Files         610      610              
  Lines       62303    62379      +76     
==========================================
+ Hits        42262    42302      +40     
- Misses      16875    16902      +27     
- Partials     3166     3175       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Base automatically changed from dcr-3a_issue_5183 to main May 7, 2026 15:28
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/L Large PR: 600-999 lines changed labels May 7, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

Copy link
Copy Markdown
Contributor Author

@tgrunnagle tgrunnagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multi-agent review summary

Reviewed by 5 specialist agents (backend equivalence, concurrency/TTL, security, test quality, general Go quality). Findings below survived consensus scoring (threshold ≥ 7/10).

Review event: COMMENT (PR is by the posting user and marked DRAFT — GitHub disallows self-APPROVE/REQUEST_CHANGES anyway).

Consensus summary

# Finding Severity Anchor Agent agreement
1 RedisStorage.StoreDCRCredentials validation set diverges from MemoryStorage; docstring's "same fail-loud contract" claim is false. Empty ScopesHash / ClientID / AuthorizationEndpoint / TokenEndpoint are rejected by Memory, accepted by Redis. HIGH redis.go 1421-1432 4 / 5
2 TTL test docstring ("row is written without a TTL") contradicts the past expiry uses bounded TTL subtest body and the production code's pastExpiryDCRTTL branch. Acknowledged in PR description. MEDIUM redis_test.go 2413-2415 2 / 5
3 *_ConcurrentAccess tests partition the keyspace by goroutine ID — every worker writes its own key, so concurrent same-key contention is not exercised. The Memory baseline has both a disjoint and an overlapping variant. MEDIUM redis_test.go 2493, redis_integration_test.go 1569 2 / 5
4 require.NoError(t, ...) is called from spawned goroutines in both *_ConcurrentAccess tests. Per Go testing docs, FailNow (which require.NoError invokes on failure) must be called only from the goroutine running the test function — behaviour from another goroutine is undefined. The Memory baseline uses an atomic.AddInt32(&errCount, 1) pattern instead. MEDIUM redis_test.go 2497, redis_integration_test.go 1573 1 / 5
5 redisDCRKey documents that ScopesHash is [0-9a-f] and never contains a colon, then appends it without a length prefix. The assumption isn't enforced anywhere — once finding #1 is fixed (validate empty ScopesHash on Store) most of the gap closes; full enforcement would also reject non-hex content or length-prefix the segment. LOW redis_keys.go 104-107 1 / 5
6 GetDCRCredentials docstring claims that an unpopulated key (empty Issuer / RedirectURI) cannot match any stored row "because StoreDCRCredentials rejects such keys". The reasoning is incomplete: it omits ScopesHash, and once #1 is fixed should be updated to enumerate all rejected components for symmetry. LOW redis.go 1483-1486 1 / 5

Holistic assessment

Backend interchangeability is the load-bearing contract here. The DCRCredentialStore interface (parent PR) explicitly defers to MemoryStorage's rejected-input list as canonical. The Redis backend's narrower validation breaks substitutability — a record with an empty ClientID or ScopesHash will fail loud against Memory and silently persist in Redis. The cleanest fix is extracting a free function validateDCRCredentialsForStore(*DCRCredentials) error and calling it from both backends so the validation set cannot drift again.

The TTL docstring drift (#2) and the *_ConcurrentAccess naming-vs-behaviour gap (#3) don't affect correctness today but degrade the test set's value as a regression guard. Both are cheap fixes.

The require.NoError-in-goroutine issue (#4) is a Go testing.T contract violation — not a current bug, but undefined-behaviour territory on test failure. The Memory baseline already does the right thing.

Forward-looking: when sub-issue 3 wires the embedded auth server to use this backend, the bounded-TTL past-expiry behaviour will be exercised under real workloads. A slog.Warn on that branch (currently silent) would give operators a signal they can correlate with upstream clock skew. Captured in the dropped-findings appendix as a follow-up.

Architecture notes (no action required)

  • Pattern parity with storedUpstreamTokens (Unix-epoch fields, length-prefixed keys, JSON wire format) is excellent.
  • No migration concern — KeyTypeDCR is a new key type with no overlap.
  • Security posture is sound: no secrets in errors, no logging of credentials, multi-tenant prefix correctly applied to every new key.
  • Codex cross-review: skipped (codex CLI not installed locally).

Dropped findings (consensus < 7/10)

  • ,omitempty on validated key fields divergence from storedUpstreamTokens
  • Past-expiry write should emit a slog.Warn
  • "Second-grain TTL is the broadly-tested path" rationale is technically misleading for the future-expiry branch (which uses PX)
  • GetDCRCredentials does not validate the unmarshaled stored struct (defensive-in-depth only)
  • NotFound test "fully empty key" subtest is redundant
  • Defensive-copy test is shape-fragile against future struct evolution
  • Past-expiry exposure window (1s) deserves explicit rationale-vs-reject doc
  • 10 s unit-test concurrent timeout could be tight under -race on slow CI
  • pastExpiryDCRTTL exact-equality assertion fragile against real-Redis clock drift (miniredis pin only)
  • dcrFixtureKey shared between unit/integration via build-tag visibility

🤖 Generated by /dev:pr-review

Comment thread pkg/authserver/storage/redis.go
Comment thread pkg/authserver/storage/redis.go Outdated
// a downstream reader can still observe it and trigger re-registration.
//
// Returns fosite.ErrInvalidRequest for nil creds, empty Issuer, or empty
// RedirectURI — the same fail-loud contract as MemoryStorage.StoreDCRCredentials.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[HIGH] This docstring claims "the same fail-loud contract as MemoryStorage.StoreDCRCredentials". As written, that's false — see the inline at the function body for the four missing validation checks.

Once the implementation is brought into parity, this comment becomes accurate again. Otherwise it must be rewritten to enumerate the actual narrower set and explicitly call out the asymmetry, and the interface docstring in types.go would also need updating to permit the divergence.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in df0ad8f.

Comment thread pkg/authserver/storage/redis_test.go Outdated
Comment thread pkg/authserver/storage/redis_test.go Outdated
Comment thread pkg/authserver/storage/redis_integration_test.go Outdated
Comment thread pkg/authserver/storage/redis_test.go Outdated
Comment thread pkg/authserver/storage/redis_integration_test.go Outdated
Comment thread pkg/authserver/storage/redis_keys.go
Comment thread pkg/authserver/storage/redis.go
@tgrunnagle tgrunnagle force-pushed the dcr-3b_issue_5184 branch from b0bf320 to 1736a6e Compare May 7, 2026 15:41
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/XL Extra large PR: 1000+ lines changed size/L Large PR: 600-999 lines changed labels May 7, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 7, 2026

✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up!

@github-actions github-actions Bot dismissed their stale review May 7, 2026 15:42

PR size has been reduced below the XL threshold. Thank you for splitting this up!

Addresses #5195 review comments:
- HIGH redis.go (3202731047): extract validateDCRCredentialsForStore as a
  shared free function in types.go; both MemoryStorage and RedisStorage
  delegate to it so the rejection set cannot drift across backends.
  Extends TestRedisStorage_DCRCredentials_StoreInvalidInputRejected to
  cover all 7 cases mirrored from the Memory baseline.
- HIGH redis.go (3202731049): rewrite the StoreDCRCredentials docstring
  to point at the shared validator now that the rejection set matches.
- MEDIUM redis_test.go (3202731051): apply the suggested replacement on
  TestRedisStorage_DCRCredentials_TTL — the third bullet now describes
  the bounded pastExpiryDCRTTL behaviour.
- MEDIUM redis_test.go + redis_integration_test.go (3202731061,
  3202731063): add an overlapping-key concurrent variant alongside the
  existing disjoint variant via a shared runDCRConcurrentAccess helper,
  matching the Memory baseline rationale.
- MEDIUM redis_test.go + redis_integration_test.go (3202731065,
  3202731066): the shared helper reports goroutine errors via
  atomic.AddInt32 checked from the test goroutine after wg.Wait(), so
  testing.T.FailNow is never invoked from a non-test goroutine.
- LOW redis_keys.go (3202731071): soften the redisDCRKey docstring's
  ScopesHash claim and reference the Store-side validator.
- LOW redis.go (3202731074): update GetDCRCredentials docstring to
  enumerate all rejected key components and add an "empty scopes_hash"
  subtest to TestRedisStorage_DCRCredentials_NotFound.
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels May 7, 2026
@tgrunnagle tgrunnagle marked this pull request as ready for review May 7, 2026 16:00
@tgrunnagle tgrunnagle requested review from JAORMX and jhrozek as code owners May 7, 2026 16:00
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels May 7, 2026
Copy link
Copy Markdown
Contributor

@jhrozek jhrozek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two findings from a fresh review pass.

Finding 1 (HIGH) — runtime panic on integration suite

Four DCR integration test functions panic at runtime because they call t.Parallel() and then withIntegrationStorage(t, ...) on the same *testing.T; the helper itself calls t.Parallel() (line 343), so the second invocation panics with testing: t.Parallel called multiple times. Reproduced locally with:

go test -tags=integration -run TestIntegration_DCRCredentials_DistinctKeysCoexist ./pkg/authserver/storage/
--- FAIL: TestIntegration_DCRCredentials_DistinctKeysCoexist (0.00s)
panic: testing: t.Parallel called multiple times [recovered, repanicked]

The other DCR integration tests (_RoundTrip, _TTL) get this right: they call t.Parallel() only on the outer test and let withIntegrationStorage handle parallelism inside t.Run subtests where t is a different instance. Inline suggestions below.

FYI on the CI side: this passed CI green because the Redis Sentinel suite is gated behind //go:build integration, and no GitHub Actions workflow builds with that tag — task test-coverage (which runs in .github/workflows/test.yml) excludes integration-tagged tests, and the existing task test-integration target isn't wired up. So the suite has zero CI coverage today; that's worth a separate fix but is orthogonal to this PR.

Finding 2 (MEDIUM) — stale forward references in types.go

Three pre-existing docstrings on the DCRCredentialStore interface still describe the Redis backend as future work. After this PR merges they will contradict the implementation, which violates the project's Keep Comments Synchronized With Code rule. (These lines aren't in the PR's diff hunks, so I can't attach inline suggestions — flagging here.)

pkg/authserver/storage/types.go:230-231 (DCRCredentials doc, # Lifetime section)

-// the periodic cleanup loop. The Redis backend (future sub-issue) applies
-// TTL via SetEX when ClientSecretExpiresAt is non-zero.
+// the periodic cleanup loop. The Redis backend applies TTL via SET with a
+// duration when ClientSecretExpiresAt is non-zero.

pkg/authserver/storage/types.go:266-267 (ClientSecretExpiresAt field doc)

-	// the persisted entry: the Redis backend (sub-issue 2) plumbs it through
-	// SetEX so the row evicts before the upstream rejects the secret at the
+	// the persisted entry: the Redis backend plumbs it through SET with a
+	// duration so the row evicts before the upstream rejects the secret at the

pkg/authserver/storage/types.go:275-276 (DCRCredentialStore interface doc)

-// dynamic-client-registration credentials. Both MemoryStorage and a future
-// Redis-backed store implement it; an authserver backed by Redis shares DCR
+// dynamic-client-registration credentials. Both MemoryStorage and
+// RedisStorage implement it; an authserver backed by Redis shares DCR

(Side note on SetEX: the implementation uses Set with a time.Duration, which go-redis emits as SET ... EX/PX — slightly inaccurate as written, but the bigger drift is the "future" framing.)


Other findings from the review (past-expiry backend divergence, Store-side defensive-copy parity test, t.Fatalf in goroutine, vacuous nolint:gosec, TTL-slack inconsistency) were considered and intentionally not surfaced — happy to discuss any of them if useful.

}

func TestIntegration_DCRCredentials_DistinctKeysCoexist(t *testing.T) {
t.Parallel()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HIGH — runtime panic on integration suite.

This t.Parallel() followed by withIntegrationStorage(t, ...) panics: the helper at line 343 already calls t.Parallel() on the same *testing.T. Reproduced:

--- FAIL: TestIntegration_DCRCredentials_DistinctKeysCoexist (0.00s)
panic: testing: t.Parallel called multiple times [recovered, repanicked]

Same bug at lines 1498, 1581, and 1588 (separate inline comments). Fix: drop this line — the helper handles parallelism.

Suggested change
t.Parallel()

More context (and a note on why CI didn't catch this) in the review summary.

}

func TestIntegration_DCRCredentials_OverwriteSemantics(t *testing.T) {
t.Parallel()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same t.Parallel()-then-withIntegrationStorage(t, ...) panic as line 1464. Drop this line.

Suggested change
t.Parallel()

t.Parallel()

t.Run("overlapping_key", func(t *testing.T) {
t.Parallel()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same t.Parallel()-then-withIntegrationStorage(t, ...) panic as line 1464. Drop this line.

Suggested change
t.Parallel()

})

t.Run("disjoint_keys", func(t *testing.T) {
t.Parallel()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same t.Parallel()-then-withIntegrationStorage(t, ...) panic as line 1464. Drop this line.

Suggested change
t.Parallel()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Persistent DCRCredentialStore: Redis backend

2 participants