Skip to content

fix: cache regression#4269

Open
ulemons wants to merge 8 commits into
mainfrom
fix/regression-member-cache
Open

fix: cache regression#4269
ulemons wants to merge 8 commits into
mainfrom
fix/regression-member-cache

Conversation

@ulemons

@ulemons ulemons commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes multiple cache bugs in queryMembersAdvanced that were causing every member list count request to bypass Redis and hit Postgres directly. Also removes an unnecessary member_orgs CTE from the count query and replaces COUNT(DISTINCT m.id) with COUNT(*) where safe.

Changes

  • Fix inverted ternary on cachedCount (base.ts) — since commit 83bc45a18 (Jan 30), cachedCount was set to null whenever countOnly=true, making the check if (countOnly && cachedCount !== null) always false. Every count-only request was a DB hit. Restored the correct condition: countOnly ? await cache.getCount(cacheKey) : null.

  • Remove countOnly from cache key (queryCache.ts, base.ts) — count and full-result queries for the same params were hashed to different keys, so a full query never warmed the count cache and vice versa. Since the two caches live in separate Redis namespaces (members-advanced / members-count), the same key string is safe for both.

  • Normalize search: "" → null in cache key (base.ts) — buildSearchCTE treats "" and null identically (both produce an empty CTE), but "" was not filtered by the null/undefined check in buildCacheKey, creating duplicate cache entries for the same query.

  • Cross-populate count cache after full query (base.ts) — a countOnly=false query now also writes to members-count via Promise.all, so a subsequent countOnly=true request with the same params hits Redis instead of Postgres.

  • Remove unnecessary member_orgs CTE from count query (base.ts) — buildCountQuery was receiving includeMemberOrgs: include.memberOrganizations (always true for the member list endpoint), causing every count query to run a full GROUP BY scan on memberOrganizations. The count never selects org data; filterHasMo inside buildCountQuery already adds the join when the filter references mo.*.

  • COUNT(*) instead of COUNT(DISTINCT m.id) when withAggregates=true (queryBuilder.ts) — the join is on memberSegmentsAgg_pkey (unique on memberId + segmentId), so duplicates are impossible and DISTINCT is dead work.

  • Add error handling to setCount (queryCache.ts) — setCount had no try/catch unlike set, so a Redis failure during the new Promise.all would propagate and return a 500 even though data was already fetched successfully.

Type of change

  • Bug fix
  • New feature
  • Refactor / cleanup
  • Performance improvement
  • Chore / dependency update
  • Documentation

Note

Medium Risk
Changes core member list/count caching and SQL semantics; wrong count keys or COUNT(*) assumptions could show stale or incorrect totals, though invalidation and DISTINCT fallback for enrichment filters mitigate this.

Overview
Fixes queryMembersAdvanced cache behavior so member list and count requests actually hit Redis instead of Postgres on every call.

The PR corrects count-only cache lookup (was always null when countOnly=true), splits full-page vs count cache keys (buildCountCacheKey shares filter/search/segment only), normalizes empty search to null for stable keys, and writes counts when full queries run while skipping redundant COUNT(*) on later pages when the count is already cached. Count-only hits no longer trigger background refresh (avoids a DB query per request).

Query cost reductions: count queries no longer pull in member_orgs unless filters need mo.*; COUNT(*) replaces COUNT(DISTINCT m.id) when segment aggregates apply and enrichments aren't in the filter; list queries only join memberEnrichments when needed.

Operational hardening: Redis refresh locks dedupe background refreshes; structured logging for hits/misses/failures; on sync query failure, schedules background refresh for retry; optionsBgQx runs background work outside request transactions; setCount/getCount wrapped in try/catch like other cache ops; invalidation clears count and lock namespaces too.

Reviewed by Cursor Bugbot for commit 2e741fb. Bugbot is set up for automated code reviews on this repo. Configure here.

@ulemons ulemons self-assigned this Jun 26, 2026
Copilot AI review requested due to automatic review settings June 26, 2026 09:39
@ulemons ulemons added the Bug Created by Linear-GitHub Sync label Jun 26, 2026
@github-actions

Copy link
Copy Markdown
Contributor

⚠️ Jira Issue Key Missing

Your PR title doesn't contain a Jira issue key. Consider adding it for better traceability.

Example:

  • feat: add user authentication (CM-123)
  • feat: add user authentication (IN-123)

Projects:

  • CM: Community Data Platform
  • IN: Insights

Please add a Jira issue key to your PR title.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes a regression in the queryMembersAdvanced caching flow so countOnly member list requests can hit Redis again (and full-result queries can warm the count cache), while also reducing count-query cost in Postgres.

Changes:

  • Fixes countOnly count-cache lookup and aligns count/full queries on a shared cache key (separate Redis namespaces).
  • Warms the count cache after full queries and hardens setCount against Redis write failures.
  • Optimizes count SQL by avoiding unnecessary org CTE/join and using COUNT(*) when the aggregate join is guaranteed unique.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
services/libs/data-access-layer/src/members/queryCache.ts Removes countOnly from key generation and adds error handling to setCount (cache write hardening).
services/libs/data-access-layer/src/members/queryBuilder.ts Optimizes count expression to use COUNT(*) when withAggregates guarantees no duplicates.
services/libs/data-access-layer/src/members/base.ts Fixes cache read/write flow for count-only requests, normalizes cache key inputs, and cross-populates count cache after full queries.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread services/libs/data-access-layer/src/members/queryCache.ts
Comment thread services/libs/data-access-layer/src/members/base.ts Outdated
Copilot AI review requested due to automatic review settings June 26, 2026 10:42
@ulemons ulemons force-pushed the fix/regression-member-cache branch from 81c72ad to 63fab1d Compare June 26, 2026 10:42
Comment thread services/libs/data-access-layer/src/members/queryCache.ts

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment on lines 205 to 209
limit,
offset,
orderBy,
search,
search: search?.trim() || null,
segmentId,
Comment on lines 198 to 210
// Build cache key — countOnly is excluded so count and full-result caches share the same key.
// Normalize empty search to null so "" and null produce the same key (buildSearchCTE treats them identically).
const cacheKey = cache.buildCacheKey({
countOnly,
fields,
filter,
include,
includeAllAttributes,
limit,
offset,
orderBy,
search,
search: search?.trim() || null,
segmentId,
})
Copilot AI review requested due to automatic review settings June 26, 2026 10:47

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

offset,
orderBy,
search,
search: search?.trim() || null,
ulemons added 4 commits June 26, 2026 14:03
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Copilot AI review requested due to automatic review settings June 26, 2026 12:03
@ulemons ulemons force-pushed the fix/regression-member-cache branch from e44f7b6 to d626a1d Compare June 26, 2026 12:03
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

Copilot AI review requested due to automatic review settings June 26, 2026 12:09

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment on lines 246 to 254
if (countOnly && cachedCount !== null) {
refreshCountCacheInBackground(bgQx, redis, cacheKey, {
refreshCountCacheInBackground(bgQx, redis, countCacheKey, {
filter,
search,
search: normalizedSearch,
segmentId,
include,
includeAllAttributes,
attributeSettings,
})
Comment on lines 571 to 575
try {
log.info(`Refreshing members advanced query cache in background: ${cacheKey}`)
await executeQuery(qx, redis, cacheKey, params)
log.info(`Members advanced query cache refreshed in background: ${cacheKey}`)
} catch (error) {
Comment on lines 586 to 590
try {
log.info(`Refreshing members advanced count cache in background: ${cacheKey}`)
await executeQuery(qx, redis, cacheKey, { ...params, countOnly: true })
log.info(`Members advanced count cache refreshed in background: ${cacheKey}`)
} catch (error) {
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Comment thread services/libs/data-access-layer/src/members/base.ts
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Copilot AI review requested due to automatic review settings June 26, 2026 12:53

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 622439d. Configure here.

Comment thread services/libs/data-access-layer/src/members/queryBuilder.ts

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment on lines +249 to +252
log.info(
{ countCacheKey, segmentId, search: normalizedSearch },
'Members advanced count cache hit — returning cached count',
)
Comment on lines +30 to +48
// Returns true if lock was acquired (no other refresh in progress for this key).
// Uses a cryptographically random token to distinguish "we set it" from "already existed".
// TTL ensures the lock auto-expires if the refresh crashes without releasing it.
async tryAcquireRefreshLock(cacheKey: string, ttlSeconds = 90): Promise<boolean> {
try {
const token = randomBytes(16).toString('hex')
const stored = await this.lockCache.setIfNotExistsOrGet(cacheKey, token, ttlSeconds)
return stored === token
} catch {
return true // fail open: if Redis is down, let the refresh proceed
}
}

async releaseRefreshLock(cacheKey: string): Promise<void> {
try {
await this.lockCache.delete(cacheKey)
} catch {
// best effort
}
Signed-off-by: Umberto Sgueglia <usgueglia@contractor.linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Created by Linear-GitHub Sync

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants