
feat: add Redis Cluster support via REDIS_CLUSTER_MODE setting #3890

Open
bebakouma wants to merge 5 commits into IBM:main from bebakouma:feat/redis-cluster-support

Conversation

@bebakouma

Problem

When deploying ContextForge against a Redis Cluster (e.g. Bitnami redis-cluster Helm chart with multiple shards), the current standalone redis.asyncio.Redis client cannot handle MOVED/ASK redirects. This causes the application to crash-loop with redis.exceptions.MovedError on startup during leader election.

The error occurs because redis.asyncio.from_url() always creates a standalone client that connects to a single node. When a key lives on a different shard, Redis responds with a MOVED redirect that the standalone client cannot follow.
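To make the redirect concrete: a cluster node answers a command for a slot it does not own with an error of the form `MOVED <slot> <host>:<port>` (or `ASK ...` during a slot migration). The sketch below only parses such a message; `Redirect` and `parse_redirect` are hypothetical names for illustration and are not part of redis-py or this PR.

```python
from typing import NamedTuple, Optional


class Redirect(NamedTuple):
    """Parsed form of a Redis Cluster redirection error (illustrative type)."""
    kind: str   # "MOVED" or "ASK"
    slot: int   # hash slot the key belongs to
    host: str   # node that currently serves the slot
    port: int


def parse_redirect(message: str) -> Optional[Redirect]:
    """Return the redirect target, or None if the message is not a redirect."""
    parts = message.lstrip("-").split()
    if len(parts) != 3 or parts[0] not in ("MOVED", "ASK"):
        return None
    # rpartition keeps any colons inside the host part intact.
    host, _, port = parts[2].rpartition(":")
    return Redirect(parts[0], int(parts[1]), host, int(port))


if __name__ == "__main__":
    print(parse_redirect("MOVED 3999 127.0.0.1:6381"))
```

A cluster-aware client acts on this information by retrying against the indicated node; the standalone client surfaces it as an exception instead.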

Solution

Add a REDIS_CLUSTER_MODE boolean environment variable (default: false). When set to true, the redis_client factory creates a redis.asyncio.RedisCluster instance that automatically discovers all shards and routes commands to the correct node.
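A minimal sketch of the selection logic described above, assuming redis-py's `redis.asyncio` module is installed. The function names and the hand-rolled boolean parsing are assumptions for illustration; the PR itself reads the flag through the project's settings object, and its actual helpers may differ.

```python
import os
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes", "on"}


def cluster_mode_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read REDIS_CLUSTER_MODE, defaulting to false (illustrative parsing)."""
    env = os.environ if env is None else env
    return env.get("REDIS_CLUSTER_MODE", "false").strip().lower() in _TRUTHY


def create_redis_client(url: str, cluster_mode: bool):
    """Return a cluster-aware or standalone async client for the given URL."""
    # Imported lazily so this module can be loaded without redis-py installed.
    import redis.asyncio as aioredis

    if cluster_mode:
        # RedisCluster discovers the slot map and follows MOVED/ASK redirects.
        return aioredis.RedisCluster.from_url(url, decode_responses=True)
    # The standalone client talks to a single node only.
    return aioredis.from_url(url, decode_responses=True)
```

Keeping the flag explicit, rather than probing the server to detect cluster mode, means existing standalone deployments never take the new code path unless an operator opts in.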

Changes

  • config.py — new redis_cluster_mode setting with Field descriptor
  • utils/redis_client.py — refactored into _create_cluster_client() and _create_standalone_client() helpers; strips /N database path from URL in cluster mode (Redis Cluster only supports db 0); selects client type based on redis_cluster_mode setting

Usage

REDIS_CLUSTER_MODE=true
REDIS_URL=redis://:password@redis-cluster:6379   # no /0

Compatibility

  • Backward compatible — default is false, existing standalone deployments are unaffected
  • RedisCluster supports the same .set(), .get(), .ping(), .eval(), .publish(), .pubsub() APIs used throughout the codebase
  • PubSub works on RedisCluster (broadcasts to all nodes)
  • Leader election Lua script uses a single key, so it hashes to one slot (no cross-slot issues)
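The single-key point in the last bullet follows from how Redis Cluster maps keys to slots: CRC16 (XMODEM variant) of the key, or of its `{hash tag}` if one is present, modulo 16384. The sketch below reproduces that mapping with the standard library; `key_slot` is an illustrative name, and the example keys are not taken from the codebase.

```python
import binascii

HASH_SLOTS = 16384


def key_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot for a key."""
    data = key.encode()
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is CRC16/XMODEM.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


if __name__ == "__main__":
    # Keys sharing a hash tag land in the same slot, so a script may touch both.
    print(key_slot("{user1000}.following") == key_slot("{user1000}.followers"))
```

A script that touches exactly one key always resolves to one slot. Scripts that touch several keys would need a shared hash tag to avoid a cross-slot error.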

Environment

Tested against Bitnami redis-cluster Helm chart v12.0.2 (3 nodes) where the redis-cluster ClusterIP service load-balances across shards.

@bebakouma requested a review from crivetimihai as a code owner on March 27, 2026 11:54
@bebakouma force-pushed the feat/redis-cluster-support branch from 8e53d94 to 0f9ccda on March 27, 2026 12:46
bbakouma added 2 commits March 27, 2026 09:54
When deploying against a Redis Cluster (e.g. Bitnami redis-cluster Helm
chart), the current standalone redis.asyncio.Redis client cannot handle
MOVED/ASK redirects, causing the application to crash-loop with
redis.exceptions.MovedError.

This commit adds a REDIS_CLUSTER_MODE boolean setting (default: false).
When enabled, the redis_client factory creates a
redis.asyncio.RedisCluster instance instead of a standalone Redis
client. The cluster client automatically discovers all shards and
routes commands to the correct node.

Changes:
- config.py: Add redis_cluster_mode setting with Field descriptor
- utils/redis_client.py: Add _create_cluster_client() and
  _create_standalone_client() helpers; strip /N database path from
  URL in cluster mode (Redis Cluster only supports db 0); select
  client type based on redis_cluster_mode setting

Usage:
  REDIS_CLUSTER_MODE=true
  REDIS_URL=redis://:password@redis-cluster:6379  # no /0

Fixes crash-loop caused by MovedError when REDIS_URL points to a
Redis Cluster service with multiple shards.

Signed-off-by: bbakouma <bakoumaema@gmail.com>
…ation

- _strip_db_from_url now raises ValueError for non-zero database
  numbers (e.g. /1, /2) instead of silently stripping them, so
  misconfigurations fail fast
- Add TestStripDbFromUrl: tests for /0 stripping, no-path, slash-only,
  and non-zero DB rejection
- Add TestClusterMode: tests for cluster_mode=true (RedisCluster),
  cluster_mode=false (standalone), and missing attr fallback
- Update .env.example with REDIS_CLUSTER_MODE documentation

Signed-off-by: bbakouma <bakoumaema@gmail.com>
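The validation described in this commit could look roughly like the following. This is a sketch built on `urllib.parse`, not the PR's actual `_strip_db_from_url`; the function name and error message are assumptions, while the behaviour for `/0`, no path, a bare slash, and a non-zero database follows the commit message.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_db_from_url(url: str) -> str:
    """Drop a trailing /0 database path; reject any other database number."""
    parts = urlsplit(url)
    db = parts.path.strip("/")
    if db not in ("", "0"):
        # Redis Cluster only supports database 0, so fail fast on anything else.
        raise ValueError(
            f"Redis Cluster only supports database 0, got path {parts.path!r}"
        )
    return urlunsplit(parts._replace(path=""))


if __name__ == "__main__":
    print(strip_db_from_url("redis://redis-cluster:6379/0"))
```

Raising instead of silently stripping means a deployment that relied on `/1` or `/2` for isolation fails at startup rather than quietly sharing database 0.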
@bebakouma force-pushed the feat/redis-cluster-support branch from 0f9ccda to 3a3dc63 on March 27, 2026 14:55
@crivetimihai added the labels enhancement (New feature or request), SHOULD (P2: Important but not vital; high-value items that are not crucial for the immediate release), and performance (Performance related items) on Mar 29, 2026
@crivetimihai added this to the Release 1.1.0 milestone on Mar 29, 2026
@crivetimihai
Member

Thanks @bebakouma — clean implementation. The cluster/standalone factory split, _strip_db_from_url() validation, and test coverage are well done.

One security issue to fix:

The cluster-mode log line logs the full URL:

f"url={_strip_db_from_url(settings.redis_url)}"

If REDIS_URL contains a password (e.g., redis://:secret@host:6379), the password is logged in plaintext. The standalone path doesn't log the URL. Please mask or remove the URL from the log output.
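One possible way to mask the credential before logging, sketched with the standard library only. `mask_url_password` is a hypothetical helper name and is not part of this PR or of redis-py; dropping the URL from the log line entirely would also address the issue.

```python
from urllib.parse import urlsplit, urlunsplit


def mask_url_password(url: str) -> str:
    """Replace the password component of a URL with '***' for safe logging."""
    parts = urlsplit(url)
    userinfo, at, hostinfo = parts.netloc.rpartition("@")
    if not at or ":" not in userinfo:
        # No credentials, or a username without a password: nothing to hide.
        return url
    username = userinfo.partition(":")[0]
    return urlunsplit(parts._replace(netloc=f"{username}:***@{hostinfo}"))


if __name__ == "__main__":
    print(mask_url_password("redis://:secret@host:6379"))
```

The log call would then interpolate the masked value, for example `mask_url_password(_strip_db_from_url(settings.redis_url))`, so the host and port stay visible to operators while the secret does not.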

Minor suggestions (non-blocking):

  • Consider adding REDIS_CLUSTER_MODE to the Helm chart values (charts/) for parity
  • A brief mention in the deployment docs would help operators

DCO: Signed-off-by line is required on all commits — use git commit --amend -s.

Otherwise this looks ready for review. Nice work!

