feat: bound cluster pipeline parallelism#1149
Conversation
Signed-off-by: dthuynh <dthuynh@axon.com>
139df7d to
47c0cae
Compare
|
@collin-lee could you help me to take a look on this? This change adds configurable bounded parallelism for Redis Cluster pipeline groups while preserving the current serial behavior by default. The intent is to reduce P99 latency for multi-descriptor requests where descriptors map to different cluster slots, without changing behavior for existing users. I’ve included benchmark results, resource impact, caveats, and test coverage in the PR description. The change is backward-compatible with REDIS_CLUSTER_PIPELINE_PARALLELISM=1 as the default. Happy to adjust the implementation or docs based on your feedback. Thanks! |
|
@agrawroh could you please take a look? |
|
What's the expected upper bound on distinct keys per request? If it can exceed REDIS_POOL_SIZE, unbounded mode risks pool starvation. Should context.Background() in PipeDo be replaced with the gRPC request context to enable deadline propagation to parallel Redis calls? |
…the parallelism to RedisPoolSize Signed-off-by: dthuynh <dthuynh@axon.com>
60cee1f to
99caa22
Compare
|
@collin-lee Thanks, both points are valid. For distinct keys: there is no hard protocol/code upper bound on distinct Redis keys per request. In practice it is bounded by the request descriptors/non-empty generated cache keys in a pipeline phase, but that can exceed REDIS_POOL_SIZE. I updated the implementation so this does not rely only on operator configuration. The configured cluster pipeline parallelism is normalized at Redis client creation:
For context propagation: agree. I updated PipeDo to accept the request context and pass it through the GET and INCR/EXPIRE Redis pipeline calls, so deadlines/cancellation now propagate to parallel Redis calls. |
The fork content was already synced via squash-merge in 95e672c (PR #33). This merge commit links the histories so GitHub no longer reports the fork as 100+ commits behind upstream. * upstream/main: (154 commits) Update to golang-1.26.4 and update golang.org/x/net to 0.55.0 (envoyproxy#1154) feat: bound cluster pipeline parallelism (envoyproxy#1149) fix: correct typos in memcache error messages and variable name (envoyproxy#1150) Update to golang 1.26.3 (envoyproxy#1152) Add quota mode to rate limit descriptor proto (envoyproxy#1148) feat: add retry in init phase instead of panic directly (envoyproxy#1144) Add integration test for quota based service selection. (envoyproxy#1114) build: pin golang:1.26.2 to multi-arch index digest (envoyproxy#1131) Update third party libraries flagged for vulnerability scans (envoyproxy#1124) feat: add zipkin b3 header propagation (envoyproxy#1110) Fix Prometheus response time units (envoyproxy#1104) Dockerfile: add ENTRYPOINT (envoyproxy#1095) Send user defined metadata to the client (envoyproxy#1112) build(deps): bump google.golang.org/grpc from v1.74.2 to v1.80.0 (envoyproxy#1111) Fix quota result when all limits were exceeded (envoyproxy#1059) Update golang references to 1.26.1 (envoyproxy#1091) Add integration test for token based quota (envoyproxy#1092) Add quota integration test (envoyproxy#1090) Add debug logging for quota values (envoyproxy#1089) Wait for sevices to be up before running tests (envoyproxy#1088) ...













Problem
In Redis Cluster mode,
clientImpl.executeGroupedPipelinegroups pipeline actions by key and then executes each key group serially. For a singleShouldRateLimitrequest that carries multiple descriptors whose keys map to different cluster slots, this adds one Redis round-trip per group. Request latency therefore scales with descriptor count even though the groups are independent.Concretely: a request with N descriptors hits N parallel Redis cluster slots but waits N × RTT instead of 1 × RTT.
Solution
Adds a new env var (and matching field on
Settings):REDIS_CLUSTER_PIPELINE_PARALLELISM=1REDIS_CLUSTER_PIPELINE_PARALLELISM=0errgroup.WithContext).REDIS_CLUSTER_PIPELINE_PARALLELISM>1A matching
REDIS_PERSECOND_CLUSTER_PIPELINE_PARALLELISMcovers the per-second pool.A
len(pipeline) == 1fast-path skips grouping entirely so N=1 callers see zero overhead.Backward compatibility
default = 1preserves upstream legacy serial behavior. No config-file changes, no metric changes, no public API changes.Test results
Bench: 4 RLS pods per variant, both running the same binary, differing only by
REDIS_CLUSTER_PIPELINE_PARALLELISMenv var. Backend: a managed Redis Cluster reached via TLS. Server-side latency measured viahistogram_quantile(0.99, sum by (le) (rate(ratelimit_service_response_time_seconds_bucket[1m]))).N=1 — confirms no regression (3 reps, 3000 RPS, 90s each)
=1(legacy)=8(bounded)Δ = +0.7% — within ambient noise. Both variants take the single-action fast-path at N=1, as designed.
N=2 @ 3000 RPS — typical multi-descriptor case (3 reps, 90s each)
=1(legacy)=8(bounded)Δ = −59% server-side P99 (and −62% on the bench client side; both metrics agree).
N=5 @ 1000 RPS — isolates the RTT-multiplier effect
=1(legacy)=8(bounded)Δ = −77%. The 4.3× ratio closely matches the theoretical 5× upper bound (legacy = N × RTT, parallel = 1 × RTT-batch).
Resource cost
ratelimit_redis_pool_cx_activeper podGoroutine increase is bounded by
concurrent in-flight pipelines × N(or× parallelismwhen bounded). At realistic load this is on the order of hundreds, not thousands.Caveat — when this does not help
Parallelism helps when the RLS-side pool / RTT is the bottleneck. When the Redis cluster itself is CPU-saturated (e.g., per-shard CPU > 75%), every Redis op queues at the shard regardless of how the client submits them, and the latency improvement disappears. In testing at N=5 + 3000 RPS, the managed Redis cluster hit 77–79% per-shard CPU and both variants converged to ~2300 ms P99. Operators should size Redis appropriately before relying on this knob to absorb high-N requests.
This is called out in the env-var docs in the README.
Tests
CI on the PR will run the full suite.
Files changed
Screenshots of latency / RPS / CPU charts attached in a follow-up comment.