Skip to content

feat: implement is_negative_hits in rate limit descriptor#1140

Open
rudrakhp wants to merge 1 commit into
envoyproxy:mainfrom
rudrakhp:main
Open

feat: implement is_negative_hits in rate limit descriptor#1140
rudrakhp wants to merge 1 commit into
envoyproxy:mainfrom
rudrakhp:main

Conversation

@rudrakhp

Copy link
Copy Markdown
Member

Summary

Add support for the is_negative_hits field on rate limit descriptors. When set, the
hits_addend value decrements the rate limit counter instead of incrementing it, effectively
refilling previously consumed tokens. The counter is floored at 0 (cannot go negative).

  • Upgrade go-control-plane/envoy to v1.37.1 (adds IsNegativeHits proto field)
  • Redis: uses an atomic Lua script (EVAL) to decrement and floor at 0 in a single operation
  • Memcached: uses native Decrement which floors at 0 by default
  • Negative hits bypass over-limit and near-limit checks (always return OK)
  • New total_negative_hits stat tracks requested decrements per descriptor

Fixes envoyproxy/envoy#41219

Signed-off-by: Rudrakh Panigrahi <rudrakh97@gmail.com>
@arkodg

arkodg commented Jun 3, 2026

Copy link
Copy Markdown
Contributor

cc @collin-lee @psbrar99

@collin-lee collin-lee left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Redis Cluster Routing Bug (src/redis/fixed_cache_impl.go:41)

pipelineAppendDecrement passes the Lua script text as the key parameter to PipeAppend. In Redis Cluster mode, PipelineAction.Key is used to determine which slot/node to route to. This means EVAL commands would be routed incorrectly — potentially causing
MOVED/CROSSSLOT errors or silent misrouting when multiple negative-hit descriptors target different hash slots.

  1. Negative Hits Can Return OVER_LIMIT (src/redis/fixed_cache_impl.go:222)

The PR claims "Negative hits bypass over-limit checks (always return OK)", but this is only true for the local-cache over-limit check. In the response loop, if the post-decrement counter is still above the threshold (e.g., counter at 15, limit 10,
decrement 3 → result 12 > limit 10), GetResponseDescriptorStatus returns OVER_LIMIT. Same issue exists in memcached (src/memcached/cache_impl.go:146).

  1. Lua Script Returns new_val Instead of math.floor(new_val) (src/limiter/base_limiter.go:23)

Minor inconsistency: the script stores math.floor(new_val) via SET but returns new_val directly. Practically identical for integer math but easy to fix for defensive correctness.

Minor Issues

  • A decrement on a non-existent key creates a phantom key at value 0 with full TTL (functionally harmless but wastes memory)
  • No test coverage for negative hits with stopCacheKeyIncrementWhenOverlimit=true
  • No test for negative hits when the post-decrement value is still above the limit

Suggestions

  1. Fix cluster routing by passing the actual Redis key (not the script) as the routing key for EVAL commands
  2. Force OK response for negative hits in the response handling loop (or document that OVER_LIMIT is possible)
  3. Add test cases for the above edge cases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Enable per-user concurrency limits by supporting negative cost in the rate limiting protocol

3 participants