@@ -111,16 +111,39 @@ hey -n 500 -c 50 -m POST \
111111 https://your-worker.workers.dev/webhooks/< source_id>
112112```
113113
114+ ## Rate Limiting
115+
116+ hookflare uses a two-layer rate limiter:
117+
118+ 1 . ** In-memory pre-check** (0ms) — fast rejection for requests clearly over the limit
119+ 2 . ** RateLimiter Durable Object** (5-20ms) — precise, globally consistent counter per source
120+
121+ The DO layer adds ~ 60ms to P50 latency but provides real protection:
122+ - One DO instance per source_id, serializable (no race conditions)
123+ - No storage writes (in-memory counter only — resets on DO eviction)
124+ - Graceful fallback: if DO is unavailable, in-memory check still works
125+
126+ | Metric | In-memory only | DO rate limiter (current) |
127+ | ---| ---| ---|
128+ | P50 Latency | 239ms | ** 303ms** |
129+ | Global accuracy | ❌ per-isolate | ✅ per-source |
130+ | Race conditions | Possible | None |
131+ | Error rate | 0% | ** 0%** |
132+
133+ Default: 100 requests per 60 seconds per source. Configurable. For strict global limits, use Cloudflare WAF rules.
134+
114135## Optimization History
115136
116137| Version | Avg Latency | RPS | Error Rate | Change |
117138| ---| ---| ---| ---| ---|
118139| v0.1 (6 sequential I/O) | 860ms | 21 | 8-20% | Baseline |
119140| v0.2 (deferred R2+D1) | 420ms | 38 | 20% | KV rate limiter still on hot path |
120- | ** v0.3 (in-memory rate limit)** | ** 265ms** | ** 149** | ** 0%** | ** Current** |
141+ | v0.3 (in-memory rate limit) | 265ms | 149 | 0% | Eliminated KV from hot path |
142+ | ** v0.4 (DO rate limiter)** | ** 303ms** | ** 60** | ** 0%** | ** Current — precise global rate limiting** |
121143
122144Key optimizations:
123145- Source lookup cached in-memory (60s TTL) — eliminates D1 read per request
124146- R2 write + D1 event creation deferred to queue consumer
125- - KV-based rate limiter replaced with in-memory counter — eliminates KV write per request
147+ - KV-based rate limiter replaced with DO-based — precise, zero race conditions, zero KV quota
126148- Queue send + KV idempotency write run in parallel
149+ - In-memory pre-check reduces DO calls by ~ 80% under normal traffic
0 commit comments