fix(beacon): add per-source relay rate limit to prevent queue monopolization (PILOT-306)#17

Open
matthew-pilot wants to merge 1 commit into main from openclaw/pilot-306-20260530-195707

Conversation

@matthew-pilot
Collaborator

What

Adds a per-source sliding-window rate limiter to dispatchRelay that caps each sender at 1000 relays/second.

Why

dispatchRelay currently pushes all relay jobs onto a single 524288-deep relayCh channel with no per-source budget. A malicious sender flooding relays to a known destination can saturate the queue and cause queue-full drops (relayDropped), squeezing out legitimate traffic.

Fix

  • New relaySourceWindow struct tracks per-source relay count in 1-second windows
  • Added to Server struct: relayRateMu + relaySourceCount map[uint32]*relaySourceWindow
  • Rate limit check in dispatchRelay after the destination pre-check, before buffer allocation and channel enqueue
  • Periodic cleanup of stale entries in reapStaleNodes
  • Follows the same pattern as the existing punch-request rate limiter (SEC-026)

Verification

  • go build ./... — clean
  • go vet ./... — clean
  • go test ./... — relay dispatch tests pass consistently (DispatchRelay_*); pre-existing flaky tests in the parallel UDP suite fail at the same rate on main

Scope

1 file, server.go, +38 code lines (+54 including comments).

Closes PILOT-306

…ization (PILOT-306)

dispatchRelay has no per-source budget — a malicious sender flooding
relays to a known destination can saturate the 524288-deep relayCh
and cause queue-full drops for legitimate traffic (SEC-037).

Add a per-source sliding-window rate limiter (max 1000 relays/sec)
with periodic cleanup in reapStaleNodes. The cap is generous enough
that legitimate multi-agent NAT sources won't hit it, but a DoS
source can no longer consume the entire queue. Follows the same
pattern as the punch-request rate limiter (SEC-026).

Closes PILOT-306
@codecov

codecov Bot commented May 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@matthew-pilot
Collaborator Author

📊 PR Status — #17 PILOT-306

| Field | Value |
| --- | --- |
| State | OPEN |
| Mergeable | ✅ MERGEABLE |
| Draft | No |
| Branch | openclaw/pilot-306-20260530-195707 → main |
| Files | 1 file (server.go), +54/−1 |
| Labels | (none) |

CI Checks (2/2 passing)

| Check | Result |
| --- | --- |
| test | ✅ pass |
| codecov/patch | ✅ pass |

Created

2026-05-30 20:07 UTC

@matthew-pilot
Collaborator Author

🔍 PR Explanation — #17 PILOT-306

What this does

Adds a per-source relay rate limit to the beacon server to prevent one sender from monopolizing the relay queue.

The problem

The beacon relay queue (relayCh) is 524288 deep and shared by all sources. A malicious or misconfigured source targeting a known destination can saturate this queue, causing queue-full drops for everyone — including legitimate relay traffic.

The fix

1. New per-source sliding window (relaySourceWindow)

  • Tracks relay count per sender ID in 1-second windows
  • Guarded by relayRateMu mutex

2. Rate cap: 1000 relays/sec per source

  • At ~0.2% of total queue capacity per second, one source cannot crowd out others
  • Honest daemons retry (3-attempt path in pkg/daemon/daemon.go relay branch), so drops are self-healing

3. Cleanup sweep

  • Stale source entries (>5 min without activity) are deleted in reapStaleNodes()
  • Prevents unbounded map growth

Files changed

  • server.go: +54/−1 — new fields, type, constants, rate-limit check in dispatchRelay, and cleanup in reapStaleNodes

@matthew-pilot
Collaborator Author

Status (auto)

  • PR: open, mergeable · branch openclaw/pilot-306-20260530-195707 → main
  • Canary: not run
  • Jira: PILOT-306 — QA/IN-REVIEW (Teodor Calin)
  • Last activity: 2026-05-30T20:22 UTC
