Skip to content

Improve heartbeat resilience with consecutive failure tracking#37

Open
benrobson wants to merge 4 commits intostagingfrom
claude/fix-velocity-heartbeat-AkXqk
Open

Improve heartbeat resilience with consecutive failure tracking#37
benrobson wants to merge 4 commits intostagingfrom
claude/fix-velocity-heartbeat-AkXqk

Conversation

@benrobson
Copy link
Member

Summary

Enhanced the heartbeat monitoring system in both Velocity and Waterfall implementations to be more resilient by introducing consecutive failure tracking before disconnecting players. This prevents transient API failures from immediately disrupting the server.

Key Changes

  • Consecutive failure tracking: Added AtomicInteger to track consecutive heartbeat failures instead of immediately disconnecting on first failure
  • Configurable thresholds: Introduced MAX_CONSECUTIVE_FAILURES (3) and REQUEST_TIMEOUT_SECONDS (10) constants for easier tuning
  • Request timeout handling: Wrapped API requests in CompletableFuture with explicit timeout to prevent indefinite hangs
  • Improved logging: Enhanced error messages to show failure count and provide better diagnostics
  • Code deduplication: Extracted player disconnection logic into a shared kickAllPlayers() method to reduce code duplication
  • Better state management: Reset failure counter on successful heartbeat to distinguish between transient and persistent failures

Implementation Details

  • Players are only disconnected after 3 consecutive failures, allowing the system to tolerate temporary API issues
  • Request timeout is set to 10 seconds to prevent the heartbeat task from blocking indefinitely
  • Both Velocity and Waterfall implementations now follow the same resilience pattern
  • Logging now clearly indicates the progression of failures (e.g., "1/3", "2/3", "3/3")

https://claude.ai/code/session_01SRyGmN3Re6fQsQDMxeL9Z3

The Request library uses HttpClient with no timeout configured, so any
network hiccup causes execute() to block indefinitely and eventually
throw, which previously kicked all players on the very first failure.

Two fixes applied to both velocity and waterfall Heartbeat:
- Wrap req.execute() in a CompletableFuture with a 10s timeout so a
  slow/hung request fails fast rather than blocking forever
- Track consecutive failures with AtomicInteger; only kick all players
  after 3 consecutive failures, not on the first transient error

https://claude.ai/code/session_01SRyGmN3Re6fQsQDMxeL9Z3
@benrobson benrobson requested a review from JaedanC as a code owner March 9, 2026 03:28
@coderabbitai
Copy link

coderabbitai bot commented Mar 9, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d37c78a9-6a1a-4c66-8e10-a0594081e822

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/fix-velocity-heartbeat-AkXqk

Tip

Try Coding Plans. Let us write the prompt for your AI agent so you can ship faster (with fewer bugs).
Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

claude added 3 commits March 9, 2026 13:43
BaseAPIURL and APIKey were declared twice — once in the outer method
scope and again inside the lambda, causing a compilation error.
Removed the redundant inner declarations; the lambda captures them
from the outer scope.

https://claude.ai/code/session_01SRyGmN3Re6fQsQDMxeL9Z3
The Velocity plugin was manually formatting and broadcasting chat to
players, then calling ChatResult.denied() to suppress the default
forward. However, SignedVelocity (a required dependency) intercepts
signed chat packets at the network level and forwards them to the
backend before Velocity's event result is evaluated, causing both
the proxy-formatted message and the backend's chat plugin output to
appear simultaneously.

Reverted to the same approach used in the Waterfall version: let
chat pass through to the backend normally, only deny when the filter
blocks content. The backend's chat plugin handles formatting.

https://claude.ai/code/session_01SRyGmN3Re6fQsQDMxeL9Z3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants