Add CI benchmarks for all transports (HTTP, WebSocket, MCP, CLI) #299
Merged
evantahler merged 3 commits into main on Apr 2, 2026
Exercises the status action across every transport to detect performance regressions. Uses generous p95 thresholds (500ms HTTP/WS, 2s MCP, 10s CLI) that serve as "something is very wrong" guards rather than tight regression detectors. Prints detailed latency stats (min, avg, p50, p95, p99, max, req/s) for human review. Runs as a new parallel job in the CI workflow. https://claude.ai/code/session_01Xmj5JwvuzR9GQG7ntPBA5d
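As a rough illustration of the latency stats listed above, the following sketch computes min, avg, p50, p95, p99, max, and req/s from a list of per-request durations. The names `LatencyStats`, `percentile`, and `summarize` are hypothetical, and the nearest-rank percentile method is an assumption; the PR's actual benchmark code may compute these differently.

```typescript
// Latency summary for one transport. All durations are in milliseconds.
// These names are illustrative and are not taken from the PR.
interface LatencyStats {
  min: number;
  avg: number;
  p50: number;
  p95: number;
  p99: number;
  max: number;
  reqPerSec: number;
}

// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error("no samples");
  // Multiply before dividing so whole-number ranks stay exact.
  const rank = Math.ceil((p * sorted.length) / 100);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

// Summarize request durations collected over `wallClockMs` of wall-clock time.
function summarize(samplesMs: number[], wallClockMs: number): LatencyStats {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const total = sorted.reduce((sum, value) => sum + value, 0);
  return {
    min: sorted[0],
    avg: total / sorted.length,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    max: sorted[sorted.length - 1],
    reqPerSec: sorted.length / (wallClockMs / 1000),
  };
}
```

Sorting a copy keeps the caller's sample array untouched, which matters if the raw samples are also printed for human review.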
The resque test pushed test_action into api.actions.actions in beforeEach but never cleaned up. When mcpServer.test.ts ran in the same shard, the duplicate actions caused "Tool test_action is already registered" errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
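One way to avoid this kind of leak is to pair every registration with a cleanup function. The sketch below is a minimal, hypothetical stand-in: the `actions` array, the `Action` shape, and `registerTestAction` are assumptions for illustration and are not the project's real `api.actions.actions` API or the fix that was committed.

```typescript
// Hypothetical stand-in for a shared, process-wide action list.
interface Action {
  name: string;
}

const actions: Action[] = [{ name: "status" }];

// Register a test-only action and return a function that removes it again.
function registerTestAction(action: Action): () => void {
  actions.push(action);
  return () => {
    const index = actions.indexOf(action);
    // Guard so that calling the cleanup twice is harmless.
    if (index !== -1) actions.splice(index, 1);
  };
}
```

In a test file, the returned function would be stored in `beforeEach` and called in `afterEach`, so later tests in the same shard see the list in its original state.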
7c23af5 to 03f0bfc
HTTP: 500ms → 50ms, WebSocket: 500ms → 25ms, MCP: 2000ms → 100ms, CLI: 10000ms → 3000ms. Still 10-28x above measured p95 values to absorb GitHub Actions runner variance. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
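A minimal sketch of how the tightened thresholds could be enforced, assuming one measured p95 value per transport. The threshold numbers come from the commit message above; the `Transport` type, the `P95_THRESHOLDS_MS` constant, and `findViolations` are hypothetical names, not the PR's actual code.

```typescript
type Transport = "http" | "websocket" | "mcp" | "cli";

// p95 limits in milliseconds, as stated in the commit message above.
const P95_THRESHOLDS_MS: Record<Transport, number> = {
  http: 50,
  websocket: 25,
  mcp: 100,
  cli: 3000,
};

// Return one human-readable message per transport whose p95 exceeds its limit.
function findViolations(measuredP95Ms: Record<Transport, number>): string[] {
  const violations: string[] = [];
  for (const transport of Object.keys(P95_THRESHOLDS_MS) as Transport[]) {
    const limit = P95_THRESHOLDS_MS[transport];
    const measured = measuredP95Ms[transport];
    if (measured > limit) {
      violations.push(`${transport}: p95 ${measured}ms exceeds ${limit}ms`);
    }
  }
  return violations;
}
```

A CI job could fail the run when the returned array is non-empty, printing each message so the offending transport is obvious in the log.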