The repo behind anycable.io/compare/nodejs-websocket. Five WebSocket setups, three questions, one rule: same Railway hardware for every row.
- Does it deliver messages when clients drop and reconnect?
- Does it survive a deploy?
- How many idle connections will a single instance hold?
Setups under test: default Socket.io, Socket.io + Connection State Recovery, uWebSockets.js, AnyCable OSS, AnyCable Pro.
Methodology, traps, and the bugs we caught in our own setup: docs/methodology.md. Below: the numbers and how to rerun them.
All numbers from one Railway region, Pro tier, 32 vCPU / 32 GB. Bench-runner shards run alongside the targets on the internal network so the driver is never the bottleneck.
Every client's TCP socket gets force-closed every ~15 s and stays offline ~2 s before reconnecting. ~8 jitter events per client over the 160 s test.
| | Default Socket.io | Socket.io + CSR | uWS | AnyCable OSS | AnyCable Pro |
|---|---|---|---|---|---|
| Deliveries lost | 184,449 | 0 | 153,805 | 0 | 0 |
| Delivery rate | 84.55% | 100% | 87.03% | 100% | 100% |
| CSR resume rate | n/a | 99.7% | n/a | n/a | n/a |
| Replay p50 (raw) | 106 ms | 148 ms | 92 ms | 250 ms | 261 ms |
| Replay p95 | 394 ms | 1.97 s | 0.72 s | 4.10 s | 4.10 s |
| Replay p99 | 1.07 s | 4.58 s | 1.72 s | 6.14 s | 6.15 s |
| Replay max | 1.75 s | 9.71 s | 2.95 s | 9.23 s | 9.36 s |
At-most-once protocols drop whatever landed during the offline window. Replay protocols deliver everything; CSR's tail is slightly shorter at p99 in this in-memory comparison. AnyCable wins elsewhere: separate Go process so app deploys don't sever connections, horizontal scaling via NATS or Redis with replay intact, native client-to-client whispers, Ruby + Rails + Node + Bun + Deno backends.
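The loss under at-most-once delivery follows from timing alone. Here is a minimal sketch, assuming a strictly periodic cycle (15 s online, 2 s offline) and a publisher that ticks every 500 ms. `countLost` and its parameter names are hypothetical and not taken from the benchmark code; the real runs include ramp-up and timing jitter that this model ignores.

```typescript
// Hypothetical model: one client, periodic forced disconnects, fixed publish interval.
interface JitterModel {
  durationMs: number; // total test length
  onlineMs: number;   // time connected before each forced close
  offlineMs: number;  // time spent offline before reconnecting
  intervalMs: number; // publish interval
}

// Counts messages published while the client is offline.
// An at-most-once protocol loses them; a replay protocol delivers them late.
function countLost(m: JitterModel): { sent: number; lost: number } {
  const cycleMs = m.onlineMs + m.offlineMs;
  let sent = 0;
  let lost = 0;
  for (let t = 0; t < m.durationMs; t += m.intervalMs) {
    sent++;
    if (t % cycleMs >= m.onlineMs) lost++;
  }
  return { sent, lost };
}

const modelled = countLost({ durationMs: 160000, onlineMs: 15000, offlineMs: 2000, intervalMs: 500 });
console.log(modelled); // { sent: 320, lost: 36 }
```

In this idealized model about 11% of messages land in an offline window. The measured losses in the table are somewhat higher, and the sketch does not try to account for that difference.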
| | Socket.io | AnyCable |
|---|---|---|
| Connections dropped | 5,000 (100%) | 0 |
| Recovery p50 | 4,967 ms | 0 ms |
| Recovery p95 | 5,992 ms | 0 ms |
| Clients that never came back | 189 (3.8%) | 0 |
| Total downtime | ~6.8 s | 0 s |
In-memory CSR can't save you here. Server state is lost on restart. Redis Streams CSR keeps the state, but the connections themselves still all sever. The avalanche is architectural.
Fifty bench-runner shards, each in its own Railway container with its own source IP and ~64K outbound-port pool. The 1M idle test fans out across all of them. The per-IP port ceiling is the reason single-machine 1M is hard; this is how we get past it without kernel tuning.
Single-instance target, 32 vCPU / 32 GB, one stream subscription per connection. Headline run: 1,000,000 idle clients via 25 × 40,000 shards.
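The per-IP port ceiling turns the shard count into simple arithmetic. Below is a rough sketch; `shardsNeeded` is a hypothetical helper, and the 64,000 constant is only the approximate pool size quoted above.

```typescript
// Approximate outbound-port pool per source IP, as quoted in the text.
const PORT_POOL_PER_IP = 64000;

// Hypothetical helper: how many shards (source IPs) a target connection count needs.
function shardsNeeded(totalConns: number, perShard: number): number {
  if (perShard <= 0 || perShard > PORT_POOL_PER_IP) {
    throw new RangeError("perShard must be between 1 and " + PORT_POOL_PER_IP);
  }
  return Math.ceil(totalConns / perShard);
}

console.log(shardsNeeded(1000000, 40000)); // 25
console.log(shardsNeeded(1000000, 20000)); // 50
```

Either split reaches 1M; fewer connections per shard leaves more headroom under the port ceiling on each container.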
| Server | Held | Peak memory | Peak CPU | Wall |
|---|---|---|---|---|
| Socket.io 4.x (Node 22) | 119,826 | 6.3 GB | 1.34% (1 core) | single Node event loop |
| anycable-go (OSS) | 993,994 | 32 GB (box ceiling) | 12.22% (~3.9 vCPU) | physical RAM |
| anycable-go-pro v1.6.13 | 999,954 | 19.34 GB | 9.37% (~3.0 vCPU) | nothing, 13 GB still free |
33 KB per connection OSS, 19 KB Pro at 1M. Pro is ~1.7× more efficient at this scale and ~2.4× at 200K. Socket.io's wall is architectural: handshakes serialize through one event loop regardless of memory. To reach 1M with Socket.io you run many Node processes behind a Redis adapter (~10K–30K per process).
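The per-connection figures are peak memory divided by held connections. A quick sketch of that arithmetic follows; the decimal unit convention (1 GB = 1e9 bytes, 1 KB = 1e3 bytes) is an assumption, so the results land near the quoted numbers rather than exactly on them.

```typescript
// Peak memory divided by held connections.
// Assumption: decimal units (1 GB = 1e9 bytes, 1 KB = 1e3 bytes).
function kbPerConnection(peakGb: number, held: number): number {
  return (peakGb * 1e9) / held / 1e3;
}

const ossKb = kbPerConnection(32, 993994);    // OSS row from the table
const proKb = kbPerConnection(19.34, 999954); // Pro row from the table

console.log(ossKb.toFixed(1), proKb.toFixed(1), (ossKb / proKb).toFixed(2));
```

Note that the OSS row hit the box's RAM ceiling, so its per-connection value is bounded by the hardware rather than measured freely.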
OSS scaling line on the same box:
| Idle conns | OSS memory | OSS CPU |
|---|---|---|
| 1K | 280 MB | 0% |
| 10K | 280 MB | 0% |
| 20K | 751 MB | 1.08% (~0.3 vCPU) |
| 50K | 1.98 GB | 1.08% (~0.3 vCPU) |
| 100K | 4.18 GB | 1.62% (~0.5 vCPU) |
| 200K | 8.35 GB | 2.63% (~0.8 vCPU) |
| 994K | 32 GB | 12.22% (~3.9 vCPU) |
Three knobs that shape the numbers. Full reasoning in docs/methodology.md.
- Default Socket.io's offline window is floored to ~2 s (`MIN_OFFLINE_MS` in `lib/core/timing.ts`). Otherwise the manual reconnect path stays offline only ~1 s and underreports the loss a real `socket.io-client` user sees with default `reconnection: true`. CSR, AnyCable, and uWS land near 2 s on their own because their client-library backoffs dominate. Same disruption shape across the board.
- CSR runs with the in-memory adapter. Simplest opt-in path. Redis Streams or MongoDB shift the tail by adding network RTT; structural picture holds. CSR is documented as incompatible with the Redis pub/sub adapter, so the "Redis adapter" most teams reach for first is the one CSR can't use.
- AnyCable's jitter-row RAM is the tradeoff for parallel replay. Its history buffer is per-stream so `history` parallelises across streams; that costs more RAM during jittery runs. Page-level RAM-per-connection comes from the idle test, where the per-connection footprint is what's measured.
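The offline floor from the first knob can be pictured as a clamp on the reconnect delay. This is only a sketch: the real constant lives in `lib/core/timing.ts`, and both the 2000 ms value and the `remainingOfflineMs` helper below are assumptions made for illustration.

```typescript
// Assumed value, taken from the "~2 s" in the text; the real constant may differ.
const MIN_OFFLINE_MS = 2000;

// Hypothetical helper: how much longer a client must stay offline before it
// is allowed to reconnect, given when its socket was force-closed.
function remainingOfflineMs(
  disconnectedAtMs: number,
  nowMs: number,
  floorMs: number = MIN_OFFLINE_MS
): number {
  const elapsedMs = nowMs - disconnectedAtMs;
  return Math.max(0, floorMs - elapsedMs);
}

console.log(remainingOfflineMs(0, 1000)); // 1000: a ~1 s reconnect path waits one more second
console.log(remainingOfflineMs(0, 2500)); // 0: already past the floor
```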
benchmark/
├── docker-compose.yml  # Local Socket.io + anycable-go
├── railway.toml  # Railway deploy config
├── docs/
│   ├── methodology.md  # How we built it and why
│   ├── railway-ops.md  # Resize, redeploy, fleet, pause
│   └── env.md  # Full env-var reference
└── backend/
    ├── Dockerfile  # One image; SERVICE_ENTRY picks the entry point
    ├── package.json
    ├── results/  # CSV/JSON output (gitignored)
    └── src/
        ├── publisher.ts  # Standalone HTTP publisher (legacy)
        ├── socketio/server.ts  # /_broadcast + /publish-local
        ├── uws/server.ts  # uWebSockets.js comparison server
        ├── standalone-publisher/server.ts  # Publisher as a separate Railway service
        ├── bench-runner/server.ts  # Bearer-auth HTTP wrapper, deployed N times
        ├── bench/  # Driver scripts; each is `npm run bench:<name>`
        │   ├── jitter-*.ts, avalanche-*.ts, throughput-*.ts, deploy-impact-*.ts
        │   ├── whispers.ts, whispers-multi.ts, idle-multi.ts
        │   ├── latency-trace-anycable.ts, jitter-anycable-trace.ts
        │   ├── fetch-jitter-metrics.ts, railway-metrics.ts
        │   ├── tests-manifest.ts  # Canonical list of every rebaseline test
        │   ├── rebaseline.ts  # Walk manifest, regress vs baselines
        │   └── rebaseline-history.ts  # Per-metric trend across runs
        └── lib/
            ├── jitter-runners.ts, jitter-uws.ts, jitter-anycable-traced.ts
            ├── avalanche-runner.ts, avalanche-uws.ts
            ├── deploy-impact-runner.ts
            ├── standalone-deploy-impact-runner.ts
            ├── standalone-deploy-impact-anycable-runner.ts
            ├── whispers-runner.ts, throughput.ts, idle-runner.ts
            ├── anycable-trace.ts
            └── core/  # Every runner imports from here
                ├── params, stats, timing, peak-rss, log, results-dir, chart
                ├── bench-runner-client  # Driver-side bearer-token fetch
                ├── job-queue  # bench-runner async job state
                ├── shard-coordinator  # Multi-shard fan-out + merge
                └── railway-api  # Railway GraphQL
Local CLI scripts and the Railway-hosted HTTP endpoints call the same runJitter* / runThroughput* functions. Numbers are produced by one code path; only the trigger differs.
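The "one code path, two triggers" idea can be sketched as two thin adapters feeding one runner. Everything below is illustrative: `runJitterSketch`, the `JitterParams` field names, the defaults, and the mapping between environment variables and query keys (`NUM_CLIENTS`/`n`, `DURATION`/`duration`, `TOTAL_MESSAGES`/`msgs`, `INTERVAL_MS`/`interval`) are assumptions inferred from the examples in this README, not the repo's actual code.

```typescript
// Shared parameter shape; field names are hypothetical.
interface JitterParams {
  numClients: number;
  durationSec: number;
  totalMessages: number;
  intervalMs: number;
}

// Stand-in for the shared runner. The real runJitter* functions open sockets
// and publish messages; this one only echoes its input.
function runJitterSketch(p: JitterParams): string {
  return `clients=${p.numClients} duration=${p.durationSec}s msgs=${p.totalMessages} interval=${p.intervalMs}ms`;
}

// CLI trigger: read parameters from environment variables.
function paramsFromEnv(env: Record<string, string | undefined>): JitterParams {
  return {
    numClients: Number(env["NUM_CLIENTS"] ?? "50"),
    durationSec: Number(env["DURATION"] ?? "60"),
    totalMessages: Number(env["TOTAL_MESSAGES"] ?? "60"),
    intervalMs: Number(env["INTERVAL_MS"] ?? "500"),
  };
}

// HTTP trigger: read the same parameters from a query string.
function paramsFromQuery(query: string): JitterParams {
  const q = new URLSearchParams(query);
  return {
    numClients: Number(q.get("n") ?? "50"),
    durationSec: Number(q.get("duration") ?? "60"),
    totalMessages: Number(q.get("msgs") ?? "60"),
    intervalMs: Number(q.get("interval") ?? "500"),
  };
}

const fromCli = runJitterSketch(
  paramsFromEnv({ NUM_CLIENTS: "50", DURATION: "60", TOTAL_MESSAGES: "60", INTERVAL_MS: "500" })
);
const fromHttp = runJitterSketch(paramsFromQuery("n=50&duration=60&msgs=60&interval=500"));
console.log(fromCli === fromHttp); // true
```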
Requires Node.js 22+ and either Docker or a local anycable-go binary (brew install anycable-go).
cd backend
npm install

Two server terminals:
# Terminal 1: Socket.io (no CSR by default; set SOCKETIO_CSR=1 for CSR)
npm run dev:socketio # :3000
# Terminal 2: anycable-go
anycable-go --port 8080 --broker=memory --presets=broker --public

Third terminal, run any variant. Each script publishes its own messages.
# Default Socket.io
SOCKETIO_URL=http://localhost:3000 NUM_CLIENTS=50 DURATION=60 \
TOTAL_MESSAGES=60 INTERVAL_MS=500 \
npm run bench:jitter:socketio
# Socket.io + CSR (server must run with SOCKETIO_CSR=1)
SOCKETIO_URL=http://localhost:3000 NUM_CLIENTS=50 DURATION=60 \
TOTAL_MESSAGES=60 INTERVAL_MS=500 \
npm run bench:jitter:socketio-csr
# AnyCable
ANYCABLE_URL=ws://localhost:8080/cable BROADCAST_URL=http://localhost:8090/_broadcast \
NUM_CLIENTS=50 DURATION=60 TOTAL_MESSAGES=60 INTERVAL_MS=500 \
npm run bench:jitter:anycable

Output: delivery rate, jitter event count, latency percentiles (raw + min-normalized), runner peak RSS.
Local avalanche (Socket.io spawns, gets killed, restarts):
npm run build
NUM_CLIENTS=1000 PORT=4000 npm run bench:avalanche:socketio

For AnyCable the test confirms nothing happens (separate process, no disruption):
anycable-go --port 8080 --broker=memory --presets=broker --public &
NUM_CLIENTS=1000 ANYCABLE_URL=ws://localhost:8080/cable \
BROADCAST_URL=http://localhost:8090/_broadcast \
npm run bench:avalanche:anycable

Deploy three services in one project:
- `socketio-server` from this repo (`backend/Dockerfile`, `SERVICE_ENTRY=socketio/server`). Serves `/_broadcast` for per-message HTTP publishing and `/publish-local` for the diagnostic in-process emit path.
- `anycable-go` from the official image (`anycable/anycable-go:latest`) with `ANYCABLE_BROKER=memory`, `ANYCABLE_PRESETS=broker`, `ANYCABLE_PUBLIC=true`, `ANYCABLE_HTTP_BROADCAST_SECRET=<your-secret>`.
- `bench-runner` from this repo with `SERVICE_ENTRY=bench-runner/server`, `ANYCABLE_BROADCAST_SECRET=<same secret>`, `BENCH_RUNNER_TOKEN=<random 32+ chars>`.
Give bench-runner a public domain. With BENCH_RUNNER_TOKEN set, every /bench-* and /jobs/* request needs Authorization: Bearer <token>; /health stays open for Railway probes.
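The access rule above can be written as a small pure function. This is a hedged sketch, not the bench-runner's actual middleware: `isAuthorized` is a hypothetical name, and treating an unset token as "no auth required" is an assumption read from the phrase "with BENCH_RUNNER_TOKEN set".

```typescript
// Hypothetical access check mirroring the rule described above.
function isAuthorized(
  path: string,
  authHeader: string | undefined,
  token: string | undefined
): boolean {
  // /health always stays open for platform probes.
  if (path === "/health") return true;
  // Assumption: with no token configured, nothing is protected.
  if (!token) return true;
  const isProtected = path.startsWith("/bench-") || path.startsWith("/jobs/");
  if (!isProtected) return true;
  return authHeader === `Bearer ${token}`;
}

console.log(isAuthorized("/health", undefined, "s3cret"));                       // true
console.log(isAuthorized("/bench-jitter-anycable", undefined, "s3cret"));        // false
console.log(isAuthorized("/bench-jitter-anycable", "Bearer s3cret", "s3cret"));  // true
```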
# AnyCable @ 10K
curl --max-time 320 -X POST \
-H "Authorization: Bearer $BENCH_RUNNER_TOKEN" \
"https://bench-runner-production.up.railway.app/bench-jitter-anycable?n=10000&duration=200&msgs=120&interval=500&jitter=15&jitterMs=1000&ramp=300&stream=run-ac"
# Default Socket.io @ 10K
curl --max-time 320 -X POST \
-H "Authorization: Bearer $BENCH_RUNNER_TOKEN" \
"https://bench-runner-production.up.railway.app/bench-jitter-socketio?n=10000&duration=200&msgs=120&interval=500&jitter=15&jitterMs=1000&ramp=300&stream=run-d"
# Socket.io + CSR @ 10K
curl --max-time 320 -X POST \
-H "Authorization: Bearer $BENCH_RUNNER_TOKEN" \
"https://bench-runner-production.up.railway.app/bench-jitter-socketio-csr?n=10000&duration=200&msgs=120&interval=500&jitter=15&jitterMs=1000&ramp=300&stream=run-csr"

Default mode is sync (blocks until the run finishes, typically 3 to 5 minutes at 10K). Add `async=1` to the query string to get back `{jobId}` and poll `/jobs/:id` instead.
Response shape: deliveryRatePct, lostDeliveries, expectedDeliveries, receivedDeliveries, jitterEvents, csrResumes, csrResumeRatePct, connectFailures, latencyRawMs and latencyOverMinMs ({avg,p50,p95,p99,max} plus skewFloor), runnerPeakRssMb.
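To make the raw versus min-normalized distinction concrete, here is one plausible reading of it: subtract the smallest observed latency from every sample and report that minimum as the skew floor. The helper names, the nearest-rank percentile, and the interpretation itself are assumptions; the repo's own `stats` module may compute these differently.

```typescript
interface LatencySummary {
  avg: number;
  p50: number;
  p95: number;
  p99: number;
  max: number;
}

// Nearest-rank percentile over an ascending-sorted, non-empty array.
function percentile(sorted: number[], p: number): number {
  const rank = Math.ceil((p / 100) * sorted.length);
  const index = Math.min(sorted.length - 1, Math.max(0, rank - 1));
  return sorted[index];
}

function summarize(samples: number[]): LatencySummary {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const avg = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  return {
    avg,
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    max: sorted[sorted.length - 1],
  };
}

// Assumed definition: the minimum sample is treated as clock skew plus the
// network floor, and is subtracted from every sample.
function minNormalize(rawMs: number[]): { skewFloor: number; overMinMs: number[] } {
  if (rawMs.length === 0) throw new Error("no samples");
  const skewFloor = rawMs.reduce((min, x) => (x < min ? x : min), rawMs[0]);
  return { skewFloor, overMinMs: rawMs.map((x) => x - skewFloor) };
}

const rawSamples = [110, 105, 130, 250, 108];
const normalized = minNormalize(rawSamples);
console.log(normalized.skewFloor);            // 105
console.log(summarize(normalized.overMinMs)); // p50 is 5, max is 145
```

A large or negative `skewFloor` under this reading would point at clock disagreement between the publisher and the subscriber host rather than at real delivery latency.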
SOCKETIO_URL=https://your-socketio.up.railway.app NUM_CLIENTS=5000 \
npm run bench:avalanche:railway

When the script reports "All clients connected", from a second terminal:
railway restart -s socketio-server --yes

/bench-idle-anycable, /bench-idle-socketio, /bench-idle-uws open N raw WebSockets to the target via internal network, hold for holdSec, return final counts.
Single shard (up to ~50K):
curl --max-time 600 -X POST -H "Authorization: Bearer $BENCH_RUNNER_TOKEN" \
"https://bench-runner.up.railway.app/bench-idle-anycable?n=50000&hold=120&ramp=300"

Each Linux container has a ~64K per-source-IP outbound port pool, capping any single shard around 50K useful connections.
Multi-shard (100K to 1M) fans out across the 50-shard fleet. Deploying the fleet is in docs/railway-ops.md. Then:
SHARDS=$(printf 'https://bench-runner-production.up.railway.app'
for i in $(seq 2 50); do
printf ',https://bench-runner-%s-production.up.railway.app' "$i"
done)
# 1M idle against anycable-go on 32 vCPU / 32 GB
SHARDS="$SHARDS" \
PER_SHARD_N=20000 HOLD_SEC=120 RAMP_PER_SEC=200 \
PROJECT_ID=<railway-project-uuid> SERVICE_ID=<anycable-go-service-uuid> \
SERVICE_NAME=anycable-go \
npm run bench:idle:multi
# Other targets: TARGET=socketio + SERVER_URL=...
# TARGET=uws + UWS_WS_URL=...

PROJECT_ID and SERVICE_ID are optional. Without them the script reports aggregate counts; with them it pulls memory and CPU from Railway and writes a CSV.
A tests manifest lives at backend/src/bench/tests-manifest.ts. One command walks it:
cd backend
BENCH_RUNNER_URL=https://bench-runner-production.up.railway.app \
BENCH_RUNNER_TOKEN=<your-token> \
npm run bench:rebaseline

Hits each bench-runner endpoint, writes results to tmp/v1.6.14-bench-results/{id}.json, prints a delta-vs-baseline report. Drift past the threshold goes yellow (drift); a delivery drop or a key-metric breach goes red (regress) and exits non-zero.
FILTER=jitter # only jitter tests
FILTER=latency-anycable # only AnyCable latency
FILTER=jitter,whispers # comma-separated categories or substrings
DRY_RUN=1 # print the plan
INCLUDE_IDLE=1 # add 4 idle tests (multi-shard, ~16 min)
INCLUDE_AVALANCHE=1 # add 5 avalanche tests (auto-redeploys server)
Full sweep: ~90 minutes wall-clock.
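The yellow/red rule of the rebaseline report can be sketched as a three-way classifier. The field names, the percentage threshold, and the hard limit below are hypothetical; the actual thresholds and key metrics are defined in `tests-manifest.ts` and `rebaseline.ts`.

```typescript
type Verdict = "ok" | "drift" | "regress";

interface MetricCheck {
  baseline: number;          // value recorded in the manifest
  current: number;           // value from this run
  driftThresholdPct: number; // hypothetical: relative worsening that turns the row yellow
  hardLimit?: number;        // hypothetical: absolute ceiling that turns the row red
}

// Classifier for "lower is better" metrics such as latency.
function classify(check: MetricCheck): Verdict {
  if (check.hardLimit !== undefined && check.current > check.hardLimit) return "regress";
  const worsePct = ((check.current - check.baseline) / check.baseline) * 100;
  return worsePct > check.driftThresholdPct ? "drift" : "ok";
}

// Any red row makes the whole run exit non-zero.
function exitCode(verdicts: Verdict[]): number {
  return verdicts.some((v) => v === "regress") ? 1 : 0;
}

const verdicts: Verdict[] = [
  classify({ baseline: 100, current: 105, driftThresholdPct: 20 }),                 // "ok"
  classify({ baseline: 100, current: 130, driftThresholdPct: 20 }),                 // "drift"
  classify({ baseline: 100, current: 600, driftThresholdPct: 20, hardLimit: 500 }), // "regress"
];
console.log(verdicts, exitCode(verdicts));
```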
Baselines vs the page numbers. The page was captured during a noisy Railway shared-tenant window. The baseline field in tests-manifest.ts is what the same tests deliver on a quieter window: latencies ~50% better, everything else the same. So the page is the cautious "worst seen under shared-infra load" view, and the rebaseline tests against today's quieter floor. A green rebaseline says "we still beat today's floor", which is stricter than the page promises. When we refresh the page, baselines and page numbers move together.
Per-run history lives at tmp/v1.6.14-bench-results/runs/{ISO-ts}/. To watch each headline number move across runs:
npm run bench:rebaseline:history
LAST=10 FILTER=jitter npm run bench:rebaseline:history

Most-used: BENCH_RUNNER_URL and BENCH_RUNNER_TOKEN for any Railway driver; NUM_CLIENTS, DURATION, JITTER_INTERVAL, JITTER_DURATION, TOTAL_MESSAGES, INTERVAL_MS, RAMP_RATE for the jitter scripts; SOCKETIO_URL / ANYCABLE_URL / BROADCAST_URL to retarget local scripts; RESULTS_DIR to redirect CSV/JSON output. Full reference: docs/env.md.
Railway infrastructure recipes (resize, redeploy, deploy the 50-shard fleet, pause-after-tests cost control) live in docs/railway-ops.md.
- WebSocket-only transport on both sides. No long-polling fallback for Socket.io.
- AnyCable runs with the in-memory broker here. Production typically wants NATS or Redis for restart durability and multi-node fan-out.
- Publisher and subscribers are different processes by design (real-world latency model). Latency is reported both raw and min-normalized; the compare page uses the min-normalized view, with `skewFloor` exposed so you can spot a bad clock.
- Multi-shard min-normalization assumes each shard sees one fast sample. Reliable at 3 to 5 minute test windows; below ~60 s, sanity-check `skewFloor` per shard before trusting the merged p99.
- One bench-runner saturates around 50K subscribers (Node event-loop work, not memory). Above that, fan out via `bench/idle-multi.ts` or `bench/jitter-multi.ts`.
- Math.random isn't seeded. Runs reproduce statistically, not bit-for-bit. A few-ms p99 drift between runs is expected.
- `deliveryRatePct` denominator is `totalMessages × clients`. The JSON also carries `deliveryRateOfConnectedPct` so a run with connect failures isn't silently capped below 100%. In healthy runs (`connectFailures: 0`) the two are identical.
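The two delivery-rate denominators are easiest to see side by side. In this sketch the first formula follows the description above; the second (dividing by the clients that actually connected) is an assumption about how `deliveryRateOfConnectedPct` is derived, and the `DeliveryCounts` shape is hypothetical.

```typescript
interface DeliveryCounts {
  totalMessages: number;      // messages published during the run
  clients: number;            // clients requested
  connectFailures: number;    // clients that never connected
  receivedDeliveries: number; // message receipts summed over all clients
}

function deliveryRates(c: DeliveryCounts): {
  deliveryRatePct: number;
  deliveryRateOfConnectedPct: number;
} {
  const expectedAll = c.totalMessages * c.clients;
  // Assumption: the "connected" variant excludes clients that never connected.
  const expectedConnected = c.totalMessages * (c.clients - c.connectFailures);
  return {
    deliveryRatePct: (100 * c.receivedDeliveries) / expectedAll,
    deliveryRateOfConnectedPct:
      expectedConnected > 0 ? (100 * c.receivedDeliveries) / expectedConnected : 0,
  };
}

// 100 clients requested, 5 never connect, the other 95 receive all 120 messages.
const rates = deliveryRates({
  totalMessages: 120,
  clients: 100,
  connectFailures: 5,
  receivedDeliveries: 120 * 95,
});
console.log(rates); // 95 for all clients, 100 for connected clients
```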
Built by AnyCable alongside the compare page at https://anycable.io/compare/nodejs-websocket. Open issues or PRs if you find a methodological flaw; we'd rather fix it than leave a wrong number standing.
MIT, LICENSE.