Summary
When @socket.io/sticky is used in front of a Node app that serves both Socket.IO traffic and arbitrary HTTP endpoints, multipart (or any large) HTTP POST bodies are truncated at the worker. Body bytes that arrive after the primary has already forwarded the connection to a worker are silently discarded by the primary's drain handler instead of being forwarded.
This breaks any non-Socket.IO POST endpoint that reads req.body for payloads larger than what fits in the first kernel TCP read. File uploads via multer/busboy are the most visible failure mode (Error: Unexpected end of form from busboy when req.complete === true but the parser saw fewer body bytes than Content-Length).
A sister library shares the same architectural pattern and has had the same bug reported against it six years ago, still unresolved: wzrdtales/socket-io-sticky-session#16.
Reproduction
Minimal Node app: cluster mode with setupMaster on the primary, setupWorker on workers, plus an HTTP POST endpoint that reads the full body.
// primary
const cluster = require('node:cluster')
const http = require('node:http')
const { setupMaster } = require('@socket.io/sticky')

const httpServer = http.createServer()
setupMaster(httpServer, { loadBalancingMethod: 'least-connection' })
httpServer.listen(3080)
cluster.fork() // one worker is enough to reproduce

// worker (via cluster.fork)
const express = require('express')
const { createServer } = require('node:http')
const { Server } = require('socket.io')
const { setupWorker } = require('@socket.io/sticky')

const app = express()
app.post('/upload', (req, res) => {
  let bytes = 0
  req.on('data', c => { bytes += c.length })
  req.on('end', () => res.json({ contentLength: req.headers['content-length'], bytes, complete: req.complete }))
})
const server = createServer(app)
const sio = new Server(server)
setupWorker(sio) // the worker does not listen; the primary forwards connections to it
Reproduction curl (any binary >= ~8 KB):
dd if=/dev/urandom of=/tmp/test.bin bs=1024 count=256
curl -X POST http://localhost:3080/upload --data-binary @/tmp/test.bin
Expected
{ contentLength: "262144", bytes: 262144, complete: true }.
Actual
{ "contentLength": "262144", "bytes": 132682, "complete": true }
The bytes count varies between ~5% and ~70% of Content-Length from request to request. req.complete is always true, which is what makes this insidious: Node's parser thinks the body ended cleanly because it received a TCP FIN, so application-level error handling can't reliably detect the truncation.
Root cause analysis
@socket.io/sticky/index.js listens for "data" events on the primary's incoming socket, peeks at the first chunk to extract the sticky sid, then forwards the connection to a worker via cluster.workers[workerId].send({ type: "sticky:connection", data, connectionId }, socket, ...). The worker re-emits the buffered chunk via socket.emit("data", Buffer.from(data)).
This works for the Socket.IO traffic the library was designed for (one short HTTP handshake → upgrade to WebSocket) because the entire HTTP request fits in the first read.
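For reference, a condensed paraphrase of that flow (not the library's verbatim source; extractSid and pickWorker are placeholder helpers, and the connectionId bookkeeping is omitted):
// primary: peek at the first chunk, then hand the socket off to a worker
httpServer.on('connection', (socket) => {
  socket.once('data', (buffer) => {
    const data = buffer.toString()
    const worker = pickWorker(extractSid(data)) // placeholder helpers
    worker.send({ type: 'sticky:connection', data }, socket)
  })
})
// worker: attach the socket to the local HTTP server, replay only the first chunk
process.on('message', ({ type, data }, socket) => {
  if (type !== 'sticky:connection' || !socket) return
  httpServer.emit('connection', socket)
  socket.emit('data', Buffer.from(data))
})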
For arbitrary HTTP POSTs with bodies that span multiple TCP reads, two things happen:
- The primary's httpServer is also attached and starts parsing. To keep the body from being buffered in primary memory, the library installs httpServer.on("request", req => req.on("data", () => {})), the drain handler.
- The worker's httpServer.emit("connection", socket) runs only after the connection has already been emitting "data" chunks on the primary. Those later chunks are consumed by the drain handler and never reach the worker's parser.
Net result: the worker sees the headers + first chunk only, then a clean FIN. req.complete = true, but the body is short.
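The hazard can be shown with plain net, independent of the library (a standalone sketch; the port and the 50 ms delay are arbitrary):
const net = require('node:net')

net.createServer((socket) => {
  // analogue of the primary's drain handler: a no-op listener puts the
  // socket into flowing mode, and its bytes go nowhere
  socket.on('data', () => {})
  setTimeout(() => {
    // analogue of the worker's parser: a consumer attached later only sees
    // chunks that arrive after this point; everything delivered during the
    // 50 ms window is already gone
    socket.on('data', (c) => console.log('late consumer got', c.length, 'bytes'))
  }, 50)
}).listen(4000)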
Suggested directions
A real fix is non-trivial because of the design choice to start parsing on the primary at all. Options that come to mind:
- Document the limitation explicitly. A note in the README along the lines of "this library is designed for Socket.IO traffic; if your app also accepts HTTP request bodies, serve those endpoints from a separate listener" would prevent a lot of debugging.
- Buffer all incoming bytes on the primary until handoff completes, and forward them in the message payload — not just the first chunk. Memory-bounded by maxHttpBufferSize or similar.
- Skip the drain handler when the upgrade isn't WebSocket. Pause the socket immediately after the first read, send the FD with the buffered prefix, and let the worker resume reading from the kernel directly. (This is closer to how cluster natively forwards listening sockets; see the sketch after this list.)
- Recommend SO_REUSEPORT + Redis adapter as the canonical multi-core scaling pattern for mixed Socket.IO + HTTP apps, since the primary-fanout design has this fundamental tension.
Workaround in the affected app
Disable cluster mode entirely (PRIA_CLUSTER=0 in our case) and scale horizontally: more single-process tasks behind a load balancer that handles sticky pinning at L4. This loses no functionality; it only changes where the parallelism happens.
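In single-process mode the entry point collapses to something like this (a sketch, reusing the same app and route as the repro above):
const express = require('express')
const { createServer } = require('node:http')
const { Server } = require('socket.io')

const app = express()
// ...same /upload route as in the repro...
const server = createServer(app)
const sio = new Server(server)
server.listen(3080) // each task listens directly; the L4 balancer pins clients to tasks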
Environment
- @socket.io/sticky: latest at time of report
- Node.js: 22.x
- OS: Linux (Debian Bullseye on AWS ECS)
- Browser-side TLS terminates at HAProxy 2.6.27, which forwards HTTP/1.1 + Content-Length to an AWS NLB → ECS tasks. We confirmed HAProxy is not re-framing: Transfer-Encoding is absent on the wire reaching Node, only Content-Length.
- Reproducible with curl as well as browsers, ruling out browser-specific behavior.
Happy to provide a complete repro repo if useful.