fix(backend): route fastify listen/ready/close errors through logger instead of unhandled rejection #17401
Open
calebcgates wants to merge 1 commit into
What
Closes #17399.
`packages/backend/src/server/ServerService.ts` has three error-handling regressions around `fastify.listen()`, `fastify.ready()`, and `fastify.close()` that cause startup and shutdown failures to be reported as uncaught Node exceptions instead of through the misskey logger the code clearly intended to use.

The four changes:

1. Wrap `await fastify.listen()` inside a `try/catch` instead of fire-and-forget on the TCP path and callback-with-ignored-`err` on the socket path.
2. Run `fs.chmodSync` on the socket only after `listen()` resolves (today the chmod runs even when listen failed, throwing `ENOENT` and masking the original error).
3. Wrap `await fastify.ready()` in the same `try/catch` so plugin-registration timeouts (`ERR_AVVIO_PLUGIN_TIMEOUT`) are reported through `this.logger.error` instead of NestJS's bootstrap stack trace.
4. Cap `fastify.close()` at 5s in `dispose()` so leaked WebSocket upgrades from `streamingApiServerService.attach(fastify.server)` don't hang `OnApplicationShutdown` until PM2/systemd `SIGKILL`s the process.

The intent of the existing `fastify.server.on('error', ...)` block is preserved: it is moved into a `handleListenError()` helper invoked from the new `try/catch`. The reason this matters: fastify v5's `listen()` rejects its returned Promise when the underlying http.Server hits `EADDRINUSE`/`EACCES`, before (or in a race with) the http.Server emitting `'error'`. On Node >= 15 with default `--unhandled-rejections=throw`, the listen-promise rejection terminates the process with a stack trace before `server.on('error')` can run. The handler is reachable today only by accident of timing.

Why
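To make the failure mode concrete, here is a minimal sketch that uses only Node's built-in `net` module rather than fastify. `listenAsync`, `closeAsync`, and `tryListenOnBusyPort` are hypothetical names, and `listenAsync` only approximates a promise-returning `listen()`; it is not fastify's implementation. It shows a bind failure surfacing as a promise rejection that an `await` inside `try/catch` can observe and hand to a logger.

```typescript
import { createServer } from 'node:net';
import type { AddressInfo, Server } from 'node:net';

// Hypothetical helper: a promise-based listen that rejects when the server
// emits 'error' (for example EADDRINUSE or EACCES) before it starts listening.
function listenAsync(server: Server, port: number, host = '127.0.0.1'): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    server.once('error', reject);
    server.listen(port, host, () => {
      server.off('error', reject);
      resolve();
    });
  });
}

// Hypothetical helper: promise-based close.
function closeAsync(server: Server): Promise<void> {
  return new Promise<void>((resolve) => server.close(() => resolve()));
}

// Binds one server to a free port, then tries to bind a second server to the
// same port. Returns the error code seen by the second listen, if any.
async function tryListenOnBusyPort(): Promise<string | undefined> {
  const first = createServer();
  await listenAsync(first, 0); // port 0 lets the OS pick a free port
  const port = (first.address() as AddressInfo).port;

  const second = createServer();
  try {
    await listenAsync(second, port);
    await closeAsync(second);
    return undefined; // unexpectedly succeeded
  } catch (err) {
    // This is the point where a real service would call its logger.
    return (err as { code?: string }).code;
  } finally {
    await closeAsync(first);
  }
}
```

Without the `try/catch`, the same rejection would go unhandled, which is the situation this PR attributes to the current fire-and-forget call.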
Concrete reproductions:
Repro 1 — TCP port in use:
Repro 2 — Unix socket without permission:
Run as a non-root user.
`fastify.listen({ path })` rejects with `EACCES`, the callback's `err` is ignored, then `fs.chmodSync('/var/run/misskey.sock', ...)` throws `ENOENT: no such file or directory` because no socket was ever created. User sees the chmod error, not the actual permission error. After this PR: `"You do not have permission to listen on /var/run/misskey.sock."` + clean `exit(1)`.

Repro 3 - `OnApplicationShutdown` hangs:

A WebSocket client opens a connection via `streamingApiServerService.attach(fastify.server)`, then disconnects abruptly without a clean close frame. PM2 / systemd / k8s sends SIGTERM to misskey. `dispose()` calls `await this.#fastify.close()`, which waits indefinitely for the upgraded socket to be reaped. PM2's default `kill_timeout` then SIGKILLs the process; logs show "process killed" with no diagnostic. After this PR: 5-second cap, then `OnApplicationShutdown` resolves cleanly so PM2 records a graceful exit.

These are the only four real bugs in this area. The other fastify call sites in the backend (route handlers throwing in `ClientServerService`/`ApiServerService`/`WellKnownServerService`, `register()` calls deferred to `await fastify.ready()`, `addHook()` callbacks) are correctly handled by the existing `setErrorHandler` (`ClientServerService.ts:924`) plus the now-properly-try-wrapped `await fastify.ready()`; they're not part of this PR's scope.

Additional info (optional)
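The 5-second cap on `fastify.close()` from Repro 3 can be sketched as a race between the close promise and a timer. This is a hedged illustration, not the PR's actual code: `closeWithTimeout` and its return values are hypothetical, and the `close` argument stands in for something like `() => fastify.close()`.

```typescript
// Hypothetical helper: waits for `close()` to settle, but gives up after
// `timeoutMs`. Resolves with 'closed' or 'timeout'; rejects if `close()` rejects first.
async function closeWithTimeout(
  close: () => Promise<unknown>,
  timeoutMs: number,
): Promise<'closed' | 'timeout'> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<'timeout'>((resolve) => {
    timer = setTimeout(() => resolve('timeout'), timeoutMs);
  });
  try {
    return await Promise.race([
      close().then(() => 'closed' as const),
      timeout,
    ]);
  } finally {
    // Clear the timer so a fast close does not keep the event loop alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A caller could log a warning on `'timeout'` and continue shutting down. Note that the timeout only stops the caller from waiting; the underlying close is not cancelled, so any leaked sockets remain open until the process exits.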
Behavior preservation
`EADDRINUSE`
`EACCES`
`process.send('listenFailed')`: before, only when `server.on('error')` wins the race; after, always

On tests
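A unit test for `handleListenError` is simplest if the mapping from error code to log message lives in a pure function. The sketch below is hypothetical: `describeListenError`, `ListenTarget`, and the `EADDRINUSE` and fallback wording are assumptions, not the PR's actual code. Only the `EACCES` message is taken from Repro 2.

```typescript
// Hypothetical type: where the server was asked to listen.
type ListenTarget =
  | { kind: 'port'; port: number }
  | { kind: 'socket'; path: string };

// Hypothetical pure function: turns a listen error code into a log message.
// Keeping it free of logger and process calls makes it easy to assert on.
function describeListenError(code: string | undefined, target: ListenTarget): string {
  const where = target.kind === 'port' ? `port ${target.port}` : target.path;
  switch (code) {
    case 'EACCES':
      // Wording taken from the message quoted in Repro 2.
      return `You do not have permission to listen on ${where}.`;
    case 'EADDRINUSE':
      // Assumed wording, for illustration only.
      return `Cannot listen on ${where}: already in use by another process.`;
    default:
      // Assumed fallback, for illustration only.
      return `Failed to listen on ${where}: ${code ?? 'unknown error'}`;
  }
}
```

A helper like `handleListenError` could then log the returned string, send `listenFailed`, and exit, while the test only needs to check the string.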
`packages/backend/test/unit/` has no `ServerService` spec today. Adding one requires mocking the NestJS DI graph for ~18 injected services and stubbing 11 sub-fastify-services. I scoped this PR to the bug-fix and the truth table above; happy to add a vitest unit test for `handleListenError` in a follow-up if useful. Please flag if you want it bundled here instead.

Local verification
- `pnpm install --frozen-lockfile`: no lockfile churn
- `pnpm --filter backend typecheck` (tsgo): 0 errors
- `pnpm --filter backend exec eslint src/server/ServerService.ts`: 0 errors; warning count net-reduced from 9 to 5 (removed `(err as any)` cast, two unused-param warnings, and the `this.config.socket!` non-null assertion)

Checklist