diff --git a/CHANGELOG.md b/CHANGELOG.md index 629f435..0bdac1b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,7 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), ### Added - `ALLOWLIST_FILE` env var: when set, the contents of the referenced JSON file replace the built-in `sanitize-html` configuration. Lets different consumers run with different policies without forking. Malformed input fails fast at startup. - `lib/allowlist.js` module exporting `DEFAULT_ALLOWLIST` (the previous hardcoded config) and `loadAllowlist({ path })` for tests and programmatic use. +- `RATE_LIMIT_RPM` env var: opt-in per-IP rate limiting on `POST /validate` (requests per minute). Disabled by default. `/health` and `/openapi.json` are never rate-limited. 429 responses include `retry-after` and `ratelimit-*` headers and are documented in the OpenAPI spec. ## [2.2.0] - 2026-05-10 diff --git a/README.md b/README.md index 53e61f2..657bcb9 100644 --- a/README.md +++ b/README.md @@ -87,6 +87,7 @@ The image is built on `node:24-alpine`, runs as the unprivileged `node` user, an | `PORT` | `5001` | TCP port the HTTP server binds to. | | `LOG_LEVEL` | `info` | `pino` log level (`trace`, `debug`, `info`, `warn`, `error`, `fatal`, `silent`). Forced to `silent` under `NODE_ENV=test`. | | `ALLOWLIST_FILE` | _unset_ | Path to a JSON file with a custom `sanitize-html` configuration. When set, replaces the built-in allowlist wholesale. See [Customising the allowlist](#customising-the-allowlist). | +| `RATE_LIMIT_RPM` | _unset_ | Positive integer. When set, enables per-IP rate limiting on `POST /validate` at this many requests per minute. Disabled by default. See [Rate limiting](#rate-limiting). | The JSON body limit is fixed at `256kb`. Markdown larger than that is rejected by Express with a `413` before reaching the handler. Adjust `express.json({ limit: ... })` in `server.js` if you need more. @@ -105,6 +106,14 @@ Set `ALLOWLIST_FILE` to a JSON file whose contents are passed straight to `sanit The file is loaded once at startup. Malformed JSON, a missing file, or a non-array `allowedTags` causes the process to exit immediately rather than silently fall back. The default allowlist lives in [`lib/allowlist.js`](lib/allowlist.js) and is exported as `DEFAULT_ALLOWLIST` for reference. +### Rate limiting + +Set `RATE_LIMIT_RPM` to a positive integer to enable per-IP rate limiting on `POST /validate`. The window is 60 seconds and the limit applies only to `/validate` — `/health` and `/openapi.json` are always reachable so that probes and clients can introspect the service even under load. Exceeding the limit returns `429 Too Many Requests` with `retry-after` and `ratelimit-*` headers (RFC 9462). + +The limiter keys on `req.ip`. If the service is deployed behind a reverse proxy, configure `app.set('trust proxy', ...)` in `server.js` so the limiter sees the real client address rather than the proxy. The service ships with no `trust proxy` configuration to avoid header-injection in untrusted topologies. + +Invalid values (`0`, negative, non-integer) cause the process to exit at startup rather than silently disable. + ### Logging and request correlation Every request is logged as a single JSON line on stdout via [`pino-http`](https://github.com/pinojs/pino-http). Each request is tagged with an id surfaced in the `x-request-id` response header and included in every log line. If the caller sends an `x-request-id` header that matches `^[a-zA-Z0-9_.-]{1,128}$`, the service reuses it; otherwise a fresh UUID is generated. Use this id to correlate a client trace with the server log for a given request. @@ -115,11 +124,11 @@ Every request is logged as a single JSON line on stdout via [`pino-http`](https: - **Front matter is exposed raw, not trusted.** It is returned in its own `frontMatter` field, never inside `sanitized`. A coarse HTML-like check decides `safe`, but the consumer must sanitize any front-matter value it intends to render as HTML. - **`query parser` is set to `simple`.** Express's default `qs`-based parser has shipped two array-limit DoS bypasses (`GHSA-w7fw-mjwx-w883`, `GHSA-6rw7-vpxm-498p`); the simple parser is not affected. Do not change this without re-reviewing those advisories. - **Body size cap.** `express.json({ limit: '256kb' })` is the first line of defence against payload-amplification attacks against `sanitize-html`. -- **No rate limiting or auth.** This service expects to live behind a gateway that handles those concerns. If you expose it directly, put a reverse proxy in front. +- **Rate limiting is opt-in via `RATE_LIMIT_RPM`** and disabled by default. The service still expects to live behind a gateway for auth and TLS; the in-process limiter is defence-in-depth for `/validate`, not a substitute for an upstream policy layer. - **Property-based fuzzing.** `tests/fuzzing.test.js` runs `fast-check` against `/validate` to exercise invariants (no dangerous tags ever leak to `sanitized`, sanitization is idempotent, front matter never appears inside `sanitized`). Hundreds of randomized payloads per release. - **Schema-validated boundary.** `POST /validate` rejects any body that does not conform to the OpenAPI `ValidateRequest` schema (`ajv`). Extra fields, wrong types, and missing/empty values are caught before reaching the sanitizer, with structured `details` per violation. -`npm audit` reports zero vulnerabilities at the time of writing (May 2026, against `express@5`, `sanitize-html@2.17`, `pino@10`, `pino-http@11`, `ajv@8`, `jest@30`, `supertest@7.2`, `fast-check@4`). +`npm audit` reports zero vulnerabilities at the time of writing (May 2026, against `express@5`, `sanitize-html@2.17`, `pino@10`, `pino-http@11`, `ajv@8`, `express-rate-limit@8`, `jest@30`, `supertest@7.2`, `fast-check@4`). ## Project layout @@ -132,6 +141,7 @@ tests/fuzzing.test.js Property-based tests (fast-check) for sanitizer invari tests/request-id.test.js Coverage for the x-request-id middleware. tests/openapi.test.js Coverage for the OpenAPI endpoint and contract. tests/allowlist.test.js Unit + integration coverage for the allowlist loader. +tests/rate-limit.test.js Coverage for the RATE_LIMIT_RPM middleware on /validate. Dockerfile, .dockerignore Container build. ``` diff --git a/openapi.json b/openapi.json index 2a87f6b..c5b6ac4 100644 --- a/openapi.json +++ b/openapi.json @@ -60,6 +60,21 @@ "headers": { "x-request-id": { "$ref": "#/components/headers/RequestId" } } + }, + "429": { + "description": "Rate limit exceeded. Only returned when the service is started with a positive `RATE_LIMIT_RPM` env var. Includes `retry-after` and `ratelimit-*` headers.", + "headers": { + "x-request-id": { "$ref": "#/components/headers/RequestId" }, + "retry-after": { + "description": "Seconds the client should wait before retrying.", + "schema": { "type": "integer" } + } + }, + "content": { + "application/json": { + "schema": { "$ref": "#/components/schemas/ErrorResponse" } + } + } } } } diff --git a/package-lock.json b/package-lock.json index f66755d..bf52d29 100644 --- a/package-lock.json +++ b/package-lock.json @@ -1,16 +1,17 @@ { "name": "markdown-security", - "version": "2.1.0", + "version": "2.2.0", "lockfileVersion": 3, "requires": true, "packages": { "": { "name": "markdown-security", - "version": "2.1.0", + "version": "2.2.0", "license": "MIT", "dependencies": { "ajv": "^8.20.0", "express": "^5.2.1", + "express-rate-limit": "^8.5.1", "pino": "^10.3.1", "pino-http": "^11.0.0", "sanitize-html": "^2.17.3" @@ -2590,6 +2591,24 @@ "url": "https://opencollective.com/express" } }, + "node_modules/express-rate-limit": { + "version": "8.5.1", + "resolved": "https://registry.npmjs.org/express-rate-limit/-/express-rate-limit-8.5.1.tgz", + "integrity": "sha512-5O6KYmyJEpuPJV5hNTXKbAHWRqrzyu+OI3vUnSd2kXFubIVpG7ezpgxQy76Zo5GQZtrQBg86hF+CM/NX+cioiQ==", + "license": "MIT", + "dependencies": { + "ip-address": "^10.2.0" + }, + "engines": { + "node": ">= 16" + }, + "funding": { + "url": "https://github.com/sponsors/express-rate-limit" + }, + "peerDependencies": { + "express": ">= 4.11" + } + }, "node_modules/fast-check": { "version": "4.7.0", "resolved": "https://registry.npmjs.org/fast-check/-/fast-check-4.7.0.tgz", @@ -3125,6 +3144,15 @@ "integrity": "sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ==", "license": "ISC" }, + "node_modules/ip-address": { + "version": "10.2.0", + "resolved": "https://registry.npmjs.org/ip-address/-/ip-address-10.2.0.tgz", + "integrity": "sha512-/+S6j4E9AHvW9SWMSEY9Xfy66O5PWvVEJ08O0y5JGyEKQpojb0K0GKpz/v5HJ/G0vi3D2sjGK78119oXZeE0qA==", + "license": "MIT", + "engines": { + "node": ">= 12" + } + }, "node_modules/ipaddr.js": { "version": "1.9.1", "resolved": "https://registry.npmjs.org/ipaddr.js/-/ipaddr.js-1.9.1.tgz", diff --git a/package.json b/package.json index 57640b6..4b32439 100644 --- a/package.json +++ b/package.json @@ -27,6 +27,7 @@ "dependencies": { "ajv": "^8.20.0", "express": "^5.2.1", + "express-rate-limit": "^8.5.1", "pino": "^10.3.1", "pino-http": "^11.0.0", "sanitize-html": "^2.17.3" diff --git a/server.js b/server.js index f9fd3ed..65b339b 100644 --- a/server.js +++ b/server.js @@ -4,6 +4,7 @@ const sanitizeHtml = require('sanitize-html'); const pino = require('pino'); const pinoHttp = require('pino-http'); const Ajv = require('ajv'); +const rateLimit = require('express-rate-limit'); const openapi = require('./openapi.json'); const { loadAllowlist } = require('./lib/allowlist'); @@ -12,6 +13,26 @@ const allowlist = loadAllowlist(); const ajv = new Ajv({ strict: false }); const validateRequest = ajv.compile(openapi.components.schemas.ValidateRequest); +function buildRateLimiter() { + const value = process.env.RATE_LIMIT_RPM; + if (value === undefined || value === '') return null; + const rpm = Number(value); + if (!Number.isInteger(rpm) || rpm <= 0) { + throw new Error( + `RATE_LIMIT_RPM: must be a positive integer (got "${value}")` + ); + } + return rateLimit({ + windowMs: 60 * 1000, + limit: rpm, + standardHeaders: 'draft-7', + legacyHeaders: false, + message: { safe: false, error: 'Rate limit exceeded' }, + }); +} + +const rateLimiter = buildRateLimiter() || ((_req, _res, next) => next()); + const REQUEST_ID_RE = /^[a-zA-Z0-9_.-]{1,128}$/; const FRONT_MATTER_RE = /^---\n([\s\S]*?)\n---\n/; const HTML_LIKE = /<\s*[a-zA-Z!/]/; @@ -58,7 +79,7 @@ app.get('/openapi.json', (_req, res) => { res.status(200).json(openapi); }); -app.post('/validate', (req, res) => { +app.post('/validate', rateLimiter, (req, res) => { if (!validateRequest(req.body || {})) { return res.status(400).json({ safe: false, diff --git a/tests/openapi.test.js b/tests/openapi.test.js index 5f4fa22..e4f4854 100644 --- a/tests/openapi.test.js +++ b/tests/openapi.test.js @@ -37,11 +37,16 @@ describe("OpenAPI specification", () => { ]); }); - it("declares /validate 200, 400 and 413 responses", async () => { + it("declares /validate 200, 400, 413 and 429 responses", async () => { const res = await request(app).get("/openapi.json"); const responses = res.body.paths["/validate"].post.responses; - expect(Object.keys(responses).sort()).toEqual(["200", "400", "413"]); + expect(Object.keys(responses).sort()).toEqual([ + "200", + "400", + "413", + "429", + ]); }); it("keeps info.version in sync with package.json", async () => { diff --git a/tests/rate-limit.test.js b/tests/rate-limit.test.js new file mode 100644 index 0000000..4c78b18 --- /dev/null +++ b/tests/rate-limit.test.js @@ -0,0 +1,94 @@ +const request = require("supertest"); + +describe("Rate limiting (RATE_LIMIT_RPM)", () => { + let originalEnv; + + beforeEach(() => { + originalEnv = process.env.RATE_LIMIT_RPM; + }); + + afterEach(() => { + if (originalEnv === undefined) delete process.env.RATE_LIMIT_RPM; + else process.env.RATE_LIMIT_RPM = originalEnv; + jest.resetModules(); + }); + + const loadApp = () => { + let app; + jest.isolateModules(() => { + app = require("../server"); + }); + return app; + }; + + it("does not rate-limit when RATE_LIMIT_RPM is unset", async () => { + delete process.env.RATE_LIMIT_RPM; + const app = loadApp(); + + for (let i = 0; i < 5; i++) { + const res = await request(app) + .post("/validate") + .send({ markdown: "# hi" }) + .set("Content-Type", "application/json"); + expect(res.status).toBe(200); + } + }); + + it("returns 429 once the per-minute limit is exceeded on /validate", async () => { + process.env.RATE_LIMIT_RPM = "2"; + const app = loadApp(); + + const results = []; + for (let i = 0; i < 3; i++) { + results.push( + await request(app) + .post("/validate") + .send({ markdown: "# hi" }) + .set("Content-Type", "application/json") + ); + } + + expect(results[0].status).toBe(200); + expect(results[1].status).toBe(200); + expect(results[2].status).toBe(429); + expect(results[2].body).toEqual({ + safe: false, + error: "Rate limit exceeded", + }); + expect(results[2].headers).toHaveProperty("retry-after"); + }); + + it("does not rate-limit /health", async () => { + process.env.RATE_LIMIT_RPM = "1"; + const app = loadApp(); + + for (let i = 0; i < 5; i++) { + const res = await request(app).get("/health"); + expect(res.status).toBe(200); + } + }); + + it("does not rate-limit /openapi.json", async () => { + process.env.RATE_LIMIT_RPM = "1"; + const app = loadApp(); + + for (let i = 0; i < 5; i++) { + const res = await request(app).get("/openapi.json"); + expect(res.status).toBe(200); + } + }); + + it.each(["0", "-1", "1.5", "abc", ""])( + "rejects RATE_LIMIT_RPM=%p at startup", + (value) => { + if (value === "") { + // Empty string is treated as unset; loading succeeds with no limiter. + process.env.RATE_LIMIT_RPM = value; + expect(() => loadApp()).not.toThrow(); + return; + } + process.env.RATE_LIMIT_RPM = value; + expect(() => loadApp()).toThrow(/RATE_LIMIT_RPM/); + } + ); +});