feat(observability): Prometheus /metrics endpoint #43
Merged
Closes audit #12 -- the observability gap noted in the 2026-05-15 audit (Sentry covers errors but no latency percentiles, throughput, or queue depth, blocking SLA validation and capacity planning).

Adds a small focused plugin at src/routes/metrics.ts using prom-client directly (no wrapper plugin):

- GET /metrics returns the standard Prometheus text format (v0.0.4)
- Registers Node.js default metrics: heap, GC, event loop lag, CPU
- Adds two custom metrics labeled by method/route/status_code:
  - http_requests_total (counter)
  - http_request_duration_seconds (histogram with realistic facilitator latency buckets: 5ms to 10s)
- Default label `service="cardano402"` on every series
- Route label uses the templated pattern (e.g. "/files/:cid") not the raw URL -- bounded cardinality, won't explode the time-series count
- Skips /metrics itself (recursive accounting) and /health (k8s-style liveness-probe noise that would skew latency percentiles)

robots.txt updated to Disallow /metrics so it isn't crawl-indexed.

9 new tests cover: content type + Prometheus format, default Node.js metrics presence, custom counter/histogram presence, request tracking across multiple calls, route-pattern cardinality boundedness, method + status_code labels, and the /metrics + /health exclusions.

Full suite: 34 files / 452 tests passing (was 33 / 443).

No production behaviour change: the plugin only adds a new GET route and an onResponse hook that does in-memory increments. Memory cost is trivial (prom-client is ~50KB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes audit #12, the observability gap noted in the 2026-05-15 audit. Sentry covers errors/traces, but there are no latency percentiles, throughput, or queue-depth metrics, which blocks SLA validation and capacity planning.
What
Adds `GET /metrics` returning the standard Prometheus text format (v0.0.4), scrape-able by any Prometheus-compatible system.
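For readers unfamiliar with the text format, here is a minimal, dependency-free sketch of what one counter family looks like on the wire. It is not the PR's code (the PR delegates rendering to `prom-client`); `renderCounter`, `escapeLabelValue`, the help string, and the sample value are hypothetical names and data chosen for illustration.

```typescript
// Hypothetical helper, not part of the PR: renders one counter family in the
// Prometheus text exposition format (version 0.0.4).
type Labels = Record<string, string>;

interface Sample {
  labels: Labels;
  value: number;
}

// Content type a scraper expects for this format.
const CONTENT_TYPE = "text/plain; version=0.0.4; charset=utf-8";

// Label values must escape backslash, double quote, and newline.
function escapeLabelValue(v: string): string {
  return v.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

function renderCounter(metric: string, help: string, samples: Sample[]): string {
  const lines = [`# HELP ${metric} ${help}`, `# TYPE ${metric} counter`];
  for (const s of samples) {
    const pairs = Object.keys(s.labels)
      .map((k) => `${k}="${escapeLabelValue(s.labels[k])}"`)
      .join(",");
    lines.push(pairs ? `${metric}{${pairs}} ${s.value}` : `${metric} ${s.value}`);
  }
  // The format is line-oriented; the final line also ends with a newline.
  return lines.join("\n") + "\n";
}

// Illustrative sample only; the help text and value are made up.
const body = renderCounter("http_requests_total", "Total HTTP requests", [
  {
    labels: { method: "GET", route: "/files/:cid", status_code: "200", service: "cardano402" },
    value: 3,
  },
]);

console.log(CONTENT_TYPE);
console.log(body);
```

Any Prometheus-compatible scraper parses the `# HELP` / `# TYPE` comments and the `name{labels} value` sample lines the same way, which is why the endpoint needs no scraper-specific configuration.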
Default metrics (from `prom-client`'s `collectDefaultMetrics`): heap, GC, event loop lag, and CPU.
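Custom HTTP metrics (added by an `onResponse` hook), labeled by method/route/status_code: `http_requests_total` (counter) and `http_request_duration_seconds` (histogram, latency buckets from 5ms to 10s).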
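To make the "in-memory increments" concrete, the sketch below models what a response hook would do for the counter and the histogram, using plain maps instead of `prom-client`. The bucket list is an assumption: the PR only states the 5ms to 10s range, and `recordResponse`, `seriesKey`, and `HistogramState` are hypothetical names.

```typescript
// Assumed bucket upper bounds in seconds, spanning 5 ms to 10 s. The PR's
// exact list is not shown in this description.
const BUCKETS = [0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10];

interface HistogramState {
  bucketCounts: number[]; // cumulative: bucketCounts[i] counts values <= BUCKETS[i]
  count: number; // total observations (the implicit +Inf bucket)
  sum: number; // sum of observed durations in seconds
}

const requestsTotal = new Map<string, number>();
const durations = new Map<string, HistogramState>();

// One series per (method, route, status_code) combination.
function seriesKey(method: string, route: string, statusCode: number): string {
  return `${method} ${route} ${statusCode}`;
}

// What an onResponse-style hook would do once per finished request.
function recordResponse(method: string, route: string, statusCode: number, seconds: number): void {
  const key = seriesKey(method, route, statusCode);
  requestsTotal.set(key, (requestsTotal.get(key) ?? 0) + 1);

  const existing = durations.get(key);
  const h: HistogramState = existing ?? { bucketCounts: BUCKETS.map(() => 0), count: 0, sum: 0 };
  if (!existing) durations.set(key, h);

  // Prometheus buckets are cumulative: bump every bucket whose bound covers the value.
  for (let i = 0; i < BUCKETS.length; i++) {
    if (seconds <= BUCKETS[i]) h.bucketCounts[i] += 1;
  }
  h.count += 1;
  h.sum += seconds;
}

recordResponse("GET", "/files/:cid", 200, 0.003);
recordResponse("GET", "/files/:cid", 200, 0.2);
recordResponse("GET", "/files/:cid", 200, 12); // slower than the largest bucket

console.log(requestsTotal.get("GET /files/:cid 200"));
console.log(durations.get("GET /files/:cid 200"));
```

The 12-second request lands only in the total count, not in any finite bucket, which is how observations slower than the largest bound show up in a Prometheus histogram.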
Default label `service="cardano402"` on every series.
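As an illustration of what a default label means for each series, the sketch below merges a fixed label set into per-series labels. `prom-client` offers `register.setDefaultLabels` for this purpose; whether the PR uses that call is not shown here, and `withDefaultLabels` is a hypothetical stand-in, including its rule that explicit labels take precedence.

```typescript
// Assumption: a plain-object merge standing in for whatever mechanism the PR uses.
type LabelSet = Record<string, string>;

const DEFAULT_LABELS: LabelSet = { service: "cardano402" };

// Per-series labels come first; defaults fill in only the keys that are missing.
function withDefaultLabels(labels: LabelSet): LabelSet {
  const merged: LabelSet = { ...labels };
  for (const key of Object.keys(DEFAULT_LABELS)) {
    if (!(key in merged)) merged[key] = DEFAULT_LABELS[key];
  }
  return merged;
}

const seriesLabels = withDefaultLabels({
  method: "GET",
  route: "/files/:cid",
  status_code: "200",
});

console.log(seriesLabels);
```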
Design choices
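The commit message names two choices that are easy to show in code: the route label is the templated pattern (e.g. "/files/:cid") rather than the raw URL, and /metrics and /health are excluded from accounting. A minimal sketch under stated assumptions follows: `routeLabel` is a hypothetical helper, `routePattern` stands in for whatever the framework reports as the matched route template, and the `"unmatched"` fallback for requests with no matching route is an assumption, not something the PR describes.

```typescript
// Paths excluded from request accounting, per the PR description.
const SKIPPED_ROUTES = new Set<string>(["/metrics", "/health"]);

// Returns the label to record, or null when the request should be skipped.
function routeLabel(routePattern: string | undefined): string | null {
  if (routePattern === undefined) return "unmatched"; // assumption: one shared label for unmatched requests
  if (SKIPPED_ROUTES.has(routePattern)) return null;
  return routePattern;
}

// Three distinct raw URLs that a router would match to one template, plus a
// health probe. The CIDs are made up.
const observed = [
  { url: "/files/abc", pattern: "/files/:cid" },
  { url: "/files/def", pattern: "/files/:cid" },
  { url: "/files/ghi", pattern: "/files/:cid" },
  { url: "/health", pattern: "/health" },
];

const distinctLabels = new Set<string>();
for (const req of observed) {
  const label = routeLabel(req.pattern);
  if (label !== null) distinctLabels.add(label);
}

console.log(distinctLabels.size); // 1: a single route label covers all three raw URLs
```

Labeling by raw URL would instead create one time series per distinct CID, so the series count would grow with traffic rather than with the number of registered routes.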
What it doesn't do (yet, deliberately)
Test plan
🤖 Generated with Claude Code