A composable reverse proxy framework for Go. Parapet is a library, not a binary: you build your edge or backend by importing the pieces you need and chaining them together with Use.
```sh
go get github.com/moonrhythm/parapet
```

Requires Go 1.25 or later.
- `Middleware` — anything that satisfies `ServeHandler(http.Handler) http.Handler`. Every feature in this library is a `Middleware`.
- `Middlewares` — an ordered slice of `Middleware`, applied in reverse order so the first `Use` call sits outermost (like an onion).
- `Server` — wraps `http.Server` with a middleware chain and adds TLS, H2C, graceful shutdown (30 s grace, 10 s wait), and reuseport.
- Composition helpers — `Cond{If, Then, Else}` branches inline without a `block`; `Handler` adapts an `http.HandlerFunc` into a terminal middleware; `MiddlewareFunc` and `Server.UseFunc` let a plain `func(http.Handler) http.Handler` be used directly.
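
The onion ordering described above can be illustrated with the standard library alone. This is a minimal sketch, not parapet's code: the `Middleware` func type, `chain`, `tag`, and `run` are hypothetical stand-ins (the library's own `Middleware` is an interface with a `ServeHandler` method).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Middleware is a stand-in for the library's type: it wraps a handler.
type Middleware func(http.Handler) http.Handler

// chain applies ms in reverse so that ms[0] ends up outermost.
func chain(final http.Handler, ms ...Middleware) http.Handler {
	h := final
	for i := len(ms) - 1; i >= 0; i-- {
		h = ms[i](h)
	}
	return h
}

// tag returns a middleware that records when it is entered and exited.
func tag(name string, trace *[]string) Middleware {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			*trace = append(*trace, name+":in")
			next.ServeHTTP(w, r)
			*trace = append(*trace, name+":out")
		})
	}
}

// run sends one request through a two-layer chain and returns the visit order.
func run() string {
	var trace []string
	final := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		trace = append(trace, "handler")
	})
	h := chain(final, tag("first", &trace), tag("second", &trace))
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(http.MethodGet, "/", nil))
	return strings.Join(trace, " ")
}

func main() {
	fmt.Println(run()) // first:in second:in handler second:out first:out
}
```

The middleware registered first sees the request first and the response last, which is why logging and request-ID middleware are usually registered before everything else.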
Three constructors pick sensible defaults for the role the server plays:
| Constructor | Use for | Notable defaults |
|---|---|---|
| `parapet.New()` | General purpose, behind another proxy | Trusts standard private CIDRs, long idle timeout |
| `parapet.NewFrontend()` | Edge / internet-facing | Read/write/header timeouts, no trusted proxies |
| `parapet.NewBackend()` | Internal service behind parapet | H2C enabled, trusts private CIDRs |
Each subdirectory under pkg/ is a self-contained middleware:
| Package | What it does |
|---|---|
| `upstream` | Reverse proxy and load balancing (round-robin, weighted, least-conn, ejecting, circuit-breaking, latency-ejecting, hedging) with active or passive health checks, automatic retries, over HTTP, H2C, HTTPS, or a Unix socket |
| `host` | Virtual-host routing on the Host header — wildcard prefixes, a `*` catch-all, and CIDR matching (`NewCIDR`), plus `StripPort`/`ToLower` normalizers |
| `location` | Path routing — exact, segment-boundary prefix, and regexp matchers |
| `router` | Simple URL router — subtree dispatch, falls through to the chain when no pattern matches |
| `block` | Conditional middleware container — match a request, then apply an inner chain (a nil matcher makes it an unconditional catch-all) |
| `mirror` | Traffic shadowing — tee a copy of matched/sampled requests to a canary, fire-and-forget |
| `ratelimit` | Fixed-window (in-memory or Redis-backed for a global limit), sliding-window, concurrent (drop-on-full or bounded-queue), and leaky-bucket limiters, with an `Observe` hook |
| `compress` | Content-negotiated compression (Gzip, Brotli, Deflate, Zstd; Brotli needs the `cbrotli` build tag) |
| `cache` | HTTP response cache — honor-origin policy, in-memory or disk backend, single-flight fills, `X-Cache` tag |
| `cache/purge` | Cache invalidation — purge by host, URL, path prefix, or surrogate tag, plus a reaper |
| `body` | Request body limiting (custom over-limit handler; default 413) and buffering (small bodies in memory, large ones spilled to a temp file with a known `Content-Length`) |
| `headers` | Request/response header manipulation |
| `cors` | CORS handling — allow-list via `AllowOriginFunc` (or `AllowOrigins(...)`); a disallowed `Origin` is rejected with 403 |
| `hsts` | Strict-Transport-Security (with preload) |
| `redirect` | HTTPS (driven by `X-Forwarded-Proto`), www/non-www, and arbitrary redirects — 301 by default, configurable `StatusCode` |
| `requestid` | Inject and propagate a request ID — validated, configurable header, `TrustProxy` for edge use |
| `logger` | Structured request logging |
| `healthz` | Liveness and readiness endpoint (readiness drains on graceful shutdown) |
| `timeout` | Per-request deadlines — `Timeout` (time to response headers) and `RequestDeadline` (whole request, headers + body) |
| `fileserver` | Static file serving — optional directory listing, falls through to the chain on 404, path-confined to root (symlink-safe) |
| `stripprefix` | Strip a URL path prefix before proxying |
| `authn` | JWT and basic-auth helpers |
| `waf` | Web application firewall driven by CEL expressions, hot reloadable |
| `prom` | Prometheus metrics — server (requests, connections, bytes), upstream, cache, WAF, rate-limit, and mirror collectors, plus a `/metrics` handler |
| `proxyprotocol` | HAProxy PROXY protocol (v1/v2) — recover the real client IP behind an L4 load balancer |
| `h2push` | HTTP/2 server push — a fixed link, or driven by the upstream's `Link: rel=preload` response headers |
| `gcs` | Serve static content from a Google Cloud Storage bucket — sets `Content-Type`/`Cache-Control` from object metadata, with main-page, not-found-page, and fallback-handler support |
| `gcp`, `stackdriver`, `trace` | Google Cloud integrations (LB real-client-IP extraction via `gcp.HLBImmediateIP`) and distributed tracing |
```go
package main

import (
	"log"

	"github.com/moonrhythm/parapet"
	"github.com/moonrhythm/parapet/pkg/body"
	"github.com/moonrhythm/parapet/pkg/compress"
	"github.com/moonrhythm/parapet/pkg/headers"
	"github.com/moonrhythm/parapet/pkg/healthz"
	"github.com/moonrhythm/parapet/pkg/host"
	"github.com/moonrhythm/parapet/pkg/hsts"
	"github.com/moonrhythm/parapet/pkg/location"
	"github.com/moonrhythm/parapet/pkg/logger"
	"github.com/moonrhythm/parapet/pkg/ratelimit"
	"github.com/moonrhythm/parapet/pkg/redirect"
	"github.com/moonrhythm/parapet/pkg/requestid"
	"github.com/moonrhythm/parapet/pkg/upstream"
)

func main() {
	s := parapet.New()
	s.Use(logger.Stdout())
	s.Use(requestid.New())
	s.Use(ratelimit.FixedWindowPerSecond(60))
	s.Use(ratelimit.FixedWindowPerMinute(300))
	s.Use(ratelimit.FixedWindowPerHour(2000))
	s.Use(body.LimitRequest(15 * 1024 * 1024)) // 15 MiB
	s.Use(body.BufferRequest())
	s.Use(compress.Gzip())
	s.Use(compress.Br())

	// sites
	s.Use(example())
	s.Use(mysite())
	s.Use(wordpress())

	// health check
	{
		l := location.Exact("/healthz")
		l.Use(logger.Disable())
		l.Use(healthz.New())
		s.Use(l)
	}

	s.Addr = ":8080"
	if err := s.ListenAndServe(); err != nil {
		log.Fatal(err)
	}
}

func example() parapet.Middleware {
	h := host.New("example.com", "www.example.com")
	h.Use(ratelimit.FixedWindowPerSecond(20))
	h.Use(redirect.HTTPS())
	h.Use(hsts.Preload())
	h.Use(redirect.NonWWW())
	h.Use(upstream.New(upstream.NewRoundRobinLoadBalancer([]*upstream.Target{
		{Host: "example.default.svc.cluster.local:8080", Transport: &upstream.HTTPTransport{}},
		{Host: "example1.default.svc.cluster.local:8080", Transport: &upstream.H2CTransport{}},
		{Host: "myexamplebackuphost.com", Transport: &upstream.HTTPSTransport{}},
	})))
	return h
}

func mysite() parapet.Middleware {
	var hs parapet.Middlewares

	{
		h := host.New("mysiteaaa.io", "www.mysiteaaa.io")
		h.Use(ratelimit.FixedWindowPerSecond(15))
		h.Use(redirect.HTTPS())
		h.Use(hsts.Preload())
		h.Use(redirect.WWW())
		h.Use(headers.DeleteResponse(
			"Server",
			"x-goog-generation",
			"x-goog-hash",
			"x-goog-meta-goog-reserved-file-mtime",
			"x-goog-metageneration",
			"x-goog-storage-class",
			"x-goog-stored-content-encoding",
			"x-goog-stored-content-length",
			"x-guploader-uploadid",
		))
		h.Use(upstream.SingleHost("storage.googleapis.com", &upstream.HTTPSTransport{}))
		hs.Use(h)
	}

	{
		h := host.New("mail.mysiteaaa.io")
		h.Use(redirect.HTTPS())
		h.Use(hsts.Preload())
		h.Use(redirect.To("https://mail.google.com/a/mysiteaaa.io", 302))
		hs.Use(h)
	}

	return hs
}

func wordpress() parapet.Middleware {
	h := host.New("myblogaaa.com", "www.myblogaaa.com")
	h.Use(ratelimit.FixedWindowPerMinute(150))
	h.Use(redirect.HTTPS())
	h.Use(hsts.Preload())
	h.Use(redirect.NonWWW())

	backend := upstream.SingleHost("wordpress.default.svc.cluster.local", &upstream.HTTPTransport{})

	l := location.RegExp(`\.(js|css|svg|png|jp(e)?g|gif)$`)
	l.Use(headers.SetResponse("Cache-Control", "public, max-age=31536000"))
	l.Use(backend)
	h.Use(l)

	h.Use(backend)

	return h
}
```

ratelimit ships several strategies, all keyed per-client by
default (ClientIP, which reads X-Real-IP):
- Fixed window — `FixedWindowPerSecond/Minute/Hour(n)`.
- Sliding window — `SlidingWindowPerSecond/Minute/Hour(n)`, smoother at the boundary.
- Leaky bucket — `LeakyBucket(perRequest, size)` admits one request per `perRequest` interval, queueing up to `size` before dropping.
- Concurrent — `Concurrent(n)` drops at capacity; `ConcurrentQueue(capacity, size)` instead queues up to `size` before dropping.
Override RateLimiter.Key to limit by something other than IP, and
ExceededHandler to change the over-limit response (default 429 with
Retry-After). Wire Observe (and a bounded Name label) to count decisions:
```go
rl := ratelimit.FixedWindowPerSecond(60)
rl.Name = "api"
rl.Observe = prom.RateLimit() // parapet_ratelimit_total{name,result=allowed|limited}
s.Use(rl)
```

RedisFixedWindowPerSecond/Minute/Hour(runner, rate) enforce one global limit
across a fleet of proxies. parapet pulls in no Redis client — inject one through
the tiny RedisRunner interface (a RedisRunnerFunc wraps any client):
```go
runner := ratelimit.RedisRunnerFunc(func(ctx context.Context, script string, keys []string, args ...any) (int64, error) {
	return myRedis.Eval(ctx, script, keys, args...).Int64()
})
s.Use(ratelimit.RedisFixedWindowPerSecond(runner, 1000))
```

The constructors fail open on a Redis error or timeout (admit, trading strict
limiting for availability) — the zero-value RedisFixedWindowStrategy{} fails
closed. Because a fail-open admit lands in result="allowed", wire
strategy.OnError = prom.RateLimitRedisError() to surface the otherwise-silent
parapet_ratelimit_redis_errors_total — the alertable "Redis is down, limits
aren't enforced" signal. Tunables: Max, Size, Prefix (default parapet:rl:),
Timeout (default 100 ms).
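
To make the fixed-window strategy concrete, here is a minimal in-memory sketch using only the standard library. It is an illustration of the general technique, not the package's implementation; `fixedWindow`, `newFixedWindow`, `allow`, and `demo` are hypothetical names, and the clock is passed in so the behavior is easy to follow.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fixedWindow admits at most max requests per key in each window of length size.
type fixedWindow struct {
	mu     sync.Mutex
	max    int
	size   time.Duration
	window int64          // index of the window the counts belong to
	counts map[string]int // per-key hits in the current window
}

func newFixedWindow(max int, size time.Duration) *fixedWindow {
	return &fixedWindow{max: max, size: size, counts: map[string]int{}}
}

// allow reports whether key may proceed at time now.
func (f *fixedWindow) allow(key string, now time.Time) bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	w := now.UnixNano() / int64(f.size)
	if w != f.window { // a new window started: forget the old counts
		f.window = w
		f.counts = map[string]int{}
	}
	if f.counts[key] >= f.max {
		return false
	}
	f.counts[key]++
	return true
}

// demo runs five requests against a limit of 2 per second.
func demo() []bool {
	rl := newFixedWindow(2, time.Second)
	t0 := time.Unix(100, 0)
	return []bool{
		rl.allow("10.0.0.1", t0),
		rl.allow("10.0.0.1", t0.Add(100*time.Millisecond)),
		rl.allow("10.0.0.1", t0.Add(200*time.Millisecond)), // third hit in the window: rejected
		rl.allow("10.0.0.2", t0.Add(200*time.Millisecond)), // another key has its own count
		rl.allow("10.0.0.1", t0.Add(time.Second)),          // next window: counts reset
	}
}

func main() {
	fmt.Println(demo()) // [true true false true true]
}
```

Because the counter resets at each window edge, a client can send up to twice the limit across a boundary, which is the burst the sliding-window variants smooth out.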
compress negotiates a response encoding from Accept-Encoding:
Gzip(), Deflate(), Zstd(), and Br() (Brotli). Each returns a tunable
*Compress whose Types (MIME allow-list, * for all), MinLength (default 860
bytes), and Vary (default on) you can set; it skips already-encoded responses,
WebSocket upgrades, and bodies below MinLength. Level variants exist:
GzipWithLevel(int), ZstdWithLevel(zstd.EncoderLevel), and BrWithQuality(int) /
BrWithOption(cbrotli.WriterOptions).
⚠️ Brotli requires the `cbrotli` build tag (and CGO + the `google/brotli` C library). Without it, `compress.Br()` compiles to a no-op pass-through — requests negotiate down to another encoding silently. Build with `-tags cbrotli` to enable real Brotli; `BrWithOption` only exists under that tag.
The waf package runs CEL expressions against incoming requests. Rules compile inside SetRules, so the hot path never parses or type-checks, and rules can be swapped atomically at runtime.
```go
import (
	"net/http"

	"github.com/moonrhythm/parapet"
	"github.com/moonrhythm/parapet/pkg/waf"
)

w := waf.New()
_ = w.SetRules([]waf.Rule{{
	ID:         "block-sqli",
	Expression: `request.query.contains("' OR '1'='1") || request.path.matches("(?i).*union.*select.*")`,
	Action:     waf.ActionBlock,
	Status:     http.StatusForbidden,
}})

s := parapet.NewFrontend()
s.Use(w)
```

See pkg/waf/doc.go for the full list of request.* fields and helper functions exposed to expressions.
Each rule's Action is one of three:
| Action | Effect |
|---|---|
| `ActionBlock` | Terminate the request with the rule's `Status` (default 403) and `Message`. |
| `ActionAllow` | Short-circuit the chain — forward the request immediately and stop evaluating further rules. An explicit allowlist for trusted health-checkers or internal scanners; give it a low `Priority` so it runs first. |
| `ActionLog` | Record the match (via `WAF.Logger` / `WAF.OnMatch`) and keep evaluating. Shadow-deploy a new rule before switching it to `ActionBlock`. |
Fail-open by default. A rule that errors at evaluation time — a recovered panic, a type mismatch, an exceeded CostLimit, or the per-request EvalTimeout (default 5 ms) — is logged and the request is allowed through, the safer default for a reverse proxy. Set w.FailMode = waf.FailClosed to instead reject such requests with 500. CostLimit (CEL evaluation cost per rule) and DisableMacros harden the evaluator when rules come from a less-trusted source.
GeoIP / ASN filtering. The WAF stays storage-agnostic: supply w.Country func(*http.Request) string and w.ASN func(*http.Request) int64 (backed by a GeoIP database or an edge header) and rules can test request.country == "TH" or request.asn == 13335. Both keys are always present to expressions (empty string / 0 when unresolved), so referencing them never errors.
Observability. w.OnMatch fires per matched rule (for custom metrics and alerts by rule ID); w.Observe = prom.WAF() fires once per evaluated request and records the rule-eval latency histogram parapet_waf_eval_duration_seconds{outcome} (pass/allow/block/error) — covering the common no-match path OnMatch can't see.
```go
w := waf.New()
w.FailMode = waf.FailClosed // reject on rule-eval error instead of allowing
w.Country = geoipLookup     // func(*http.Request) string
w.Observe = prom.WAF()      // rule-eval latency by outcome
_ = w.SetRules([]waf.Rule{
	{ID: "allow-internal", Expression: `ipInCidr(request.remote_ip, "10.0.0.0/8")`, Action: waf.ActionAllow, Priority: 100},
	{ID: "geo-block", Expression: `request.country == "XX"`, Action: waf.ActionBlock, Status: http.StatusForbidden},
})
```

SetRules validates and compiles the whole set atomically: a duplicate ID, an empty or non-boolean expression, or a compile error rejects the entire batch and leaves the previously loaded ruleset serving, so a bad deploy can't brick the WAF.
The authn package ships JWT (static key or remote JWKS), HTTP Basic, and forward (external auth-server) authenticators. All wrap a common authn.Authenticator base, which you can use directly for a custom scheme.
The authn package verifies Authorization: Bearer tokens with
authn.JWT. The accepted signature algorithms are pinned by the caller — a
token signed with any other algorithm (including none) is rejected, which
prevents algorithm-confusion attacks. The signature, exp/nbf (with leeway),
and optional iss/aud claims are all verified; verified claims are placed on
the request context for downstream handlers.
Algorithms are pinned with this package's own constants (authn.HS256,
authn.RS256, …), so callers don't import a JOSE library — only pkg/authn.
```go
import "github.com/moonrhythm/parapet/pkg/authn"

m := authn.JWT([]byte(secret), authn.HS256) // []byte for HMAC; a public key for RS*/ES*/EdDSA
m.Issuer = "https://issuer.example.com"
m.Audience = "my-api"
m.Leeway = 30 * time.Second // clock-skew tolerance on exp/nbf/iat; default 1m when zero
m.Realm = "my-api"          // realm reported in the WWW-Authenticate challenge on rejection
s.Use(m)

// downstream
claims, ok := authn.JWTClaimsFromContext(r.Context())
```

For tokens signed by an OIDC provider (Auth0, Okta, Google, …), verify against
the provider's jwks_uri instead of a static key with authn.JWKS. It fetches
the key set over HTTP and caches it, picking up signing-key rotation without
a restart: a stale cache is refreshed in the background while the last good set
keeps serving, and a token bearing an unknown kid triggers a single-flighted
refetch (rate-limited so bogus kids can't hammer the endpoint). Refresh
failures are fail-static — once a set has been fetched, a later fetch error
never starts rejecting valid tokens. The algorithm allowlist is still mandatory
and enforced exactly as above.
```go
m := authn.JWTFromKeySource(
	&authn.JWKS{URL: "https://issuer.example.com/.well-known/jwks.json"},
	authn.RS256, // pin the accepted algorithm(s)
)
m.Issuer = "https://issuer.example.com"
m.Audience = "my-api"
s.Use(m)
```

JWKS exposes RefreshInterval (cache TTL, default 15m), MinRefreshInterval
(unknown-kid refetch rate limit, default 1m), Client, and MaxResponseBytes.
authn.Basic(user, pass) checks Authorization: Basic credentials with a
constant-time comparison (crypto/subtle), so timing leaks neither which
field mismatched nor how many bytes matched. Realm is emitted in the
WWW-Authenticate challenge. For a backend-backed check, set
BasicAuthenticator.Authenticate and return authn.ErrInvalidCredentials on a
miss:
```go
m := authn.Basic("admin", "s3cret")
m.Realm = "admin"
s.Use(m)

// or verify against your own store:
m := &authn.BasicAuthenticator{
	Realm: "admin",
	Authenticate: func(r *http.Request, user, pass string) error {
		if !checkUser(user, pass) {
			return authn.ErrInvalidCredentials
		}
		return nil
	},
}
```

authn.Forward delegates the decision to a separate auth service — the same
"auth request" / auth_request model as nginx and Traefik's ForwardAuth. It
issues a GET to the configured URL, allows the request on a 2xx and
rejects it otherwise, relaying the auth server's status, headers, and body
verbatim. It injects X-Forwarded-Method/-Host/-Uri (plus -Proto/-For)
so the auth server can see the original request, and on a transport error it
fails with 503 Auth Server Unavailable rather than leaking the error.
```go
u, _ := url.Parse("http://auth.default.svc.cluster.local/auth")
m := authn.Forward(u)
m.AuthRequestHeaders = []string{"Cookie", "Authorization"} // default: all request headers (minus Content-Length)
m.AuthResponseHeaders = []string{"X-Auth-User"}            // copied from the auth response onto the forwarded request
s.Use(m)
```

By default every request header is forwarded to the auth server; set
AuthRequestHeaders to forward only a subset. AuthResponseHeaders are copied
from the auth response onto the downstream request, so the backend receives
identity headers (e.g. X-Auth-User) the auth server resolved.
The cache package is a CDN-style, honor-origin response cache. It caches a response only when the origin opts in with explicit freshness (Cache-Control: s-maxage/max-age or Expires); refuses private/no-store/no-cache, Set-Cookie, and Vary: *; honors Vary; serves GET/HEAD only; and ignores the client's request Cache-Control so a client can't bust the shared cache. Concurrent misses for one key collapse into a single origin fetch (single-flight), and it's fail-static — any storage error degrades to a miss, never an error to the client. Every response is tagged X-Cache: HIT|MISS.
Two storage backends ship: an in-memory one (lost on restart) and a disk-backed one (survives restarts, streams bodies to disk). Both bound their total size with LRU eviction plus a per-object cap. cache.Storage is a public interface (with EntryWriter and Meta) — implement it to back the cache with your own store (Redis, S3, …). Mount it ahead of the upstream/handler whose responses it should cache.
```go
import "github.com/moonrhythm/parapet/pkg/cache"

// Disk-backed, 1 GiB on disk, 8 MiB per object; or cache.NewMemory(size) for RAM.
store, err := cache.NewDisk("/var/cache/app", 1<<30)
if err != nil {
	log.Fatal(err)
}

h := host.New("static.example.com")
h.Use(cache.New(store, cache.Options{MaxFileSize: 8 << 20}))
h.Use(upstream.SingleHost("origin.default.svc.cluster.local", &upstream.HTTPTransport{}))
s.Use(h)
```

Options also exposes Cacheable (a per-request predicate to exclude vetted paths), InvalidatedAfter (out-of-band purge), LockTimeout, and DecoupleFill (keep a slow client from stalling waiting followers). Because only origin-opted-in public content is cached, mark per-user or authorization-sensitive responses uncacheable at the origin.
Options.Override is a hook that returns a forced caching policy, overriding the origin's Cache-Control — so you can cache an origin that sends no (or unwanted) cache headers. It is called on each GET/HEAD fill with the request and the origin's response (status + headers), so the decision can key on anything in the request (host, path, extension) and the response (Content-Type, Content-Length, status). Return nil to honor the origin. The forced policy is baked into the stored entry only, so the served Cache-Control stays the origin's and doesn't propagate downstream.
```go
cache.New(store, cache.Options{
	Override: func(r *http.Request, status int, header http.Header) *cache.Override {
		switch {
		case status != http.StatusOK:
			return nil // only force 200s
		case strings.HasPrefix(header.Get("Content-Type"), "image/"):
			return &cache.Override{TTL: time.Hour} // force images for 1h
		case r.Host == "static.example.com" && strings.HasSuffix(r.URL.Path, ".js"):
			return &cache.Override{TTL: 24 * time.Hour} // force this host's JS for a day
		default:
			return nil // everything else: respect upstream
		}
	},
})
```

status and header are the live origin response — read them, don't mutate them.
Override.Mode chooses how far the force reaches over the origin's own directives — the safety trade-off is yours per request:
| Mode | Overrides | Still refuses |
|---|---|---|
| `OverrideBalanced` (default) | missing freshness, `no-cache`, `max-age`, `Expires` | `no-store`, `private`, `Set-Cookie`, `Vary: *`, non-cacheable status, oversize, `Authorization` without a shared opt-in |
| `OverrideConservative` | only missing freshness | everything the origin says (`no-cache`/`no-store`/`private`/`max-age` all honored) |
| `OverrideAggressive` | almost everything, incl. `no-store`/`private`/`Authorization` | `Set-Cookie`, `Vary: *`, non-cacheable status, oversize |
⚠️ Forcing trusts you to target cacheable paths. The cache key ignores the request's `Cookie` and `Authorization`, so don't force per-user paths: even `OverrideBalanced` will cross-user-leak a response gated by a session `Cookie` when the origin sends no `Set-Cookie`/`private`/`no-store`. `OverrideAggressive` additionally bypasses the `Authorization` gate. Scope the hook to known-public paths (or use `Options.Cacheable`).
Override.StaleWhileRevalidate / StaleIfError force the RFC 5861 windows too (see below). For an unconditional default instead of a per-request hook, use Options.DefaultStaleWhileRevalidate / DefaultStaleIfError.
When the origin sets Cache-Control: stale-while-revalidate=<s> or stale-if-error=<s> on a cacheable response, the cache may serve the entry after it goes stale:
- `stale-while-revalidate` — within the window, a stale entry is served immediately (`X-Cache: STALE`) while a single background revalidation refreshes it, so the client never waits on the origin. The detached fetch is bounded by `Options.RevalidateTimeout` (default 30s).
- `stale-if-error` — past any `stale-while-revalidate` window, the cache contacts the origin and, only if it answers with a server error (5xx), serves the stale entry instead of the error (`X-Cache: STALE`).
must-revalidate/proxy-revalidate suppress both. The client's request Cache-Control is ignored (only the origin's response directives are honored), consistent with the rest of the cache. Note that an entry offering these windows is retained in storage until it is past the larger window (not just past freshness), so stale-if-error still has something to fall back to — total size remains bounded by the backend's LRU cap.
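
The decision order described above can be written as a small pure function. This is a sketch under stated assumptions, not the cache's actual code: `decide` and its return labels are hypothetical, all durations are plain seconds, and both windows are assumed to be counted from the moment the entry goes stale (the RFC 5861 reading).

```go
package main

import "fmt"

// decide says how to answer from a cached entry. All arguments are in seconds:
// the entry's age, its freshness lifetime (ttl), and the two stale windows.
// Assumption: swr and sie are both measured from the moment of staleness.
func decide(age, ttl, swr, sie int, mustRevalidate bool) string {
	if age < ttl {
		return "HIT"
	}
	if mustRevalidate {
		return "FETCH" // must-revalidate/proxy-revalidate suppress both windows
	}
	stale := age - ttl
	switch {
	case stale < swr:
		return "STALE+REVALIDATE" // serve now, refresh in the background
	case stale < sie:
		return "FETCH+FALLBACK" // ask the origin, serve stale only on a 5xx
	default:
		return "FETCH" // past both windows: the entry is no longer usable
	}
}

func main() {
	// max-age=60, stale-while-revalidate=30, stale-if-error=3600
	for _, age := range []int{10, 75, 600, 4000} {
		fmt.Println(age, decide(age, 60, 30, 3600, false))
	}
}
```

With these values, an entry aged 75 s is served stale while it refreshes, one aged 600 s triggers an origin fetch with the stale copy held as a fallback, and one aged 4000 s is simply refetched.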
Forcing stale serving for an origin you don't control. Set Options.DefaultStaleWhileRevalidate / Options.DefaultStaleIfError to apply a window to any cacheable response that doesn't carry the directive itself. An explicit directive on the response still wins, and must-revalidate/proxy-revalidate still suppress it. These stay private to this cache — the served Cache-Control remains the origin's, so the policy doesn't propagate to downstream clients or caches.
```go
cache.New(store, cache.Options{
	DefaultStaleWhileRevalidate: 30 * time.Second,
	DefaultStaleIfError:         24 * time.Hour,
})
```

Alternatively, inject the directive with a headers middleware mounted below the cache (so the cache sees it on the response). The cache parses every Cache-Control header, so this adds the windows without clobbering the origin's max-age — but unlike the options above, the injected directive is served to clients:
```go
h.Use(cache.New(store, cache.Options{}))                                 // outer
h.Use(headers.AddResponse("Cache-Control", "stale-while-revalidate=30")) // below the cache
h.Use(upstream.SingleHost("origin...", &upstream.HTTPTransport{}))       // inner
```

cache/purge invalidates cached entries by host, URL, path prefix, or surrogate tag (the origin's Cache-Tag). A purge.Table plugs into Options.InvalidatedAfter; invalidation is lazy (issuing a purge is O(1), a purged entry is reclaimed on its next lookup) and immediate (a purged entry is never served). Memory is bounded — an overflowing scope map folds into a global flush — and epochs are monotonic, so an NTP step-back can't un-purge.
```go
pt := purge.New()
c := cache.New(store, cache.Options{InvalidatedAfter: pt.InvalidatedAfter})

pt.PurgeURL("example.com", "/a")       // one URL: all methods/schemes/Vary variants
pt.PurgePrefix("example.com", "/blog") // a section, boundary-aware (/blog, not /blogger)
pt.PurgeTag("product-42")              // every response carrying this surrogate key, any host
pt.FlushAll()                          // everything

go func() { for range time.Tick(5 * time.Minute) { pt.Reap(store) } }() // proactively reclaim bytes
```

Snapshot/Restore serialize the table so purges survive a restart (persist however you like), and Table.Stats() returns a snapshot of per-scope record counts and the cap-fold count for diagnostics. The per-scope cap that triggers the global-flush fold is tunable with purge.New(purge.WithMaxRecords(n)) (default 65536). It's the engine parapet-ingress-controller builds its control-plane purge distribution on top of.
Surrogate keys come from the origin's Cache-Tag header; it is captured but left on the response (strip it at the origin if it must not reach clients), and an entry keeps at most 64 tags of up to 256 chars each.
Options.OnResult (a cache.ResultFunc) is called once per served request with the outcome, exposing two states the X-Cache header can't: a stale-if-error fallback (also STALE on the wire) and a BYPASS (which sends no X-Cache at all). Two ready-made consumers ship:
```go
cache.New(store, cache.Options{OnResult: prom.Cache()}) // metrics
// or cache.LogResult to add a `cacheStatus` field to the access log
```

prom.Cache() emits parapet_cache_total{host,result} (HIT|MISS|STALE|STALE_ERROR|BYPASS; hit ratio = HIT / all) and parapet_cache_fill_duration_seconds{host} (origin-fill latency, observed only when the origin is contacted).
upstream.NewRoundRobinLoadBalancer weights every target equally. Two strategies
bias by a per-Target Weight (values <= 0 count as 1), each optimizing a
different axis:
- `upstream.NewWeightedRoundRobinLoadBalancer` distributes request count in proportion to weight, using smooth weighted round-robin (the nginx algorithm), so a heavy target's picks are interleaved rather than dealt in a burst. With equal weights it is plain round-robin.
- `upstream.NewLeastConnLoadBalancer` routes each request to the target with the fewest in-flight requests (weighted: lowest `active/Weight`), so it tracks concurrency rather than count — which adapts to slow backends and long-lived requests a count-based balancer misses. A request stays counted until its response body is closed, which parapet's reverse proxy always does. Set `Target.MaxConcurrent` to cap a target's in-flight requests (the bulkhead pattern): the cap is hard and never exceeded, surplus requests route to an under-cap target, and when every target is full the balancer sheds with `503` rather than overloading a saturated origin. A slot is freed only when the response body is closed, so bound total request time (a request-context deadline the transport honors) to keep a backend that stalls mid-body from latching the cap — a response-header or idle timeout alone does not cover a mid-body stall.
```go
s.Use(upstream.New(upstream.NewWeightedRoundRobinLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}, Weight: 3},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}, Weight: 1},
})))
```

Make the bulkhead observable: set lb.OnShed = prom.UpstreamShed() to count load-shed events by cause in parapet_upstream_shed_total{reason} (saturated = bulkhead full, all_dark = the active-health-check pool is down, empty = no targets), and call prom.UpstreamInflight(lb) to export scrape-time gauges parapet_upstream_inflight{host} and parapet_upstream_inflight_capacity{host} (from LeastConnLoadBalancer.Inflight()). A target pinned at inflight/capacity == 1 is the one driving shed_total{reason="saturated"}.
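
For readers unfamiliar with smooth weighted round-robin, the following is a minimal standalone sketch of the general nginx-style algorithm. It is not taken from parapet's source; `target`, `pick`, and `sequence` are hypothetical names chosen for the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

type target struct {
	host    string
	weight  int
	current int // running score, adjusted on every pick
}

// pick runs one round of smooth weighted round-robin: every target gains
// its weight, the highest score wins, and the winner pays back the total.
func pick(ts []*target) *target {
	total := 0
	var best *target
	for _, t := range ts {
		w := t.weight
		if w <= 0 {
			w = 1 // mirror the rule that values <= 0 count as 1
		}
		t.current += w
		total += w
		if best == nil || t.current > best.current {
			best = t
		}
	}
	if best != nil {
		best.current -= total
	}
	return best
}

// sequence returns the first n picks for weights a=5, b=1, c=1.
func sequence(n int) string {
	ts := []*target{{host: "a", weight: 5}, {host: "b", weight: 1}, {host: "c", weight: 1}}
	out := make([]string, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, pick(ts).host)
	}
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(sequence(7)) // a a b a c a a
}
```

A naive weighted scheme would emit `a a a a a b c`; the smooth variant keeps the same 5:1:1 ratio but spreads the heavy target's turns across the cycle.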
upstream.NewRoundRobinLoadBalancer spreads requests evenly but keeps routing to
a dead backend. upstream.NewEjectingLoadBalancer adds passive health checking
(outlier ejection): after a target returns MaxFails consecutive failures it is
ejected from rotation for EjectTimeout (doubling on each repeat ejection, up to
MaxEjectTimeout), then allowed back with no background probing. A single
success clears its failure count. If every target is ejected the balancer fails
open and keeps routing, so a transient outage cannot black-hole all traffic.
```go
lb := upstream.NewEjectingLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}},
})
lb.MaxFails = 3                    // consecutive failures before ejection
lb.EjectTimeout = 30 * time.Second // base cooldown
s.Use(upstream.New(lb))
```

By default only transport errors (other than a client-canceled request) count as
failures. Set lb.IsFailure to also treat responses such as 5xx as failures:
```go
lb.IsFailure = func(resp *http.Response, err error) bool {
	return err != nil || (resp != nil && resp.StatusCode >= 500)
}
```

Pair it with prom.Upstream() (wired into Upstream.OnRoundTrip) to watch
ejections take effect: traffic shifts off a failing backend in
parapet_upstream_requests{host,status}. Wire each reliability balancer's
OnStateChange to prom.UpstreamState() to make the state machine itself
observable — parapet_upstream_state_transitions_total{host,from,to,reason}
(trips, ejections, recoveries, half-open probes) plus a current-state gauge and,
from ActiveHealthCheck, parapet_upstream_probe_down_total{host,cause} (cause =
timeout/refused/reset/dns/tls/status/error). prom.Upstream() also
records a per-target time-to-first-byte histogram
parapet_upstream_request_duration_seconds{host} (once per round-trip attempt, so
retries count individually) and counts fail-fast 503s in
parapet_upstream_fast_rejects_total.
upstream.NewCircuitBreakingLoadBalancer goes a step further: it fails fast.
An open target is rejected without a round-trip (so a request never pays the
dead backend's connect+timeout), and when every target is open it returns 503
rather than failing open — shedding load instead of hammering a dead origin.
After FailureThreshold consecutive failures a target opens for OpenTimeout
(doubling per repeat trip, up to MaxOpenTimeout), then admits a small half-open
trickle (HalfOpenMaxProbes) to test recovery: SuccessThreshold successes close
it, one failure re-opens it.
```go
lb := upstream.NewCircuitBreakingLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}},
})
lb.FailureThreshold = 5
lb.OpenTimeout = 5 * time.Second
s.Use(upstream.New(lb))
```

Use EjectingLoadBalancer when you want fail-open (keep routing during a total
outage); use CircuitBreakingLoadBalancer when you want fail-fast (shed load).
The same IsFailure hook applies. Both ignore Target.Weight.
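
The closed, open, and half-open cycle of a circuit breaker can be sketched for a single target as follows. This is a simplified illustration rather than the balancer's implementation: `breaker`, `allow`, `record`, `trip`, and `demo` are hypothetical names, the timeout doubling and the half-open probe cap are left out, and the type is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

type state int

const (
	closed state = iota
	open
	halfOpen
)

// breaker tracks one target. A real balancer would guard it with a mutex.
type breaker struct {
	failureThreshold int           // consecutive failures that open the circuit
	successThreshold int           // half-open successes that close it again
	openTimeout      time.Duration // how long to stay open before probing

	st        state
	fails     int
	successes int
	openedAt  time.Time
}

// allow reports whether a request may be sent at time now.
func (b *breaker) allow(now time.Time) bool {
	if b.st == open && now.Sub(b.openedAt) >= b.openTimeout {
		b.st, b.successes = halfOpen, 0 // cooldown elapsed: admit probes
	}
	return b.st != open
}

// record feeds the outcome of a request back into the state machine.
func (b *breaker) record(ok bool, now time.Time) {
	switch {
	case ok && b.st == halfOpen:
		b.successes++
		if b.successes >= b.successThreshold {
			b.st, b.fails = closed, 0
		}
	case ok:
		b.fails = 0 // a success clears the failure streak
	case b.st == halfOpen:
		b.trip(now) // one failed probe re-opens immediately
	default:
		b.fails++
		if b.fails >= b.failureThreshold {
			b.trip(now)
		}
	}
}

func (b *breaker) trip(now time.Time) {
	b.st, b.openedAt, b.fails = open, now, 0
}

// demo trips the breaker, waits out the cooldown, and closes it with one probe.
func demo() []bool {
	b := &breaker{failureThreshold: 2, successThreshold: 1, openTimeout: 5 * time.Second}
	t := time.Unix(0, 0)
	b.record(false, t)
	b.record(false, t)                        // second consecutive failure opens the circuit
	rejected := !b.allow(t.Add(time.Second))  // still cooling down: fail fast
	probed := b.allow(t.Add(5 * time.Second)) // cooldown over: half-open
	b.record(true, t.Add(5*time.Second))      // the probe succeeded
	return []bool{rejected, probed, b.st == closed}
}

func main() {
	fmt.Println(demo()) // [true true true]
}
```

The key property is in `allow`: while the circuit is open the caller gets an answer without any network round trip.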
upstream.NewLatencyEjectingLoadBalancer catches what those two miss — a gray
failure, a backend still returning 200s but far slower than its peers. A target
whose decayed mean time-to-first-byte exceeds EjectionFactor × the pool median
is ejected and re-probed. Because the test is relative to the pool, it self-tunes: a
uniform slowdown raises every target and the median together, so no one is an
outlier (guard rails — a max-ejection cap and a panic threshold — keep a systemic
slowdown from draining the pool). It is latency-only: pair it with the circuit
breaker or EjectingLoadBalancer for error ejection.
```go
lb := upstream.NewLatencyEjectingLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.3:8080", Transport: &upstream.HTTPTransport{}},
})
lb.EjectionFactor = 3 // eject a target 3× slower than the pool median
s.Use(upstream.New(lb))
```

The guard rails are tunable too: MaxEjectionPercent (default 30) caps how much
of the pool can be ejected at once, and PanicThreshold (default 50) stops
ejecting entirely when too many targets look slow. Other knobs — MinSamples
(100), MinHosts (3), HalfLife (10 s decay), MinEjectDelta (50 ms),
MinEjectLatency (off), and EjectTimeout/MaxEjectTimeout (30 s → 5 m backoff)
— have sensible defaults.
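
The relative test at the heart of latency ejection (compare each target against a multiple of the pool median) can be shown in a few lines. This sketch covers only that comparison, under simplifying assumptions: latencies are plain millisecond values instead of decayed means, none of the guard rails are modeled, and `median` and `outliers` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle latency (mean of the two middle values for an even count).
func median(v []float64) float64 {
	s := append([]float64(nil), v...)
	sort.Float64s(s)
	n := len(s)
	switch {
	case n == 0:
		return 0
	case n%2 == 1:
		return s[n/2]
	default:
		return (s[n/2-1] + s[n/2]) / 2
	}
}

// outliers returns, sorted, the hosts whose latency exceeds factor × the pool median.
func outliers(latencyMs map[string]float64, factor float64) []string {
	all := make([]float64, 0, len(latencyMs))
	for _, ms := range latencyMs {
		all = append(all, ms)
	}
	limit := median(all) * factor
	var out []string
	for host, ms := range latencyMs {
		if ms > limit {
			out = append(out, host)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	gray := map[string]float64{"10.0.0.1": 20, "10.0.0.2": 22, "10.0.0.3": 200}
	slow := map[string]float64{"10.0.0.1": 200, "10.0.0.2": 220, "10.0.0.3": 210}
	fmt.Println(outliers(gray, 3)) // [10.0.0.3]
	fmt.Println(outliers(slow, 3)) // []
}
```

In the first pool one target is far above three times the median and is flagged; in the second every target is equally slow, the median rises with them, and nothing is flagged.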
A Target.Transport picks the wire protocol: HTTPTransport, H2CTransport,
HTTPSTransport, UnixTransport (dial a Unix socket), or the dynamic
multi-scheme Transport. Each of the first three and the dynamic one exposes an
optional DialContext seam to replace the default net.Dialer — to observe dial
errors, re-resolve endpoints, or wrap the connection. (Setting DialContext
makes DialTimeout a no-op — the custom dialer owns its timeouts; UnixTransport
has no seam.)
Upstream also rewrites the proxied request: Upstream.Host overrides the
Host header sent to the backend, and Upstream.Path prefixes a base path onto
the request path so a backend can be mounted under a subpath (the inverse of
stripprefix).
Every upstream.Upstream automatically retries a failed transport round-trip
up to Retries times (default 3) with exponential backoff (BackoffFactor << attempt, default base 50 ms). Only eligible requests are retried: by default an
idempotent method (GET/HEAD/OPTIONS/TRACE) whose body is absent or
rewindable (r.GetBody != nil, so each attempt can resend the full body). Set
Upstream.RetryPolicy to widen eligibility (e.g. an idempotent PUT/DELETE)
or narrow it.
up := upstream.New(lb)
up.Retries = 2
up.BackoffFactor = 20 * time.Millisecond
s.Use(up)
⚠️ Retry amplification. An eligible request can hit upstreams up to Retries+1 times, and if the same Upstream is fronted by a HedgingLoadBalancer, each attempt fans out to MaxHedge more: worst case ≈ (Retries+1) × (MaxHedge+1) origin calls. Never mark a non-idempotent request retryable: a retried POST can double-apply a side effect.
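To make the numbers concrete, the sketch below evaluates the backoff expression and the worst-case bound quoted above for the default settings. It is plain arithmetic, not library code: `backoffSchedule` and `worstCaseOriginCalls` are hypothetical helpers, and zero-based attempt numbering is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// backoffSchedule lists the wait before each retry, computed as base << attempt.
// Zero-based attempt numbering is an assumption of this sketch.
func backoffSchedule(base time.Duration, retries int) []time.Duration {
	waits := make([]time.Duration, 0, retries)
	for attempt := 0; attempt < retries; attempt++ {
		waits = append(waits, base<<uint(attempt))
	}
	return waits
}

// worstCaseOriginCalls evaluates the bound (Retries+1) × (MaxHedge+1).
func worstCaseOriginCalls(retries, maxHedge int) int {
	return (retries + 1) * (maxHedge + 1)
}

func main() {
	fmt.Println(backoffSchedule(50*time.Millisecond, 3)) // [50ms 100ms 200ms]
	fmt.Println(worstCaseOriginCalls(3, 1))              // 8
}
```

With the defaults (Retries 3, MaxHedge 1), one eligible client request can therefore cost up to eight origin calls, which is worth keeping in mind when sizing a backend.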
upstream.NewHedgingLoadBalancer wraps any balancer to cut tail latency: if an
idempotent, body-less request hasn't responded within HedgeDelay, it sends a
duplicate to another target (the wrapped balancer self-selects a different one),
returns whichever response arrives first, and cancels the loser. The race happens
inside the RoundTripper, so the proxy only ever sees the winner.
h := upstream.NewHedgingLoadBalancer(lb) // lb is any balancer
h.HedgeDelay = 30 * time.Millisecond // ~p95; <= 0 disables (zero-cost pass-through)
s.Use(upstream.New(h))
MaxHedge (default 1) caps the fan-out. Non-idempotent requests, and a request
already inside the retry loop, pass straight through. Because losing legs are
cancelled with context.Canceled, a custom IsFailure on the wrapped balancer
must exclude it (the default does), or hedging would slowly eject the healthy
backend it raced.
Three hooks tune the race. HedgeOnError launches the next hedge immediately on
a losing transport error instead of waiting out HedgeDelay — it is on when
built with NewHedgingLoadBalancer but off for a bare HedgingLoadBalancer{}
literal, so prefer the constructor. IsHedgeable overrides the default
eligibility rule (idempotent, body-less, not already retrying), and IsWinner
overrides the default "any non-error response wins" predicate — e.g. to keep
racing when a leg returns a 5xx.
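The race itself is a small concurrency pattern. The following is a minimal stdlib sketch of a single hedge with hedge-on-error, assuming each leg honors context cancellation. It is not how HedgingLoadBalancer is implemented; `call`, `hedge`, `after`, and `demo` are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// call is one attempt against a target; it must honor ctx cancellation.
type call func(ctx context.Context) (string, error)

// hedge starts primary, then launches backup if primary has not answered
// within delay (or has already failed). The first success wins and the
// deferred cancel stops the losing leg.
func hedge(ctx context.Context, delay time.Duration, primary, backup call) (string, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel()

	type result struct {
		val string
		err error
	}
	results := make(chan result, 2) // buffered so a loser never blocks on send
	launch := func(c call) {
		go func() {
			v, err := c(ctx)
			results <- result{v, err}
		}()
	}

	launch(primary)
	timer := time.NewTimer(delay)
	defer timer.Stop()

	inflight, hedged := 1, false
	var lastErr error
	for inflight > 0 {
		select {
		case <-timer.C: // delay elapsed with no answer: send the duplicate
			if !hedged {
				hedged = true
				inflight++
				launch(backup)
			}
		case r := <-results:
			inflight--
			if r.err == nil {
				return r.val, nil
			}
			lastErr = r.err
			if !hedged { // a failed leg triggers the hedge immediately
				hedged = true
				inflight++
				launch(backup)
			}
		}
	}
	return "", lastErr
}

// after simulates a target that answers with name after d.
func after(d time.Duration, name string) call {
	return func(ctx context.Context) (string, error) {
		select {
		case <-time.After(d):
			return name, nil
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// demo races a slow primary against a fast backup with a 10 ms hedge delay.
func demo() (string, error) {
	return hedge(context.Background(), 10*time.Millisecond,
		after(500*time.Millisecond, "slow"), after(5*time.Millisecond, "fast"))
}

func main() {
	fmt.Println(demo()) // fast <nil>
}
```

Note that the losing leg ends with context.Canceled, which is why a failure predicate on the wrapped balancer has to exclude that error.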
The balancers above are passive — they learn a target is unhealthy only from
real traffic's failures. upstream.NewActiveHealthCheck adds active probing:
it wraps any balancer and probes each target out-of-band (one background goroutine
per target), routing only to those answering. Pass the same []*Target to both
the balancer and the wrapper so their indices line up:
targets := []*upstream.Target{
{Host: "10.0.0.1:8080", Transport: tr},
{Host: "10.0.0.2:8080", Transport: tr},
}
ahc := upstream.NewActiveHealthCheck(targets, upstream.NewRoundRobinLoadBalancer(targets))
ahc.Path = "/healthz"
ahc.Interval = 5 * time.Second
ahc.UnhealthyThld = 3 // down after 3 consecutive failed probes; HealthyThld re-admits
s.Use(upstream.New(ahc))
Active and passive compose: the health gate only removes candidates, and the
wrapped balancer keeps its own strategy over the survivors — a weighted balancer
keeps its exact ratio, the circuit breaker still trips, least-conn still balances.
A target must pass both to be picked. When the gate marks every target down,
each balancer falls back to its own all-down policy: round-robin / ejecting /
latency-ejecting / least-conn route best-effort (so a broken probe path can't 503 a
whole healthy pool), while the circuit breaker still sheds. (Least-conn still sheds
on its capacity cap — MaxConcurrent — independently of health.)
Probing auto-starts on the first request and, when served by a parapet.Server,
stops on graceful shutdown. For a bare RoundTripper, or to bound the prober's
lifetime explicitly, call ahc.Start(ctx) before serving and ahc.Close() after.
By default a slot probes through each Target.Transport (exercising the real pool);
set ProbeTransport to isolate probe traffic. A held probe is bounded by Timeout,
and targets begin up by default (StartUnhealthy flips to fail-closed) so a
misconfigured probe path cannot black-hole a fresh deploy. The probe uses http;
for a target on the dynamic multi-scheme Transport set ahc.Scheme to "h2c" or
"unix" (the dedicated transports force their own scheme and ignore it). The probe
method defaults to GET (ahc.Method overrides it), and ahc.IsHealthy overrides
the default "a non-error response with status < 400 is healthy" check.
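The threshold behavior can be pictured as a small per-target state machine. The sketch below is an assumption-laden illustration rather than the library's code: the `gate` type is hypothetical, and it assumes the thresholds count consecutive probe results, as the UnhealthyThld comment in the example above suggests.

```go
package main

import "fmt"

// gate tracks one target's health from consecutive probe results.
// Hypothetical type for illustration only.
type gate struct {
	healthy       bool
	fails, passes int
	unhealthyThld int // consecutive failures that mark the target down
	healthyThld   int // consecutive successes that re-admit it
}

// observe records one probe result and flips the state at a threshold.
func (g *gate) observe(ok bool) {
	if ok {
		g.fails = 0
		g.passes++
		if !g.healthy && g.passes >= g.healthyThld {
			g.healthy = true
		}
		return
	}
	g.passes = 0
	g.fails++
	if g.healthy && g.fails >= g.unhealthyThld {
		g.healthy = false
	}
}

func main() {
	g := &gate{healthy: true, unhealthyThld: 3, healthyThld: 2}
	for _, ok := range []bool{false, false, false, true, true} {
		g.observe(ok)
		fmt.Println(ok, g.healthy)
	}
}
```

Starting the gate with `healthy: true` mirrors the default described above; a fail-closed start would simply begin with `healthy: false`.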
timeout offers two deadlines that bound different spans:
- timeout.New(d) (Timeout) is a write-header deadline. It fires only until the upstream writes response headers, then disarms, so a backend that sends headers and stalls mid-body is not bounded by it. On expiry it sends a default 504 Gateway Timeout; set Timeout.TimeoutHandler to customize that response (e.g. add a Retry-After).
- timeout.NewRequestDeadline(d) (RequestDeadline) is a total-request deadline (headers + body), implemented as a request-context deadline the upstream transports honor, so it does abort a backend that stalls mid-body (the gap that lets a stalled stream latch a MaxConcurrent bulkhead slot). It writes no response of its own; the cancelled context propagates and upstream surfaces a 502/504.
api := location.Prefix("/api")
api.Use(timeout.NewRequestDeadline(30 * time.Second)) // total time for these routes
api.Use(upstream.New(lb))
s.Use(api)
⚠️ A total-request deadline kills Server-Sent Events, streaming responses, WebSocket-style upgrades, and large downloads. Never apply RequestDeadline globally; scope it per-route (via location or block) and exclude streaming endpoints.
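The context-deadline mechanism behind a total-request deadline can be sketched with the standard library alone. This is an illustration of the idea, not the timeout package's source: `requestDeadline`, `slowBackend`, and `statusFor` are hypothetical names, and the stand-in backend maps a deadline expiry to 504 the way an upstream might.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// requestDeadline bounds the whole request by attaching a context deadline.
// It writes no response itself; downstream handlers observe the cancellation.
func requestDeadline(d time.Duration) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			ctx, cancel := context.WithTimeout(r.Context(), d)
			defer cancel()
			next.ServeHTTP(w, r.WithContext(ctx))
		})
	}
}

// slowBackend stands in for an upstream that needs `work` to respond
// and gives up as soon as the request context is done.
func slowBackend(work time.Duration) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case <-time.After(work):
			w.WriteHeader(http.StatusOK)
		case <-r.Context().Done():
			if errors.Is(r.Context().Err(), context.DeadlineExceeded) {
				w.WriteHeader(http.StatusGatewayTimeout)
				return
			}
			w.WriteHeader(http.StatusBadGateway)
		}
	})
}

// statusFor runs one request through the deadline and reports the status code.
func statusFor(deadline, work time.Duration) int {
	h := requestDeadline(deadline)(slowBackend(work))
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/api/x", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(20*time.Millisecond, 500*time.Millisecond)) // 504
	fmt.Println(statusFor(500*time.Millisecond, 5*time.Millisecond))  // 200
}
```

Because the deadline lives on the context, it also covers time spent streaming the body, which is exactly why it must not wrap long-lived streaming routes.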
pkg/upstream has grown a stack of reliability primitives; reach for one by the
failure you are defending against. They compose — ActiveHealthCheck and
NewHedgingLoadBalancer each wrap any balancer — so the owning balancer handles the
dominant failure mode and the wrappers layer on top.
| Failure mode | Reach for |
|---|---|
| Flaky backend, hard 5xx / errors | NewEjectingLoadBalancer (keeps routing during a total outage) — or the circuit breaker if you would rather shed |
| Dead / brownout origin, fail fast and shed | NewCircuitBreakingLoadBalancer (rejects an open target with no round-trip) |
| Tail latency (p99) on a healthy pool | NewHedgingLoadBalancer (race a duplicate after HedgeDelay) |
| Gray failure: 200s but one host far slower than peers | NewLatencyEjectingLoadBalancer (relative to the pool median) |
| Overload: a slow backend draining the pool | Target.MaxConcurrent on NewLeastConnLoadBalancer + a total-request-deadline middleware (pkg/timeout) |
| Cold deploy / readiness / black-holing a fresh pod | NewActiveHealthCheck (probe out-of-band; route only to answering targets) |
| Uneven backend capacity | NewWeightedRoundRobinLoadBalancer (by count) or NewLeastConnLoadBalancer (by concurrency) |
All-down semantics — know this before an incident. When every target is out, the primitives diverge, and which one you ran decides whether a correlated outage degrades or hard-fails:
| Primitive | When all targets are out |
|---|---|
| Round-robin / weighted / least-conn (health) / ejecting / latency-ejecting | Fail open — route best-effort (a degraded answer beats none; a broken signal must not black-hole a healthy pool) |
| NewLeastConnLoadBalancer (capacity, every target at MaxConcurrent) | Shed 503 (the bulkhead contract, independent of health) |
| NewCircuitBreakingLoadBalancer (every target open) | Shed 503 (don't hammer a dead origin) |
ActiveHealthCheck never adds an all-down override: when its gate marks every target
down, each balancer falls back to its own policy above. The full failure-mode and
all-down guide, the composition rules, and the observability hooks
(OnRoundTrip → prom.Upstream(), OnStateChange → prom.UpstreamState()) live in
the pkg/upstream package doc.
mirror.New tees a copy of matched/sampled requests to a separate destination
(a "canary") so you can exercise a new build with real production traffic. It is
fire-and-forget: the primary request and its response are never affected — a
mirror that is slow, queue-full, or panicking is dropped or recovered, never
propagated. The canary's response is discarded.
mr := mirror.New()
mr.Match = func(r *http.Request) bool { return r.Method == http.MethodGet }
mr.SampleRate = 0.1 // shadow 10% of matched requests
mr.Observe = prom.Mirror() // optional outcome/latency metrics
mr.Use(upstream.SingleHost("canary:8080", &upstream.HTTPTransport{}))
s.Use(mr) // tees, then falls through to the real chain
s.Use(upstream.SingleHost("prod:8080", &upstream.HTTPTransport{}))
A fixed worker pool (Workers, default 8) bounds mirror concurrency; a full
QueueSize queue drops rather than blocking. Effective concurrency is Workers,
not Workers + QueueSize — the queue only absorbs bursts that drain at worker
speed, so size Workers for the canary's latency. The request body is buffered up
front (bounded by MaxBodyBytes) so the primary and the mirror read byte-identical
bytes; an over-cap body skips the mirror (or set DisableBody to mirror with no
body at all, for large-payload services). Each mirror runs on a detached
context.Background() deadline (Timeout), so a client disconnect never cancels it.
The mirrored request is marked (X-Mirror: 1 by default; override with
MarkHeader/MarkValue, or DisableMark to go fully transparent) so the canary
can no-op side effects. End-to-end credentials are replayed by design — use Match
to exclude sensitive routes.
Observe is a func(mirror.MirrorInfo) — prom.Mirror() is one implementation,
emitting parapet_mirror_total{outcome} (dispatched/completed/dropped_full/
dropped_oversize/panicked) and parapet_mirror_request_duration_seconds. A
custom hook runs synchronously (on the request goroutine for dispatch/drop
outcomes, on workers for completed/panicked), so keep it fast and concurrency-safe;
Mirror.Stats() exposes the same counters without wiring Observe. All config and
the destination chain are read once and frozen on the first request — set them
before serving (a later Use is silently ignored). The pool has no Close and
lives for the process lifetime, so construct one Mirror per destination and reuse it.
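The drop-rather-than-block behavior is the classic non-blocking send onto a bounded channel. The sketch below shows that pattern under stated assumptions: `pool`, `newPool`, `submit`, and `demo` are hypothetical names, not the mirror package's internals, and the demo shuts its pool down only so the example terminates cleanly.

```go
package main

import (
	"fmt"
	"sync"
)

// pool is a fixed set of workers fed by a bounded queue.
type pool struct {
	queue chan func()
	wg    sync.WaitGroup
}

// newPool starts `workers` goroutines draining a queue of size queueSize.
func newPool(workers, queueSize int) *pool {
	p := &pool{queue: make(chan func(), queueSize)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for job := range p.queue {
				job()
			}
		}()
	}
	return p
}

// submit enqueues without blocking and reports false when the job is dropped.
func (p *pool) submit(job func()) bool {
	select {
	case p.queue <- job:
		return true
	default: // queue full: drop instead of stalling the caller
		return false
	}
}

// demo pins the only worker, then offers five jobs to a queue of two.
func demo() (accepted, dropped int) {
	p := newPool(1, 2)
	started, release := make(chan struct{}), make(chan struct{})
	p.submit(func() { close(started); <-release }) // occupies the only worker
	<-started
	for i := 0; i < 5; i++ {
		if p.submit(func() {}) {
			accepted++
		} else {
			dropped++
		}
	}
	close(release)
	close(p.queue) // demo-only shutdown
	p.wg.Wait()
	return accepted, dropped
}

func main() {
	fmt.Println(demo()) // 2 3
}
```

With the single worker busy, only the two queue slots absorb work and the remaining three jobs are dropped, which is the sense in which effective concurrency is Workers rather than Workers + QueueSize.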
Enable HTTPS by setting Server.TLSConfig to a *tls.Config (with Certificates
or a GetCertificate callback for SNI/ACME). Serve/ListenAndServe then call
ServeTLS, and the listen address defaults to :443 when Addr is empty. For
dev or internal TLS, parapet.GenerateSelfSignCertificate(parapet.SelfSign{...})
builds a tls.Certificate (RSA-2048, 10-year default validity; each Hosts entry
becomes an IP SAN if it parses as an IP, otherwise a DNS SAN).
s := parapet.NewFrontend()
cert, _ := parapet.GenerateSelfSignCertificate(parapet.SelfSign{Hosts: []string{"localhost", "127.0.0.1"}})
s.TLSConfig = &tls.Config{Certificates: []tls.Certificate{cert}}
// s.Addr == "" → serves HTTPS on :443
Graceful shutdown. ListenAndServe traps SIGTERM and drains in-flight
requests. WaitBeforeShutdown (default 10 s) sleeps first — so load balancers
notice the instance leaving — then GraceTimeout (default 30 s) bounds the drain;
setting GraceTimeout <= 0 disables the built-in SIGTERM handling. Register
cleanup with Server.RegisterOnShutdown(func()) (safe to call concurrently, runs
each callback once); it's the seam healthz and ActiveHealthCheck
use to deregister during the drain.
Other tunables. ReadTimeout, ReadHeaderTimeout, WriteTimeout,
IdleTimeout, and MaxHeaderBytes map onto the embedded http.Server, while
TCPKeepAlivePeriod is applied to accepted connections at the listener. The
constructors pick role-appropriate defaults — e.g. NewFrontend sets a 10 s
ReadHeaderTimeout and 1 m read/write timeouts. Set
Server.ReusePort = true to bind the listener with SO_REUSEPORT for
zero-downtime restarts and cross-process load distribution. All Server fields
must be set before serving.
healthz serves both Kubernetes probes from one endpoint:
GET <Path> is liveness (driven by Set(bool)), and GET <Path>?ready=1 is
readiness (driven by SetReady(bool)). Readiness automatically returns 503
once the parapet.Server enters graceful shutdown — so a load balancer drains the
instance — while liveness stays 200 to avoid a restart mid-drain. Path defaults
to /healthz, and Set/SetReady let background checks flip the flags (fail
readiness while warming up, fail liveness when a dependency dies).
By default the endpoint only answers when the Host header is an IP (suiting
kubelet/LB probes that address the pod by IP) and passes hostname-addressed
requests through to the next handler; set Host = true to answer those too.
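One way to decide whether a Host header is an IP is sketched below with the standard library. How healthz actually performs the check is not shown here, so treat `hostIsIP` as a hypothetical helper that only illustrates the distinction between IP-addressed and hostname-addressed probes.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// hostIsIP reports whether a Host header value is an IP literal,
// with or without a port, including bracketed IPv6.
func hostIsIP(host string) bool {
	if h, _, err := net.SplitHostPort(host); err == nil {
		host = h // port stripped, brackets already removed
	}
	host = strings.TrimSuffix(strings.TrimPrefix(host, "["), "]")
	return net.ParseIP(host) != nil
}

func main() {
	for _, h := range []string{"10.0.0.5:8080", "[::1]:8080", "example.com", "example.com:443"} {
		fmt.Println(h, hostIsIP(h))
	}
}
```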
requestid injects an X-Request-Id (set Header to use another
key), writes it on both the forwarded request and the response, and records it
as the requestId field in the access log. New() defaults TrustProxy = true,
reusing a valid incoming ID for trace continuity — but an incoming ID is accepted
only if it passes a conservative charset check and is ≤ 128 bytes, otherwise a
fresh UUIDv4 replaces it. At the edge, set TrustProxy = false so clients
can't spoof or poison the ID that flows into your logs and upstreams.
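The accept-or-replace decision can be sketched as follows. The exact character set requestid allows is not stated above, so the set used here (ASCII letters, digits, `-`, `_`, `.`) is an assumption, and `validID`, `newUUIDv4`, and `resolveID` are hypothetical helpers rather than the package's API.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

const maxIDLen = 128 // the length cap described above

// validID applies a conservative charset check; the allowed set is assumed.
func validID(id string) bool {
	if id == "" || len(id) > maxIDLen {
		return false
	}
	for i := 0; i < len(id); i++ {
		c := id[i]
		switch {
		case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
		case c == '-', c == '_', c == '.':
		default:
			return false
		}
	}
	return true
}

// newUUIDv4 builds a random (version 4, RFC 4122 variant) UUID string.
func newUUIDv4() string {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err)
	}
	b[6] = (b[6] & 0x0f) | 0x40 // version 4
	b[8] = (b[8] & 0x3f) | 0x80 // variant 10
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16])
}

// resolveID keeps a valid incoming ID only when the proxy is trusted.
func resolveID(incoming string, trustProxy bool) string {
	if trustProxy && validID(incoming) {
		return incoming
	}
	return newUUIDv4()
}

func main() {
	fmt.Println(resolveID("abc-123", true))       // kept as-is
	fmt.Println(len(resolveID("abc-123", false))) // 36: replaced at the edge
	fmt.Println(len(resolveID("bad\nid", true)))  // 36: rejected by the charset check
}
```

Rejecting control characters such as a newline is what keeps a client-supplied ID from injecting extra lines into logs.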
logger emits one structured JSON record per request.
logger.Stdout() and logger.Stderr() are the ready-made constructors (or set
Logger.Writer to any io.Writer); OmitEmpty drops empty fields and is on by
default for Stdout/Stderr but off for a bare Logger{}. Downstream handlers
enrich the record with logger.Set(r.Context(), "userID", id) (read back with
logger.Get) — a no-op when no Logger is mounted upstream. A client-cancelled
request is logged with the synthetic status 499, and logger.Disable() silences
logging for a route (as the health-check block in the example does).
Parapet only reads X-Forwarded-* and X-Real-IP when the connection comes from a trusted CIDR. Configure trust with TrustCIDRs(...) or accept the defaults from Trusted() (standard private and loopback ranges). Servers created with NewFrontend() start with no trusted proxies by default.
An L4 load balancer (AWS NLB, HAProxy in TCP mode, …) terminates the TCP
connection, so without help the proxy sees the balancer's IP, not the client's —
and an L4 balancer adds no X-Forwarded-For. The proxyprotocol
package reads the HAProxy PROXY protocol header (v1 and v2) that such
balancers prepend and rewrites the connection's RemoteAddr to the real client,
so ratelimit, waf, logger, and the trust logic above all see the right
address. Mount it with Server.ModifyConnection; the header is parsed lazily on
the connection's first read, off the accept loop.
// Only the listed CIDRs (your load balancer) may set a client address; a direct
// connection from outside them is passed through untouched and cannot spoof one.
pp := proxyprotocol.New("10.0.0.0/8")
s := parapet.NewFrontend()
s.ModifyConnection(pp.ModifyConnection)
Set Require to reject a trusted connection that arrives without a PROXY header
(use it when every connection from the balancer is guaranteed to carry one); by
default such a connection is served with its real peer address. HeaderTimeout
(default 10 s; negative disables) bounds how long a trusted connection has to
deliver its header, so a stalled connection can't hold a slot during the parse.
⚠️ proxyprotocol.New() with no CIDRs trusts every peer (any client could then spoof its address via a PROXY header), so it is safe only when the listener is reachable exclusively through the load balancer. An invalid CIDR string panics at startup.
Server.ShareProtoSlice makes that server's proxy write a single shared
[]string for X-Forwarded-Proto instead of allocating a fresh slice per
request, saving one allocation on every request that sets the header (~16% on
the distrust path in the proxy benchmarks). It is off by default and scoped to
the server.
This is unsafe if any middleware mutates the X-Forwarded-Proto value slice in
place — e.g. headers.MapRequest("X-Forwarded-Proto", …) or code doing
r.Header["X-Forwarded-Proto"][0] = … — because the mutation would corrupt the
shared slice for every subsequent request. Appending (headers.AddRequest) is
safe. Enable it only if you control the whole middleware chain, and set it
before serving:
s := parapet.NewBackend()
s.ShareProtoSlice = true
The hsts and authn middlewares expose the same opt-in for their fixed
response headers via a ShareValueSlice field (off by default), sharing one
Strict-Transport-Security / WWW-Authenticate slice across requests:
hsts.HSTS{MaxAge: 365 * 24 * time.Hour, ShareValueSlice: true}
The same caveat applies: only enable it when nothing in the chain mutates that response header's value slice in place.
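The hazard is ordinary Go slice aliasing and can be reproduced with net/http alone. The sketch below does not use parapet; `demo`, `appendValue`, and `overwriteValue` are hypothetical helpers, and the shared slice is assumed to have length equal to capacity (as a one-element literal does), which is what makes an append reallocate instead of writing into shared memory.

```go
package main

import (
	"fmt"
	"net/http"
)

// demo installs one shared value slice into a header, lets mutate act on
// the header, and returns what the shared slice holds afterwards.
func demo(mutate func(h http.Header)) string {
	shared := []string{"https"} // len == cap == 1
	h := http.Header{"X-Forwarded-Proto": shared}
	mutate(h)
	return shared[0]
}

// appendValue adds a value; append must reallocate, so shared is untouched.
func appendValue(h http.Header) {
	h.Add("X-Forwarded-Proto", "extra")
}

// overwriteValue writes in place and lands in the shared backing array.
func overwriteValue(h http.Header) {
	h["X-Forwarded-Proto"][0] = "http"
}

func main() {
	fmt.Println(demo(appendValue))    // https
	fmt.Println(demo(overwriteValue)) // http
}
```

The second result is the corruption described above: every later request that reuses the shared slice would see the overwritten value.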
prom collects into a shared registry and exposes it two ways: mount
prom.Handler() on a route, or run prom.Start(addr) as a standalone scrape
server. prom.Registry() returns the registry and prom.Namespace (default
parapet) is the prefix on every series name.
The three server-wide collectors instrument the whole chain. prom.Requests() is
a middleware; prom.Connections and prom.Networks take the *Server (they wire
ConnState/ModifyConnection, so call them rather than Use):
s := parapet.NewFrontend()
s.Use(prom.Requests()) // parapet_requests{host,status,method}
prom.Connections(s) // parapet_connections{state}
prom.Networks(s) // parapet_network_request_bytes / _response_bytes
// expose them
{
l := location.Exact("/metrics")
l.Use(prom.Handler())
s.Use(l)
}
// …or: go prom.Start(":9187")
Every observable feature exposes a hook you assign a prom.* adapter to:
| Wire | Metrics |
|---|---|
| up.OnRoundTrip = prom.Upstream() | upstream_requests{host,status}, upstream_request_duration_seconds{host}, upstream_fast_rejects_total{host} |
| lb.OnStateChange = prom.UpstreamState() | upstream_state_transitions_total, upstream_breaker_state, upstream_probe_down_total{host,cause} |
| prom.UpstreamInflight(lb) / lb.OnShed = prom.UpstreamShed() | upstream_inflight{host} + _capacity{host}, upstream_shed_total{reason} |
| rl.Observe = prom.RateLimit() / strategy.OnError = prom.RateLimitRedisError() | ratelimit_total{name,result}, ratelimit_redis_errors_total |
| cache.Options{OnResult: prom.Cache()} | cache_total{host,result}, cache_fill_duration_seconds{host} |
| w.Observe = prom.WAF() | waf_eval_duration_seconds{outcome} |
| mr.Observe = prom.Mirror() | mirror_total{outcome}, mirror_request_duration_seconds |
All series carry the prom.Namespace prefix (shown unprefixed above).