parapet

A composable reverse proxy framework for Go. Parapet is a library, not a binary: you build your edge or backend by importing the pieces you need and chaining them together with Use.

Install

go get github.com/moonrhythm/parapet

Requires Go 1.25 or later.

Concepts

Middleware — anything that satisfies ServeHandler(http.Handler) http.Handler. Every feature in this library is a Middleware.
Middlewares — an ordered slice of Middleware, applied in reverse order so the first Use call sits outermost (like an onion).
Server — wraps http.Server with a middleware chain and adds TLS, H2C, graceful shutdown (30 s grace, 10 s wait), and reuseport.
Composition helpers — Cond{If, Then, Else} branches inline without a block; Handler adapts an http.HandlerFunc into a terminal middleware; MiddlewareFunc and Server.UseFunc let a plain func(http.Handler) http.Handler be used directly.

Three constructors pick sensible defaults for the role the server plays:

Constructor	Use for	Notable defaults
`parapet.New()`	General purpose, behind another proxy	Trusts standard private CIDRs, long idle timeout
`parapet.NewFrontend()`	Edge / internet-facing	Read/write/header timeouts, no trusted proxies
`parapet.NewBackend()`	Internal service behind parapet	H2C enabled, trusts private CIDRs

Packages

Each subdirectory under pkg/ is a self-contained middleware:

Package	What it does
`upstream`	Reverse proxy and load balancing (round-robin, weighted, least-conn, ejecting, circuit-breaking, latency-ejecting, hedging) with active or passive health checks, automatic retries, over HTTP, H2C, HTTPS, or a Unix socket
`host`	Virtual-host routing on the `Host` header — wildcard prefixes, a `*` catch-all, and CIDR matching (`NewCIDR`), plus `StripPort`/`ToLower` normalizers
`location`	Path routing — exact, segment-boundary prefix, and regexp matchers
`router`	Simple URL router — subtree dispatch, falls through to the chain when no pattern matches
`block`	Conditional middleware container — match a request, then apply an inner chain (a nil matcher makes it an unconditional catch-all)
`mirror`	Traffic shadowing — tee a copy of matched/sampled requests to a canary, fire-and-forget
`ratelimit`	Fixed-window (in-memory or Redis-backed for a global limit), sliding-window, concurrent (drop-on-full or bounded-queue), and leaky-bucket limiters, with an `Observe` hook
`compress`	Content-negotiated compression (Gzip, Brotli, Deflate, Zstd; Brotli needs the `cbrotli` build tag)
`cache`	HTTP response cache — honor-origin policy, in-memory or disk backend, single-flight fills, `X-Cache` tag
`cache/purge`	Cache invalidation — purge by host, URL, path prefix, or surrogate tag, plus a reaper
`body`	Request body limiting (custom over-limit handler; default `413`) and buffering (small bodies in memory, large ones spilled to a temp file with a known `Content-Length`)
`headers`	Request/response header manipulation
`cors`	CORS handling — allow-list via `AllowOriginFunc` (or `AllowOrigins(...)`); a disallowed `Origin` is rejected with `403`
`hsts`	`Strict-Transport-Security` (with preload)
`redirect`	HTTPS (driven by `X-Forwarded-Proto`), www/non-www, and arbitrary redirects — `301` by default, configurable `StatusCode`
`requestid`	Inject and propagate a request ID — validated, configurable header, `TrustProxy` for edge use
`logger`	Structured request logging
`healthz`	Liveness and readiness endpoint (readiness drains on graceful shutdown)
`timeout`	Per-request deadlines — `Timeout` (time to response headers) and `RequestDeadline` (whole request, headers + body)
`fileserver`	Static file serving — optional directory listing, falls through to the chain on 404, path-confined to root (symlink-safe)
`stripprefix`	Strip a URL path prefix before proxying
`authn`	JWT and basic-auth helpers
`waf`	Web application firewall driven by CEL expressions, hot reloadable
`prom`	Prometheus metrics — server (requests, connections, bytes), upstream, cache, WAF, rate-limit, and mirror collectors, plus a `/metrics` handler
`proxyprotocol`	HAProxy PROXY protocol (v1/v2) — recover the real client IP behind an L4 load balancer
`h2push`	HTTP/2 server push — a fixed link, or driven by the upstream's `Link: rel=preload` response headers
`gcs`	Serve static content from a Google Cloud Storage bucket — sets `Content-Type`/`Cache-Control` from object metadata, with main-page, not-found-page, and fallback-handler support
`gcp`, `stackdriver`, `trace`	Google Cloud integrations (LB real-client-IP extraction via `gcp.HLBImmediateIP`) and distributed tracing

Example

package main

import (
	"log"

	"github.com/moonrhythm/parapet"
	"github.com/moonrhythm/parapet/pkg/body"
	"github.com/moonrhythm/parapet/pkg/compress"
	"github.com/moonrhythm/parapet/pkg/headers"
	"github.com/moonrhythm/parapet/pkg/healthz"
	"github.com/moonrhythm/parapet/pkg/host"
	"github.com/moonrhythm/parapet/pkg/hsts"
	"github.com/moonrhythm/parapet/pkg/location"
	"github.com/moonrhythm/parapet/pkg/logger"
	"github.com/moonrhythm/parapet/pkg/ratelimit"
	"github.com/moonrhythm/parapet/pkg/redirect"
	"github.com/moonrhythm/parapet/pkg/requestid"
	"github.com/moonrhythm/parapet/pkg/upstream"
)

func main() {
	s := parapet.New()
	s.Use(logger.Stdout())
	s.Use(requestid.New())
	s.Use(ratelimit.FixedWindowPerSecond(60))
	s.Use(ratelimit.FixedWindowPerMinute(300))
	s.Use(ratelimit.FixedWindowPerHour(2000))
	s.Use(body.LimitRequest(15 * 1024 * 1024)) // 15 MiB
	s.Use(body.BufferRequest())
	s.Use(compress.Gzip())
	s.Use(compress.Br())

	// sites
	s.Use(example())
	s.Use(mysite())
	s.Use(wordpress())

	// health check
	{
		l := location.Exact("/healthz")
		l.Use(logger.Disable())
		l.Use(healthz.New())
		s.Use(l)
	}

	s.Addr = ":8080"
	if err := s.ListenAndServe(); err != nil {
		log.Fatal(err)
	}
}

func example() parapet.Middleware {
	h := host.New("example.com", "www.example.com")
	h.Use(ratelimit.FixedWindowPerSecond(20))
	h.Use(redirect.HTTPS())
	h.Use(hsts.Preload())
	h.Use(redirect.NonWWW())
	h.Use(upstream.New(upstream.NewRoundRobinLoadBalancer([]*upstream.Target{
		{Host: "example.default.svc.cluster.local:8080", Transport: &upstream.HTTPTransport{}},
		{Host: "example1.default.svc.cluster.local:8080", Transport: &upstream.H2CTransport{}},
		{Host: "myexamplebackuphost.com", Transport: &upstream.HTTPSTransport{}},
	})))
	return h
}

func mysite() parapet.Middleware {
	var hs parapet.Middlewares

	{
		h := host.New("mysiteaaa.io", "www.mysiteaaa.io")
		h.Use(ratelimit.FixedWindowPerSecond(15))
		h.Use(redirect.HTTPS())
		h.Use(hsts.Preload())
		h.Use(redirect.WWW())
		h.Use(headers.DeleteResponse(
			"Server",
			"x-goog-generation",
			"x-goog-hash",
			"x-goog-meta-goog-reserved-file-mtime",
			"x-goog-metageneration",
			"x-goog-storage-class",
			"x-goog-stored-content-encoding",
			"x-goog-stored-content-length",
			"x-guploader-uploadid",
		))
		h.Use(upstream.SingleHost("storage.googleapis.com", &upstream.HTTPSTransport{}))
		hs.Use(h)
	}

	{
		h := host.New("mail.mysiteaaa.io")
		h.Use(redirect.HTTPS())
		h.Use(hsts.Preload())
		h.Use(redirect.To("https://mail.google.com/a/mysiteaaa.io", 302))
		hs.Use(h)
	}

	return hs
}

func wordpress() parapet.Middleware {
	h := host.New("myblogaaa.com", "www.myblogaaa.com")
	h.Use(ratelimit.FixedWindowPerMinute(150))
	h.Use(redirect.HTTPS())
	h.Use(hsts.Preload())
	h.Use(redirect.NonWWW())

	backend := upstream.SingleHost("wordpress.default.svc.cluster.local", &upstream.HTTPTransport{})

	l := location.RegExp(`\.(js|css|svg|png|jp(e)?g|gif)$`)
	l.Use(headers.SetResponse("Cache-Control", "public, max-age=31536000"))
	l.Use(backend)
	h.Use(l)

	h.Use(backend)

	return h
}

Rate limiting

ratelimit ships several strategies, all keyed per-client by default (ClientIP, which reads X-Real-IP):

Fixed window — FixedWindowPerSecond/Minute/Hour(n).
Sliding window — SlidingWindowPerSecond/Minute/Hour(n), smoother at the boundary.
Leaky bucket — LeakyBucket(perRequest, size) admits one request per perRequest interval, queueing up to size before dropping.
Concurrent — Concurrent(n) drops at capacity; ConcurrentQueue(capacity, size) instead queues up to size before dropping.

Override RateLimiter.Key to limit by something other than IP, and ExceededHandler to change the over-limit response (default 429 with Retry-After). Wire Observe (and a bounded Name label) to count decisions:

rl := ratelimit.FixedWindowPerSecond(60)
rl.Name = "api"
rl.Observe = prom.RateLimit() // parapet_ratelimit_total{name,result=allowed|limited}
s.Use(rl)

Distributed (Redis-backed) rate limiting

RedisFixedWindowPerSecond/Minute/Hour(runner, rate) enforce one global limit across a fleet of proxies. parapet pulls in no Redis client — inject one through the tiny RedisRunner interface (a RedisRunnerFunc wraps any client):

runner := ratelimit.RedisRunnerFunc(func(ctx context.Context, script string, keys []string, args ...any) (int64, error) {
    return myRedis.Eval(ctx, script, keys, args...).Int64()
})
s.Use(ratelimit.RedisFixedWindowPerSecond(runner, 1000))

The constructors fail open on a Redis error or timeout (admit, trading strict limiting for availability) — the zero-value RedisFixedWindowStrategy{} fails closed. Because a fail-open admit lands in result="allowed", wire strategy.OnError = prom.RateLimitRedisError() to surface the otherwise-silent parapet_ratelimit_redis_errors_total — the alertable "Redis is down, limits aren't enforced" signal. Tunables: Max, Size, Prefix (default parapet:rl:), Timeout (default 100 ms).

Compression

compress negotiates a response encoding from Accept-Encoding: Gzip(), Deflate(), Zstd(), and Br() (Brotli). Each returns a tunable *Compress whose Types (MIME allow-list, * for all), MinLength (default 860 bytes), and Vary (default on) you can set; it skips already-encoded responses, WebSocket upgrades, and bodies below MinLength. Level variants exist: GzipWithLevel(int), ZstdWithLevel(zstd.EncoderLevel), and BrWithQuality(int) / BrWithOption(cbrotli.WriterOptions).

⚠️ Brotli requires the cbrotli build tag (and CGO + the google/brotli C library). Without it, compress.Br() compiles to a no-op pass-through — requests negotiate down to another encoding silently. Build with -tags cbrotli to enable real Brotli; BrWithOption only exists under that tag.

WAF with CEL rules

The waf package runs CEL expressions against incoming requests. Rules compile inside SetRules, so the hot path never parses or type-checks, and rules can be swapped atomically at runtime.

import (
	"net/http"

	"github.com/moonrhythm/parapet"
	"github.com/moonrhythm/parapet/pkg/waf"
)

w := waf.New()
_ = w.SetRules([]waf.Rule{{
	ID:         "block-sqli",
	Expression: `request.query.contains("' OR '1'='1") || request.path.matches("(?i).*union.*select.*")`,
	Action:     waf.ActionBlock,
	Status:     http.StatusForbidden,
}})

s := parapet.NewFrontend()
s.Use(w)

See pkg/waf/doc.go for the full list of request.* fields and helper functions exposed to expressions.

Each rule's Action is one of three:

Action	Effect
`ActionBlock`	Terminate the request with the rule's `Status` (default `403`) and `Message`.
`ActionAllow`	Short-circuit the chain — forward the request immediately and stop evaluating further rules. An explicit allowlist for trusted health-checkers or internal scanners; give it a low `Priority` so it runs first.
`ActionLog`	Record the match (via `WAF.Logger` / `WAF.OnMatch`) and keep evaluating. Shadow-deploy a new rule before switching it to `ActionBlock`.

Fail-open by default. A rule that errors at evaluation time — a recovered panic, a type mismatch, an exceeded CostLimit, or the per-request EvalTimeout (default 5 ms) — is logged and the request is allowed through, the safer default for a reverse proxy. Set w.FailMode = waf.FailClosed to instead reject such requests with 500. CostLimit (CEL evaluation cost per rule) and DisableMacros harden the evaluator when rules come from a less-trusted source.

GeoIP / ASN filtering. The WAF stays storage-agnostic: supply w.Country func(*http.Request) string and w.ASN func(*http.Request) int64 (backed by a GeoIP database or an edge header) and rules can test request.country == "TH" or request.asn == 13335. Both keys are always present to expressions (empty string / 0 when unresolved), so referencing them never errors.

Observability. w.OnMatch fires per matched rule (for custom metrics and alerts by rule ID); w.Observe = prom.WAF() fires once per evaluated request and records the rule-eval latency histogram parapet_waf_eval_duration_seconds{outcome} (pass/allow/block/error) — covering the common no-match path OnMatch can't see.

w := waf.New()
w.FailMode = waf.FailClosed       // reject on rule-eval error instead of allowing
w.Country = geoipLookup           // func(*http.Request) string
w.Observe = prom.WAF()            // rule-eval latency by outcome
_ = w.SetRules([]waf.Rule{
    {ID: "allow-internal", Expression: `ipInCidr(request.remote_ip, "10.0.0.0/8")`, Action: waf.ActionAllow, Priority: 100},
    {ID: "geo-block", Expression: `request.country == "XX"`, Action: waf.ActionBlock, Status: http.StatusForbidden},
})

SetRules validates and compiles the whole set atomically: a duplicate ID, an empty or non-boolean expression, or a compile error rejects the entire batch and leaves the previously loaded ruleset serving, so a bad deploy can't brick the WAF.

Authentication

The authn package ships JWT (static key or remote JWKS), HTTP Basic, and forward (external auth-server) authenticators. All wrap a common authn.Authenticator base, which you can use directly for a custom scheme.

JWT (bearer tokens)

The authn package verifies Authorization: Bearer tokens with authn.JWT. The accepted signature algorithms are pinned by the caller — a token signed with any other algorithm (including none) is rejected, which prevents algorithm-confusion attacks. The signature, exp/nbf (with leeway), and optional iss/aud claims are all verified; verified claims are placed on the request context for downstream handlers.

Algorithms are pinned with this package's own constants (authn.HS256, authn.RS256, …), so callers don't import a JOSE library — only pkg/authn.

import "github.com/moonrhythm/parapet/pkg/authn"

m := authn.JWT([]byte(secret), authn.HS256) // []byte for HMAC; a public key for RS*/ES*/EdDSA
m.Issuer = "https://issuer.example.com"
m.Audience = "my-api"
m.Leeway = 30 * time.Second // clock-skew tolerance on exp/nbf/iat; default 1m when zero
m.Realm = "my-api"          // realm reported in the WWW-Authenticate challenge on rejection
s.Use(m)

// downstream
claims, ok := authn.JWTClaimsFromContext(r.Context())

Rotating keys from a remote JWKS

For tokens signed by an OIDC provider (Auth0, Okta, Google, …), verify against the provider's jwks_uri instead of a static key with authn.JWKS. It fetches the key set over HTTP and caches it, picking up signing-key rotation without a restart: a stale cache is refreshed in the background while the last good set keeps serving, and a token bearing an unknown kid triggers a single-flighted refetch (rate-limited so bogus kids can't hammer the endpoint). Refresh failures are fail-static — once a set has been fetched, a later fetch error never starts rejecting valid tokens. The algorithm allowlist is still mandatory and enforced exactly as above.

m := authn.JWTFromKeySource(
	&authn.JWKS{URL: "https://issuer.example.com/.well-known/jwks.json"},
	authn.RS256, // pin the accepted algorithm(s)
)
m.Issuer = "https://issuer.example.com"
m.Audience = "my-api"
s.Use(m)

JWKS exposes RefreshInterval (cache TTL, default 15m), MinRefreshInterval (unknown-kid refetch rate limit, default 1m), Client, and MaxResponseBytes.

Basic authentication

authn.Basic(user, pass) checks Authorization: Basic credentials with a constant-time comparison (crypto/subtle), so timing leaks neither which field mismatched nor how many bytes matched. Realm is emitted in the WWW-Authenticate challenge. For a backend-backed check, set BasicAuthenticator.Authenticate and return authn.ErrInvalidCredentials on a miss:

m := authn.Basic("admin", "s3cret")
m.Realm = "admin"
s.Use(m)

// or verify against your own store:
m := &authn.BasicAuthenticator{
    Realm: "admin",
    Authenticate: func(r *http.Request, user, pass string) error {
        if !checkUser(user, pass) {
            return authn.ErrInvalidCredentials
        }
        return nil
    },
}

Forward authentication (external auth server)

authn.Forward delegates the decision to a separate auth service — the same "auth request" / auth_request model as nginx and Traefik's ForwardAuth. It issues a GET to the configured URL, allows the request on a 2xx and rejects it otherwise, relaying the auth server's status, headers, and body verbatim. It injects X-Forwarded-Method/-Host/-Uri (plus -Proto/-For) so the auth server can see the original request, and on a transport error it fails with 503 Auth Server Unavailable rather than leaking the error.

u, _ := url.Parse("http://auth.default.svc.cluster.local/auth")
m := authn.Forward(u)
m.AuthRequestHeaders = []string{"Cookie", "Authorization"} // default: all request headers (minus Content-Length)
m.AuthResponseHeaders = []string{"X-Auth-User"}            // copied from the auth response onto the forwarded request
s.Use(m)

By default every request header is forwarded to the auth server; set AuthRequestHeaders to forward only a subset. AuthResponseHeaders are copied from the auth response onto the downstream request, so the backend receives identity headers (e.g. X-Auth-User) the auth server resolved.

Response caching

The cache package is a CDN-style, honor-origin response cache. It caches a response only when the origin opts in with explicit freshness (Cache-Control: s-maxage/max-age or Expires); refuses private/no-store/no-cache, Set-Cookie, and Vary: *; honors Vary; serves GET/HEAD only; and ignores the client's request Cache-Control so a client can't bust the shared cache. Concurrent misses for one key collapse into a single origin fetch (single-flight), and it's fail-static — any storage error degrades to a miss, never an error to the client. Every response is tagged X-Cache: HIT|MISS.

Two storage backends ship: an in-memory one (lost on restart) and a disk-backed one (survives restarts, streams bodies to disk). Both bound their total size with LRU eviction plus a per-object cap. cache.Storage is a public interface (with EntryWriter and Meta) — implement it to back the cache with your own store (Redis, S3, …). Mount it ahead of the upstream/handler whose responses it should cache.

import "github.com/moonrhythm/parapet/pkg/cache"

// Disk-backed, 1 GiB on disk, 8 MiB per object; or cache.NewMemory(size) for RAM.
store, err := cache.NewDisk("/var/cache/app", 1<<30)
if err != nil {
	log.Fatal(err)
}

h := host.New("static.example.com")
h.Use(cache.New(store, cache.Options{MaxFileSize: 8 << 20}))
h.Use(upstream.SingleHost("origin.default.svc.cluster.local", &upstream.HTTPTransport{}))
s.Use(h)

Options also exposes Cacheable (a per-request predicate to exclude vetted paths), InvalidatedAfter (out-of-band purge), LockTimeout, and DecoupleFill (keep a slow client from stalling waiting followers). Because only origin-opted-in public content is cached, mark per-user or authorization-sensitive responses uncacheable at the origin.

Forcing caching for an origin you don't control

Options.Override is a hook that returns a forced caching policy, overriding the origin's Cache-Control — so you can cache an origin that sends no (or unwanted) cache headers. It is called on each GET/HEAD fill with the request and the origin's response (status + headers), so the decision can key on anything in the request (host, path, extension) and the response (Content-Type, Content-Length, status). Return nil to honor the origin. The forced policy is baked into the stored entry only, so the served Cache-Control stays the origin's and doesn't propagate downstream.

cache.New(store, cache.Options{
    Override: func(r *http.Request, status int, header http.Header) *cache.Override {
        switch {
        case status != http.StatusOK:
            return nil                                       // only force 200s
        case strings.HasPrefix(header.Get("Content-Type"), "image/"):
            return &cache.Override{TTL: time.Hour}           // force images for 1h
        case r.Host == "static.example.com" && strings.HasSuffix(r.URL.Path, ".js"):
            return &cache.Override{TTL: 24 * time.Hour}      // force this host's JS for a day
        default:
            return nil                                        // everything else: respect upstream
        }
    },
})

status and header are the live origin response — read them, don't mutate them.

Override.Mode chooses how far the force reaches over the origin's own directives — the safety trade-off is yours per request:

Mode	Overrides	Still refuses
`OverrideBalanced` (default)	missing freshness, `no-cache`, `max-age`, `Expires`	`no-store`, `private`, `Set-Cookie`, `Vary: *`, non-cacheable status, oversize, `Authorization` without a shared opt-in
`OverrideConservative`	only missing freshness	everything the origin says (`no-cache`/`no-store`/`private`/`max-age` all honored)
`OverrideAggressive`	almost everything, incl. `no-store`/`private`/`Authorization`	`Set-Cookie`, `Vary: *`, non-cacheable status, oversize

⚠️ Forcing trusts you to target cacheable paths. The cache key ignores the request's Cookie and Authorization, so don't force per-user paths: even OverrideBalanced will cross-user-leak a response gated by a session Cookie when the origin sends no Set-Cookie/private/no-store. OverrideAggressive additionally bypasses the Authorization gate. Scope the hook to known-public paths (or use Options.Cacheable).

Override.StaleWhileRevalidate / StaleIfError force the RFC 5861 windows too (see below). For an unconditional default instead of a per-request hook, use Options.DefaultStaleWhileRevalidate / DefaultStaleIfError.

Stale serving (RFC 5861)

When the origin sets Cache-Control: stale-while-revalidate=<s> or stale-if-error=<s> on a cacheable response, the cache may serve the entry after it goes stale:

stale-while-revalidate — within the window, a stale entry is served immediately (X-Cache: STALE) while a single background revalidation refreshes it, so the client never waits on the origin. The detached fetch is bounded by Options.RevalidateTimeout (default 30s).
stale-if-error — past any stale-while-revalidate window, the cache contacts the origin and, only if it answers with a server error (5xx), serves the stale entry instead of the error (X-Cache: STALE).

must-revalidate/proxy-revalidate suppress both. The client's request Cache-Control is ignored (only the origin's response directives are honored), consistent with the rest of the cache. Note that an entry offering these windows is retained in storage until it is past the larger window (not just past freshness), so stale-if-error still has something to fall back to — total size remains bounded by the backend's LRU cap.

Forcing stale serving for an origin you don't control. Set Options.DefaultStaleWhileRevalidate / Options.DefaultStaleIfError to apply a window to any cacheable response that doesn't carry the directive itself. An explicit directive on the response still wins, and must-revalidate/proxy-revalidate still suppress it. These stay private to this cache — the served Cache-Control remains the origin's, so the policy doesn't propagate to downstream clients or caches.

cache.New(store, cache.Options{
    DefaultStaleWhileRevalidate: 30 * time.Second,
    DefaultStaleIfError:         24 * time.Hour,
})

Alternatively, inject the directive with a headers middleware mounted below the cache (so the cache sees it on the response). The cache parses every Cache-Control header, so this adds the windows without clobbering the origin's max-age — but unlike the options above, the injected directive is served to clients:

h.Use(cache.New(store, cache.Options{}))                                       // outer
h.Use(headers.AddResponse("Cache-Control", "stale-while-revalidate=30"))       // below the cache
h.Use(upstream.SingleHost("origin...", &upstream.HTTPTransport{}))             // inner

Purging

cache/purge invalidates cached entries by host, URL, path prefix, or surrogate tag (the origin's Cache-Tag). A purge.Table plugs into Options.InvalidatedAfter; invalidation is lazy (issuing a purge is O(1), a purged entry is reclaimed on its next lookup) and immediate (a purged entry is never served). Memory is bounded — an overflowing scope map folds into a global flush — and epochs are monotonic, so an NTP step-back can't un-purge.

pt := purge.New()
c := cache.New(store, cache.Options{InvalidatedAfter: pt.InvalidatedAfter})

pt.PurgeURL("example.com", "/a")        // one URL: all methods/schemes/Vary variants
pt.PurgePrefix("example.com", "/blog")  // a section, boundary-aware (/blog, not /blogger)
pt.PurgeTag("product-42")               // every response carrying this surrogate key, any host
pt.FlushAll()                           // everything

go func() { for range time.Tick(5 * time.Minute) { pt.Reap(store) } }() // proactively reclaim bytes

Snapshot/Restore serialize the table so purges survive a restart (persist however you like), and Table.Stats() returns a snapshot of per-scope record counts and the cap-fold count for diagnostics. The per-scope cap that triggers the global-flush fold is tunable with purge.New(purge.WithMaxRecords(n)) (default 65536). It's the engine parapet-ingress-controller builds its control-plane purge distribution on top of.

Surrogate keys come from the origin's Cache-Tag header; it is captured but left on the response (strip it at the origin if it must not reach clients), and an entry keeps at most 64 tags of up to 256 chars each.

Observability

Options.OnResult (a cache.ResultFunc) is called once per served request with the outcome, exposing two states the X-Cache header can't: a stale-if-error fallback (also STALE on the wire) and a BYPASS (which sends no X-Cache at all). Two ready-made consumers ship:

cache.New(store, cache.Options{OnResult: prom.Cache()}) // metrics
// or cache.LogResult to add a `cacheStatus` field to the access log

prom.Cache() emits parapet_cache_total{host,result} (HIT|MISS|STALE|STALE_ERROR|BYPASS; hit ratio = HIT / all) and parapet_cache_fill_duration_seconds{host} (origin-fill latency, observed only when the origin is contacted).

Weighted and least-connection load balancing

upstream.NewRoundRobinLoadBalancer weights every target equally. Two strategies bias by a per-Target Weight (values <= 0 count as 1), each optimizing a different axis:

upstream.NewWeightedRoundRobinLoadBalancer distributes request count in proportion to weight, using smooth weighted round-robin (the nginx algorithm), so a heavy target's picks are interleaved rather than dealt in a burst. With equal weights it is plain round-robin.
upstream.NewLeastConnLoadBalancer routes each request to the target with the fewest in-flight requests (weighted: lowest active/Weight), so it tracks concurrency rather than count — which adapts to slow backends and long-lived requests a count-based balancer misses. A request stays counted until its response body is closed, which parapet's reverse proxy always does. Set Target.MaxConcurrent to cap a target's in-flight requests (the bulkhead pattern): the cap is hard and never exceeded, surplus requests route to an under-cap target, and when every target is full the balancer sheds with 503 rather than overloading a saturated origin. A slot is freed only when the response body is closed, so bound total request time (a request-context deadline the transport honors) to keep a backend that stalls mid-body from latching the cap — a response-header or idle timeout alone does not cover a mid-body stall.

s.Use(upstream.New(upstream.NewWeightedRoundRobinLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}, Weight: 3},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}, Weight: 1},
})))

Make the bulkhead observable: set lb.OnShed = prom.UpstreamShed() to count load-shed events by cause in parapet_upstream_shed_total{reason} (saturated = bulkhead full, all_dark = the active-health-check pool is down, empty = no targets), and call prom.UpstreamInflight(lb) to export scrape-time gauges parapet_upstream_inflight{host} and parapet_upstream_inflight_capacity{host} (from LeastConnLoadBalancer.Inflight()). A target pinned at inflight/capacity == 1 is the one driving shed_total{reason="saturated"}.

Load balancing with passive health checks

upstream.NewRoundRobinLoadBalancer spreads requests evenly but keeps routing to a dead backend. upstream.NewEjectingLoadBalancer adds passive health checking (outlier ejection): after a target returns MaxFails consecutive failures it is ejected from rotation for EjectTimeout (doubling on each repeat ejection, up to MaxEjectTimeout), then allowed back with no background probing. A single success clears its failure count. If every target is ejected the balancer fails open and keeps routing, so a transient outage cannot black-hole all traffic.

lb := upstream.NewEjectingLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}},
})
lb.MaxFails = 3                      // consecutive failures before ejection
lb.EjectTimeout = 30 * time.Second   // base cooldown
s.Use(upstream.New(lb))

By default only transport errors (other than a client-canceled request) count as failures. Set lb.IsFailure to also treat responses such as 5xx as failures:

lb.IsFailure = func(resp *http.Response, err error) bool {
	return err != nil || (resp != nil && resp.StatusCode >= 500)
}

Pair it with prom.Upstream() (wired into Upstream.OnRoundTrip) to watch ejections take effect: traffic shifts off a failing backend in parapet_upstream_requests{host,status}. Wire each reliability balancer's OnStateChange to prom.UpstreamState() to make the state machine itself observable — parapet_upstream_state_transitions_total{host,from,to,reason} (trips, ejections, recoveries, half-open probes) plus a current-state gauge and, from ActiveHealthCheck, parapet_upstream_probe_down_total{host,cause} (cause = timeout/refused/reset/dns/tls/status/error). prom.Upstream() also records a per-target time-to-first-byte histogram parapet_upstream_request_duration_seconds{host} (once per round-trip attempt, so retries count individually) and counts fail-fast 503s in parapet_upstream_fast_rejects_total.

upstream.NewCircuitBreakingLoadBalancer goes a step further: it fails fast. An open target is rejected without a round-trip (so a request never pays the dead backend's connect+timeout), and when every target is open it returns 503 rather than failing open — shedding load instead of hammering a dead origin. After FailureThreshold consecutive failures a target opens for OpenTimeout (doubling per repeat trip, up to MaxOpenTimeout), then admits a small half-open trickle (HalfOpenMaxProbes) to test recovery: SuccessThreshold successes close it, one failure re-opens it.

lb := upstream.NewCircuitBreakingLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}},
})
lb.FailureThreshold = 5
lb.OpenTimeout = 5 * time.Second
s.Use(upstream.New(lb))

Use EjectingLoadBalancer when you want fail-open (keep routing during a total outage); use CircuitBreakingLoadBalancer when you want fail-fast (shed load). The same IsFailure hook applies. Both ignore Target.Weight.

upstream.NewLatencyEjectingLoadBalancer catches what those two miss — a gray failure, a backend still returning 200s but far slower than its peers. A target whose decayed mean time-to-first-byte exceeds EjectionFactor × the pool median is ejected and re-probed. Because the test is relative to the pool, it self-tunes: a uniform slowdown raises every target and the median together, so no one is an outlier (guard rails — a max-ejection cap and a panic threshold — keep a systemic slowdown from draining the pool). It is latency-only: pair it with the circuit breaker or EjectingLoadBalancer for error ejection.

lb := upstream.NewLatencyEjectingLoadBalancer([]*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.2:8080", Transport: &upstream.HTTPTransport{}},
	{Host: "10.0.0.3:8080", Transport: &upstream.HTTPTransport{}},
})
lb.EjectionFactor = 3 // eject a target 3× slower than the pool median
s.Use(upstream.New(lb))

The guard rails are tunable too: MaxEjectionPercent (default 30) caps how much of the pool can be ejected at once, and PanicThreshold (default 50) stops ejecting entirely when too many targets look slow. Other knobs — MinSamples (100), MinHosts (3), HalfLife (10 s decay), MinEjectDelta (50 ms), MinEjectLatency (off), and EjectTimeout/MaxEjectTimeout (30 s → 5 m backoff) — have sensible defaults.

Upstream transports and routing

A Target.Transport picks the wire protocol: HTTPTransport, H2CTransport, HTTPSTransport, UnixTransport (dial a Unix socket), or the dynamic multi-scheme Transport. Each of the first three and the dynamic one exposes an optional DialContext seam to replace the default net.Dialer — to observe dial errors, re-resolve endpoints, or wrap the connection. (Setting DialContext makes DialTimeout a no-op — the custom dialer owns its timeouts; UnixTransport has no seam.)

Upstream also rewrites the proxied request: Upstream.Host overrides the Host header sent to the backend, and Upstream.Path prefixes a base path onto the request path so a backend can be mounted under a subpath (the inverse of stripprefix).

Retries

Every upstream.Upstream automatically retries a failed transport round-trip up to Retries times (default 3) with exponential backoff (BackoffFactor << attempt, default base 50 ms). Only eligible requests are retried: by default an idempotent method (GET/HEAD/OPTIONS/TRACE) whose body is absent or rewindable (r.GetBody != nil, so each attempt can resend the full body). Set Upstream.RetryPolicy to widen eligibility (e.g. an idempotent PUT/DELETE) or narrow it.

up := upstream.New(lb)
up.Retries = 2
up.BackoffFactor = 20 * time.Millisecond
s.Use(up)

⚠️ Retry amplification. An eligible request can hit upstreams up to Retries+1 times, and if the same Upstream is fronted by a HedgingLoadBalancer, each attempt fans out to MaxHedge more — worst case ≈ (Retries+1) × (MaxHedge+1) origin calls. Never mark a non-idempotent request retryable: a retried POST can double-apply a side effect.

Hedging (speculative retry)

upstream.NewHedgingLoadBalancer wraps any balancer to cut tail latency: if an idempotent, body-less request hasn't responded within HedgeDelay, it sends a duplicate to another target (the wrapped balancer self-selects a different one), returns whichever response arrives first, and cancels the loser. The race happens inside the RoundTripper, so the proxy only ever sees the winner.

h := upstream.NewHedgingLoadBalancer(lb) // lb is any balancer
h.HedgeDelay = 30 * time.Millisecond     // ~p95; <= 0 disables (zero-cost pass-through)
s.Use(upstream.New(h))

MaxHedge (default 1) caps the fan-out. Non-idempotent requests, and a request already inside the retry loop, pass straight through. Because losing legs are cancelled with context.Canceled, a custom IsFailure on the wrapped balancer must exclude it (the default does), or hedging would slowly eject the healthy backend it raced.

Three hooks tune the race. HedgeOnError launches the next hedge immediately on a losing transport error instead of waiting out HedgeDelay — it is on when built with NewHedgingLoadBalancer but off for a bare HedgingLoadBalancer{} literal, so prefer the constructor. IsHedgeable overrides the default eligibility rule (idempotent, body-less, not already retrying), and IsWinner overrides the default "any non-error response wins" predicate — e.g. to keep racing when a leg returns a 5xx.

Active health checks

The balancers above are passive — they learn a target is unhealthy only from real traffic's failures. upstream.NewActiveHealthCheck adds active probing: it wraps any balancer and probes each target out-of-band (one background goroutine per target), routing only to those answering. Pass the same []*Target to both the balancer and the wrapper so their indices line up:

targets := []*upstream.Target{
	{Host: "10.0.0.1:8080", Transport: tr},
	{Host: "10.0.0.2:8080", Transport: tr},
}
ahc := upstream.NewActiveHealthCheck(targets, upstream.NewRoundRobinLoadBalancer(targets))
ahc.Path = "/healthz"
ahc.Interval = 5 * time.Second
ahc.UnhealthyThld = 3 // down after 3 consecutive failed probes; HealthyThld re-admits
s.Use(upstream.New(ahc))

Active and passive compose: the health gate only removes candidates, and the wrapped balancer keeps its own strategy over the survivors — a weighted balancer keeps its exact ratio, the circuit breaker still trips, least-conn still balances. A target must pass both to be picked. When the gate marks every target down, each balancer falls back to its own all-down policy: round-robin / ejecting / latency-ejecting / least-conn route best-effort (so a broken probe path can't 503 a whole healthy pool), while the circuit breaker still sheds. (Least-conn still sheds on its capacity cap — MaxConcurrent — independently of health.)

Probing auto-starts on the first request and, when served by a parapet.Server, stops on graceful shutdown. For a bare RoundTripper, or to bound the prober's lifetime explicitly, call ahc.Start(ctx) before serving and ahc.Close() after. By default a slot probes through each Target.Transport (exercising the real pool); set ProbeTransport to isolate probe traffic. A held probe is bounded by Timeout, and targets begin up by default (StartUnhealthy flips to fail-closed) so a misconfigured probe path cannot black-hole a fresh deploy. The probe uses http; for a target on the dynamic multi-scheme Transport set ahc.Scheme to "h2c" or "unix" (the dedicated transports force their own scheme and ignore it). The probe method defaults to GET (ahc.Method overrides it), and ahc.IsHealthy overrides the default "a non-error response with status < 400 is healthy" check.

Request timeouts

timeout offers two deadlines that bound different spans:

timeout.New(d) (Timeout) is a write-header deadline. It fires only until the upstream writes response headers, then disarms — a backend that sends headers and stalls mid-body is not bounded by it. On expiry it sends a default 504 Gateway Timeout; set Timeout.TimeoutHandler to customize that response (e.g. add a Retry-After).
timeout.NewRequestDeadline(d) (RequestDeadline) is a total-request deadline (headers + body), implemented as a request-context deadline the upstream transports honor — so it does abort a backend that stalls mid-body (the gap that lets a stalled stream latch a MaxConcurrent bulkhead slot). It writes no response of its own; the cancelled context propagates and upstream surfaces a 502/504.

api := location.Prefix("/api")
api.Use(timeout.NewRequestDeadline(30 * time.Second)) // total time for these routes
api.Use(upstream.New(lb))
s.Use(api)

⚠️ A total-request deadline kills Server-Sent Events, streaming responses, WebSocket-style upgrades, and large downloads. Never apply RequestDeadline globally — scope it per-route (via location or block) and exclude streaming endpoints.

Choosing a reliability primitive

pkg/upstream has grown a stack of reliability primitives; reach for one by the failure you are defending against. They compose — ActiveHealthCheck and NewHedgingLoadBalancer each wrap any balancer — so the owning balancer handles the dominant failure mode and the wrappers layer on top.

Failure mode	Reach for
Flaky backend, hard 5xx / errors	`NewEjectingLoadBalancer` (keeps routing during a total outage) — or the circuit breaker if you would rather shed
Dead / brownout origin, fail fast and shed	`NewCircuitBreakingLoadBalancer` (rejects an open target with no round-trip)
Tail latency (p99) on a healthy pool	`NewHedgingLoadBalancer` (race a duplicate after `HedgeDelay`)
Gray failure: 200s but one host far slower than peers	`NewLatencyEjectingLoadBalancer` (relative to the pool median)
Overload: a slow backend draining the pool	`Target.MaxConcurrent` on `NewLeastConnLoadBalancer` + a total-request-deadline middleware (`pkg/timeout`)
Cold deploy / readiness / black-holing a fresh pod	`NewActiveHealthCheck` (probe out-of-band; route only to answering targets)
Uneven backend capacity	`NewWeightedRoundRobinLoadBalancer` (by count) or `NewLeastConnLoadBalancer` (by concurrency)

All-down semantics — know this before an incident. When every target is out, the primitives diverge, and which one you ran decides whether a correlated outage degrades or hard-fails:

Primitive	When all targets are out
Round-robin / weighted / least-conn (health) / ejecting / latency-ejecting	Fail open — route best-effort (a degraded answer beats none; a broken signal must not black-hole a healthy pool)
`NewLeastConnLoadBalancer` (capacity, every target at `MaxConcurrent`)	Shed 503 (the bulkhead contract — independent of health)
`NewCircuitBreakingLoadBalancer` (every target open)	Shed 503 (don't hammer a dead origin)

ActiveHealthCheck never adds an all-down override: when its gate marks every target down, each balancer falls back to its own policy above. The full failure-mode and all-down guide, the composition rules, and the observability hooks (OnRoundTrip → prom.Upstream(), OnStateChange → prom.UpstreamState()) live in the pkg/upstream package doc.

Traffic mirroring (shadowing)

mirror.New tees a copy of matched/sampled requests to a separate destination (a "canary") so you can exercise a new build with real production traffic. It is fire-and-forget: the primary request and its response are never affected — a mirror that is slow, queue-full, or panicking is dropped or recovered, never propagated. The canary's response is discarded.

mr := mirror.New()
mr.Match = func(r *http.Request) bool { return r.Method == http.MethodGet }
mr.SampleRate = 0.1                  // shadow 10% of matched requests
mr.Observe = prom.Mirror()           // optional outcome/latency metrics
mr.Use(upstream.SingleHost("canary:8080", &upstream.HTTPTransport{}))

s.Use(mr)                            // tees, then falls through to the real chain
s.Use(upstream.SingleHost("prod:8080", &upstream.HTTPTransport{}))

A fixed worker pool (Workers, default 8) bounds mirror concurrency; a full QueueSize queue drops rather than blocking. Effective concurrency is Workers, not Workers + QueueSize — the queue only absorbs bursts that drain at worker speed, so size Workers for the canary's latency. The request body is buffered up front (bounded by MaxBodyBytes) so the primary and the mirror read byte-identical bytes; an over-cap body skips the mirror (or set DisableBody to mirror with no body at all, for large-payload services). Each mirror runs on a detached context.Background() deadline (Timeout), so a client disconnect never cancels it. The mirrored request is marked (X-Mirror: 1 by default; override with MarkHeader/MarkValue, or DisableMark to go fully transparent) so the canary can no-op side effects. End-to-end credentials are replayed by design — use Match to exclude sensitive routes.

Observe is a func(mirror.MirrorInfo) — prom.Mirror() is one implementation, emitting parapet_mirror_total{outcome} (dispatched/completed/dropped_full/ dropped_oversize/panicked) and parapet_mirror_request_duration_seconds. A custom hook runs synchronously (on the request goroutine for dispatch/drop outcomes, on workers for completed/panicked), so keep it fast and concurrency-safe; Mirror.Stats() exposes the same counters without wiring Observe. All config and the destination chain are read once and frozen on the first request — set them before serving (a later Use is silently ignored). The pool has no Close and lives for the process lifetime, so construct one Mirror per destination and reuse it.

TLS and server configuration

Enable HTTPS by setting Server.TLSConfig to a *tls.Config (with Certificates or a GetCertificate callback for SNI/ACME). Serve/ListenAndServe then call ServeTLS, and the listen address defaults to :443 when Addr is empty. For dev or internal TLS, parapet.GenerateSelfSignCertificate(parapet.SelfSign{...}) builds a tls.Certificate (RSA-2048, 10-year default validity; each Hosts entry becomes an IP SAN if it parses as an IP, otherwise a DNS SAN).

s := parapet.NewFrontend()
cert, _ := parapet.GenerateSelfSignCertificate(parapet.SelfSign{Hosts: []string{"localhost", "127.0.0.1"}})
s.TLSConfig = &tls.Config{Certificates: []tls.Certificate{cert}}
// s.Addr == "" → serves HTTPS on :443

Graceful shutdown. ListenAndServe traps SIGTERM and drains in-flight requests. WaitBeforeShutdown (default 10 s) sleeps first — so load balancers notice the instance leaving — then GraceTimeout (default 30 s) bounds the drain; setting GraceTimeout <= 0 disables the built-in SIGTERM handling. Register cleanup with Server.RegisterOnShutdown(func()) (safe to call concurrently, runs each callback once); it's the seam healthz and ActiveHealthCheck use to deregister during the drain.

Other tunables. ReadTimeout, ReadHeaderTimeout, WriteTimeout, IdleTimeout, and MaxHeaderBytes map onto the embedded http.Server, while TCPKeepAlivePeriod is applied to accepted connections at the listener. The constructors pick role-appropriate defaults — e.g. NewFrontend sets a 10 s ReadHeaderTimeout and 1 m read/write timeouts. Set Server.ReusePort = true to bind the listener with SO_REUSEPORT for zero-downtime restarts and cross-process load distribution. All Server fields must be set before serving.

Health checks

healthz serves both Kubernetes probes from one endpoint: GET <Path> is liveness (driven by Set(bool)), and GET <Path>?ready=1 is readiness (driven by SetReady(bool)). Readiness automatically returns 503 once the parapet.Server enters graceful shutdown — so a load balancer drains the instance — while liveness stays 200 to avoid a restart mid-drain. Path defaults to /healthz, and Set/SetReady let background checks flip the flags (fail readiness while warming up, fail liveness when a dependency dies).

By default the endpoint only answers when the Host header is an IP (suiting kubelet/LB probes that address the pod by IP) and passes hostname-addressed requests through to the next handler; set Host = true to answer those too.

Request IDs

requestid injects an X-Request-Id (set Header to use another key), writes it on both the forwarded request and the response, and records it as the requestId field in the access log. New() defaults TrustProxy = true, reusing a valid incoming ID for trace continuity — but an incoming ID is accepted only if it passes a conservative charset check and is ≤ 128 bytes, otherwise a fresh UUIDv4 replaces it. At the edge, set TrustProxy = false so clients can't spoof or poison the ID that flows into your logs and upstreams.

Logging

logger emits one structured JSON record per request. logger.Stdout() and logger.Stderr() are the ready-made constructors (or set Logger.Writer to any io.Writer); OmitEmpty drops empty fields and is on by default for Stdout/Stderr but off for a bare Logger{}. Downstream handlers enrich the record with logger.Set(r.Context(), "userID", id) (read back with logger.Get) — a no-op when no Logger is mounted upstream. A client-cancelled request is logged with the synthetic status 499, and logger.Disable() silences logging for a route (as the health-check block in the example does).

Trusted proxies

Parapet only reads X-Forwarded-* and X-Real-IP when the connection comes from a trusted CIDR. Configure trust with TrustCIDRs(...) or accept the defaults from Trusted() (standard private and loopback ranges). Servers created with NewFrontend() start with no trusted proxies by default.

PROXY protocol

An L4 load balancer (AWS NLB, HAProxy in TCP mode, …) terminates the TCP connection, so without help the proxy sees the balancer's IP, not the client's — and an L4 balancer adds no X-Forwarded-For. The proxyprotocol package reads the HAProxy PROXY protocol header (v1 and v2) that such balancers prepend and rewrites the connection's RemoteAddr to the real client, so ratelimit, waf, logger, and the trust logic above all see the right address. Mount it with Server.ModifyConnection; the header is parsed lazily on the connection's first read, off the accept loop.

// Only the listed CIDRs (your load balancer) may set a client address; a direct
// connection from outside them is passed through untouched and cannot spoof one.
pp := proxyprotocol.New("10.0.0.0/8")

s := parapet.NewFrontend()
s.ModifyConnection(pp.ModifyConnection)

Set Require to reject a trusted connection that arrives without a PROXY header (use it when every connection from the balancer is guaranteed to carry one); by default such a connection is served with its real peer address. HeaderTimeout (default 10 s; negative disables) bounds how long a trusted connection has to deliver its header, so a stalled connection can't hold a slot during the parse.

⚠️ proxyprotocol.New() with no CIDRs trusts every peer — any client could then spoof its address via a PROXY header — so it is safe only when the listener is reachable exclusively through the load balancer. An invalid CIDR string panics at startup.

Performance tuning

Server.ShareProtoSlice makes that server's proxy write a single shared []string for X-Forwarded-Proto instead of allocating a fresh slice per request, saving one allocation on every request that sets the header (~16% on the distrust path in the proxy benchmarks). It is off by default and scoped to the server.

This is unsafe if any middleware mutates the X-Forwarded-Proto value slice in place — e.g. headers.MapRequest("X-Forwarded-Proto", …) or code doing r.Header["X-Forwarded-Proto"][0] = … — because the mutation would corrupt the shared slice for every subsequent request. Appending (headers.AddRequest) is safe. Enable it only if you control the whole middleware chain, and set it before serving:

s := parapet.NewBackend()
s.ShareProtoSlice = true

The hsts and authn middlewares expose the same opt-in for their fixed response headers via a ShareValueSlice field (off by default), sharing one Strict-Transport-Security / WWW-Authenticate slice across requests:

hsts.HSTS{MaxAge: 365 * 24 * time.Hour, ShareValueSlice: true}

The same caveat applies — only enable it when nothing in the chain mutates that response header's value slice in place.

Prometheus metrics

prom collects into a shared registry and exposes it two ways: mount prom.Handler() on a route, or run prom.Start(addr) as a standalone scrape server. prom.Registry() returns the registry and prom.Namespace (default parapet) is the prefix on every series name.

The three server-wide collectors instrument the whole chain. prom.Requests() is a middleware; prom.Connections and prom.Networks take the *Server (they wire ConnState/ModifyConnection, so call them rather than Use):

s := parapet.NewFrontend()
s.Use(prom.Requests())   // parapet_requests{host,status,method}
prom.Connections(s)      // parapet_connections{state}
prom.Networks(s)         // parapet_network_request_bytes / _response_bytes

// expose them
{
    l := location.Exact("/metrics")
    l.Use(prom.Handler())
    s.Use(l)
}
// …or: go prom.Start(":9187")

Every observable feature exposes a hook you assign a prom.* adapter to:

Wire	Metrics
`up.OnRoundTrip = prom.Upstream()`	`upstream_requests{host,status}`, `upstream_request_duration_seconds{host}`, `upstream_fast_rejects_total{host}`
`lb.OnStateChange = prom.UpstreamState()`	`upstream_state_transitions_total`, `upstream_breaker_state`, `upstream_probe_down_total{host,cause}`
`prom.UpstreamInflight(lb)` / `lb.OnShed = prom.UpstreamShed()`	`upstream_inflight{host}` + `_capacity{host}`, `upstream_shed_total{reason}`
`rl.Observe = prom.RateLimit()` / `strategy.OnError = prom.RateLimitRedisError()`	`ratelimit_total{name,result}`, `ratelimit_redis_errors_total`
`cache.Options{OnResult: prom.Cache()}`	`cache_total{host,result}`, `cache_fill_duration_seconds{host}`
`w.Observe = prom.WAF()`	`waf_eval_duration_seconds{outcome}`
`mr.Observe = prom.Mirror()`	`mirror_total{outcome}`, `mirror_request_duration_seconds`

All series carry the prom.Namespace prefix (shown unprefixed above).

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 313 Commits
.github/workflows		.github/workflows
pkg		pkg
.gitignore		.gitignore
.golangci.yaml		.golangci.yaml
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
cert.go		cert.go
cert_test.go		cert_test.go
example_test.go		example_test.go
go.mod		go.mod
go.sum		go.sum
handler.go		handler.go
listener.go		listener.go
middleware.go		middleware.go
middleware_bench_test.go		middleware_bench_test.go
middleware_test.go		middleware_test.go
new.go		new.go
new_test.go		new_test.go
proxy.go		proxy.go
proxy_bench_test.go		proxy_bench_test.go
proxy_internal_test.go		proxy_internal_test.go
server.go		server.go
server_shutdown_test.go		server_shutdown_test.go
server_test.go		server_test.go

Folders and files

Latest commit

History

Repository files navigation

parapet

Install

Concepts

Packages

Example

Rate limiting

Distributed (Redis-backed) rate limiting

Compression

WAF with CEL rules

Authentication

JWT (bearer tokens)

Rotating keys from a remote JWKS

Basic authentication

Forward authentication (external auth server)

Response caching

Forcing caching for an origin you don't control

Stale serving (RFC 5861)

Purging

Observability

Weighted and least-connection load balancing

Load balancing with passive health checks

Upstream transports and routing

Retries

Hedging (speculative retry)

Active health checks

Request timeouts

Choosing a reliability primitive

Traffic mirroring (shadowing)

TLS and server configuration

Health checks

Request IDs

Logging

Trusted proxies

PROXY protocol

Performance tuning

Prometheus metrics

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 52

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages