reverseproxy: Add per-upstream metrics #7574

Open
simonhammes wants to merge 3 commits into caddyserver:master from simonhammes:reverse-proxy-metrics

@simonhammes

This PR implements per-upstream metrics for reverse_proxy.

Fixes #4140

Assistance Disclosure

The code was generated by OpenCode (GLM-5). I manually verified the correctness.

@CLAassistant

CLAassistant commented Mar 16, 2026

CLA assistant check
All committers have signed the CLA.

Comment on lines +138 to +141
// Guard for test cases that bypass Provision()
if reverseProxyMetrics.upstreamRequests == nil {
	return
}
Author

Not sure whether this is the best approach?

TestDialErrorBodyRetry fails with a SEGV if I remove this check.
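
To make the trade-off in this question concrete, here is a minimal sketch comparing the nil guard from the diff with lazy initialization on first use. All names in it (`counterVec`, `metricsSet`, `recordGuarded`, `recordLazy`) are hypothetical stand-ins and not Caddy's or the Prometheus client's actual types.

```go
package main

import (
	"fmt"
	"sync"
)

// counterVec is a hypothetical stand-in for a labeled counter.
// It is NOT the Prometheus client type.
type counterVec struct {
	mu     sync.Mutex
	counts map[string]int
}

// inc increments the counter for one upstream. Calling it on a nil
// *counterVec dereferences a nil pointer and panics.
func (c *counterVec) inc(upstream string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[upstream]++
}

// metricsSet mimics a metrics struct that is normally filled in
// during provisioning (assumption for illustration).
type metricsSet struct {
	once             sync.Once
	upstreamRequests *counterVec
}

// recordGuarded follows the approach in the diff: do nothing when
// the metrics were never initialized.
func (m *metricsSet) recordGuarded(upstream string) {
	if m.upstreamRequests == nil {
		return
	}
	m.upstreamRequests.inc(upstream)
}

// recordLazy is one alternative: initialize exactly once on first use.
func (m *metricsSet) recordLazy(upstream string) {
	m.once.Do(func() {
		if m.upstreamRequests == nil {
			m.upstreamRequests = &counterVec{counts: map[string]int{}}
		}
	})
	m.upstreamRequests.inc(upstream)
}

func main() {
	var unprovisioned metricsSet
	unprovisioned.recordGuarded("localhost:8000") // no-op, no panic
	fmt.Println(unprovisioned.upstreamRequests == nil)

	var lazy metricsSet
	lazy.recordLazy("localhost:8000")
	lazy.recordLazy("localhost:8000")
	fmt.Println(lazy.upstreamRequests.counts["localhost:8000"])
}
```

The guard keeps production code unchanged and silently drops metrics in tests that skip provisioning. Lazy initialization would record them, but in the real handler it would presumably also need access to a metrics registry. A third option is to have the affected test call Provision() itself.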

@mohammed90
Member

Have you run the result? Can you share the resulting metrics output?

@simonhammes
Author

> Have you run the result? Can you share the resulting metrics output?

Yes:

# HELP caddy_reverse_proxy_upstream_request_duration_seconds Histogram of request durations to upstreams.
# TYPE caddy_reverse_proxy_upstream_request_duration_seconds histogram
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="0.005"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="0.01"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="0.025"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="0.05"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="0.1"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="0.25"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="0.5"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="1"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="2.5"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="5"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="10"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8000",le="+Inf"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_sum{upstream="localhost:8000"} 0.007529978
caddy_reverse_proxy_upstream_request_duration_seconds_count{upstream="localhost:8000"} 3
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="0.005"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="0.01"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="0.025"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="0.05"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="0.1"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="0.25"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="0.5"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="1"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="2.5"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="5"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="10"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_bucket{upstream="localhost:8001",le="+Inf"} 5
caddy_reverse_proxy_upstream_request_duration_seconds_sum{upstream="localhost:8001"} 0.011760646999999999
caddy_reverse_proxy_upstream_request_duration_seconds_count{upstream="localhost:8001"} 5
# HELP caddy_reverse_proxy_upstream_requests_total Counter of requests made to upstreams.
# TYPE caddy_reverse_proxy_upstream_requests_total counter
caddy_reverse_proxy_upstream_requests_total{code="404",method="GET",upstream="localhost:8000"} 3
caddy_reverse_proxy_upstream_requests_total{code="404",method="GET",upstream="localhost:8001"} 5
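
As a reading aid for the output above: dividing a histogram's `_sum` by its `_count` gives the mean request duration per upstream. The sketch below does that arithmetic for the two upstreams shown; the helper name `meanMillis` is an assumption made for illustration.

```go
package main

import "fmt"

// meanMillis returns the average observed duration in milliseconds,
// given a histogram's _sum (in seconds) and its _count.
func meanMillis(sumSeconds float64, count uint64) float64 {
	if count == 0 {
		return 0 // avoid dividing by zero when nothing was observed
	}
	return sumSeconds / float64(count) * 1000
}

func main() {
	// Values copied from the _sum and _count lines of the output.
	fmt.Printf("localhost:8000: %.2f ms\n", meanMillis(0.007529978, 3))
	fmt.Printf("localhost:8001: %.2f ms\n", meanMillis(0.011760646999999999, 5))
	// prints:
	// localhost:8000: 2.51 ms
	// localhost:8001: 2.35 ms
}
```

Both means are under 5 ms, which is consistent with every bucket, including `le="0.005"`, already containing all observations: Prometheus histogram buckets are cumulative.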

I just noticed that I should probably add the code and method labels to the caddy_reverse_proxy_upstream_request_duration_seconds metric to be consistent with the existing caddy_http_request_duration_seconds metric

@simonhammes
Author

> I just noticed that I should probably add the code and method labels to the caddy_reverse_proxy_upstream_request_duration_seconds metric to be consistent with the existing caddy_http_request_duration_seconds metric

Done: fba858e

const ns, sub = "caddy", "reverse_proxy"

upstreamsLabels := []string{"upstream"}
upstreamRequestLabels := []string{"upstream", "code", "method"}
Author

Next question: Do we also want a server label? 🤷

Author

And host if per_host is enabled?

@simonhammes
Author

> nit: if Upstream.String() ever gains per-request noise, label cardinality could spike.

Good point, this could indeed cause issues in environments with frequently changing IP addresses (e.g. K8S).

This is the result of using a dynamic $host:

caddy_reverse_proxy_upstream_requests_total{code="200",method="GET",upstream="172.19.0.2:80"} 6
caddy_reverse_proxy_upstream_requests_total{code="200",method="GET",upstream="172.19.0.3:80"} 2
caddy_reverse_proxy_upstream_requests_total{code="200",method="GET",upstream="172.19.0.4:80"} 5

Maybe the "new" metrics (or just the upstream label?) could be gated behind a configuration option instead?
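
A rough sketch of what gating the upstream label could look like, and of its effect on the number of series. The option name `PerUpstream` and the helpers `upstreamLabelValue` and `countSeries` are hypothetical; nothing in this thread settles the actual configuration surface.

```go
package main

import "fmt"

// labelConfig is a hypothetical option set for illustration only.
type labelConfig struct {
	PerUpstream bool // when true, use the upstream address as a label value
}

// upstreamLabelValue returns the value for the "upstream" label.
// With per-upstream labeling disabled, every request maps to the same
// empty value, so the series count stops growing with the number of
// distinct upstream addresses.
func upstreamLabelValue(cfg labelConfig, addr string) string {
	if cfg.PerUpstream {
		return addr
	}
	return ""
}

// seriesKey identifies one counter series by its label values.
type seriesKey struct{ upstream, code, method string }

// countSeries reports how many distinct series a sequence of
// successful GET requests to the given upstreams would create.
func countSeries(cfg labelConfig, addrs []string) int {
	seen := map[seriesKey]struct{}{}
	for _, a := range addrs {
		seen[seriesKey{upstreamLabelValue(cfg, a), "200", "GET"}] = struct{}{}
	}
	return len(seen)
}

func main() {
	// Addresses taken from the output above; the first one repeats.
	addrs := []string{"172.19.0.2:80", "172.19.0.3:80", "172.19.0.4:80", "172.19.0.2:80"}
	fmt.Println(countSeries(labelConfig{PerUpstream: true}, addrs))  // 3
	fmt.Println(countSeries(labelConfig{PerUpstream: false}, addrs)) // 1
}
```

The same multiplication applies to any further label, such as the server or host labels asked about earlier: each one multiplies the series count by its number of distinct values.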

@francislavoie
Member

That comment was from an AI agent that spammed hundreds of repos. We blocked and reported. But make of it as you will.

@simonhammes
Author

> That comment was from an AI agent that spammed hundreds of repos. We blocked and reported. But make of it as you will.

Thanks for letting me know, that's what I assumed. I still think there's some truth to the (albeit AI-generated) comment.

Development

Successfully merging this pull request may close these issues.

Prometheus Metrics for reverse_proxy upstreams
