
feat: add version labels to Temporal Prometheus metrics #1083

Open
anbarasantr wants to merge 8 commits into main from feat/temporal-prometheus-metrics


@anbarasantr
Contributor

Summary

  • Adds service_version and sdk_version as global tags to all Temporal worker Prometheus metrics via TelemetryConfig.global_tags
  • service_version comes from OTEL_SERVICE_VERSION env var (default 0.1.0) — represents the app/connector version per deployment
  • sdk_version comes from application_sdk.version.__version__ (2.4.1) — the SDK package version
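
A minimal sketch of how the two tags could be assembled and handed to the Temporal Python SDK runtime. The helper names `build_global_tags` and `build_runtime` are hypothetical, and `SDK_VERSION` is a stand-in for `application_sdk.version.__version__`; the actual code in application_sdk/clients/temporal.py may be structured differently.

```python
import os
from typing import Mapping, Optional

# Assumption: stand-in for application_sdk.version.__version__.
SDK_VERSION = "2.4.1"


def build_global_tags(
    env: Optional[Mapping[str, str]] = None,
    sdk_version: str = SDK_VERSION,
) -> dict:
    """Return the labels to attach to every Temporal worker metric."""
    env = os.environ if env is None else env
    return {
        # Per-deployment app/connector version, defaulting to 0.1.0.
        "service_version": env.get("OTEL_SERVICE_VERSION", "0.1.0"),
        # Version of the SDK package itself.
        "sdk_version": sdk_version,
    }


def build_runtime(bind_address: str = "0.0.0.0:9464"):
    """Create a Temporal Runtime whose Prometheus metrics carry the tags."""
    # Imported lazily so build_global_tags() works without temporalio installed.
    from temporalio.runtime import PrometheusConfig, Runtime, TelemetryConfig

    return Runtime(
        telemetry=TelemetryConfig(
            metrics=PrometheusConfig(bind_address=bind_address),
            global_tags=build_global_tags(),
        )
    )
```

The resulting runtime would then be passed to the Temporal client connection (for example via its `runtime` argument) so that workers created from that client export the tagged metrics.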

Why

The Temporal Prometheus /metrics endpoint on :9464 currently exposes no version information on any metric, which makes it impossible to filter or alert by app version in dashboards.

What changes

Only application_sdk/clients/temporal.py changes: it now imports SERVICE_VERSION and SDK_VERSION and passes global_tags to TelemetryConfig. The change is fully backward compatible, since it only adds new labels to existing metrics.

Verified

Tested locally with Temporal test server. All 249 metric lines now carry sdk_version and service_version labels:

temporal_activity_execution_latency_bucket{activity_type="say_hello",namespace="default",service_name="temporal-core-sdk",task_queue="test-queue",sdk_version="2.4.1",service_version="0.1.0",le="50"} 1
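
For a quick sanity check, the sample line above can be split into its labels with a few lines of Python. This is a rough sketch using a regular expression rather than a full Prometheus exposition-format parser; `parse_labels` is a hypothetical helper, not part of this PR.

```python
import re

# Matches name="value" pairs; escaped characters inside values are kept as-is.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_labels(line: str) -> dict:
    """Extract label key/value pairs from one Prometheus exposition line."""
    start, end = line.find("{"), line.rfind("}")
    if start == -1 or end == -1:
        return {}  # metric line without a label set
    return dict(LABEL_RE.findall(line[start + 1 : end]))


sample = (
    'temporal_activity_execution_latency_bucket{activity_type="say_hello",'
    'namespace="default",service_name="temporal-core-sdk",'
    'task_queue="test-queue",sdk_version="2.4.1",service_version="0.1.0",'
    'le="50"} 1'
)

labels = parse_labels(sample)
print(labels["sdk_version"], labels["service_version"])
```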

Test plan

  • All 15 existing test_temporal_client.py unit tests pass
  • Verified /metrics output with Temporal test server shows version labels
  • Deploy to staging and verify with curl http://<pod>:9464/metrics | grep service_version
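
The staging check in the last item could also be scripted instead of eyeballing curl output. The sketch below is one possible approach using only the standard library; `fetch_metrics` and `lines_missing_labels` are hypothetical helpers, and the substring match is deliberately crude.

```python
import urllib.request

# Label names every metric line is expected to carry after this change.
REQUIRED = ("service_version=", "sdk_version=")


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the raw Prometheus exposition text from a /metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def lines_missing_labels(text: str, required=REQUIRED) -> list:
    """Return the metric lines that lack at least one required label."""
    missing = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if not all(label in line for label in required):
            missing.append(line)
    return missing
```

Against a running pod this would be called as `lines_missing_labels(fetch_metrics("http://<pod>:9464/metrics"))`, with an empty list meaning every metric line carries both labels.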

🤖 Generated with Claude Code

anbarasantr and others added 8 commits March 2, 2026 18:52
Expose ~40 built-in Temporal SDK metrics (activity/workflow latencies,
failures, task slot usage, gRPC request stats) via a Prometheus endpoint
on port 9464. This is always-on with no configuration required.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use ATLAN_TEMPORAL_PROMETHEUS_BIND_ADDRESS env var with default
"0.0.0.0:9464", following the same convention as other SDK constants.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Follow the same convention as WORKFLOW_HOST/WORKFLOW_PORT:
- ATLAN_TEMPORAL_PROMETHEUS_HOST (default: 0.0.0.0)
- ATLAN_TEMPORAL_PROMETHEUS_PORT (default: 9464)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Host is always 0.0.0.0. Only the port needs to be configurable.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Single env var override for the full bind address (host:port).
Default: 0.0.0.0:9464

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
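
As an illustration of the single-variable override described in the commit message above, the bind address could be resolved roughly as follows. The variable name is assumed to be the ATLAN_TEMPORAL_PROMETHEUS_BIND_ADDRESS introduced in an earlier commit, `get_bind_address` is a hypothetical helper, and the host:port validation is an addition for the sketch, not something the PR is known to do.

```python
import os
from typing import Mapping, Optional

# Assumption: the final env var name matches the one from the earlier commit.
ENV_VAR = "ATLAN_TEMPORAL_PROMETHEUS_BIND_ADDRESS"
DEFAULT_BIND_ADDRESS = "0.0.0.0:9464"


def get_bind_address(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the host:port the Prometheus endpoint should bind to."""
    env = os.environ if env is None else env
    address = env.get(ENV_VAR, DEFAULT_BIND_ADDRESS)
    # Split on the last colon so the port is always the final component.
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdecimal() or not 0 < int(port) < 65536:
        raise ValueError(f"{ENV_VAR} must look like host:port, got {address!r}")
    return address
```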
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use TelemetryConfig global_tags to expose version labels on all Temporal
worker metrics at the /metrics Prometheus endpoint. This enables version
tracking in dashboards and alerting without breaking existing queries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@snykgituser

snykgituser commented Mar 4, 2026

Snyk checks have passed. No issues have been found so far.

| Scanner | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|
| Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| Licenses | 0 | 0 | 0 | 0 | 0 issues |
| Code Security | 0 | 0 | 0 | 0 | 0 issues |



2 participants