Cherry-pick: Add OpenTelemetry metrics for package resources size (#1018) #1052

Merged
efiacor merged 2 commits into kptdev:1.5 from Nordix:feature-resources-size-metrics-cherry-pick
Jun 17, 2026

@JamesMcDermott JamesMcDermott commented Jun 16, 2026

Contributor

Title

Cherry-pick to 1.5:


Description

  • What changed:
  • Why it’s needed:
  • How it works:

Related Issue(s)

  • Closes/Fixes #

Type of Change

  • Cherry-pick
  • Bug fix
  • New feature
  • Enhancement
  • Refactor
  • Documentation
  • Tests
  • Other: ________

Checklist

  • Code follows project style guidelines
  • Self-reviewed changes
  • Tests added/updated
  • Documentation added/updated
  • All tests and gating checks pass

Testing Instructions (Optional)


Additional Notes (Optional)

  • Known issues:
  • Further improvements:
  • Review notes:

AI Disclosure

  • I have used AI in the creation of this PR.

Copilot AI review requested due to automatic review settings June 16, 2026 15:00
@dosubot dosubot Bot added the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Jun 16, 2026

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds OpenTelemetry-based telemetry and a Prometheus/Grafana monitoring stack, including new package-size metrics and E2E validations.

Changes:

  • Introduces internal/telemetry for OTel setup, Prometheus scraping endpoint, and package-size metric instruments.
  • Records package revision resource-size metrics from DB cache flows and updates E2E tests to assert presence/values.
  • Adds monitoring deployment assets (Prometheus, Grafana, dashboards) plus a helper script to deploy them.

Reviewed changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| test/e2e/suiteutils/suite_utils.go | Adds Prometheus text parsing utilities for collected metrics so tests can assert specific label/value pairs. |
| test/e2e/suiteutils/suite.go | Simplifies DB cache env detection and makes delete ignore NotFound. |
| test/e2e/api/metrics_test.go | Updates metrics endpoint assertions and adds new tests for package-size metrics and values. |
| scripts/deploy-monitoring.sh | New helper to render/apply monitoring manifests and port-forward Prometheus/Grafana. |
| pkg/cli/commands/rpkg/docs/docs.go | Fixes grammar in CLI docs strings. |
| pkg/cli/commands/rpkg/approve/command_test.go | Clarifies test comment casing. |
| pkg/cache/dbcache/dbreposync.go | Records package resource-size metrics during external PR caching. |
| pkg/cache/dbcache/dbrepository.go | Records package resource-size metrics after closing a draft. |
| pkg/cache/dbcache/dbpackagerevisionresourcessql.go | Records resource-size=0 metric on successful resource deletion. |
| internal/telemetry/otel.go | New unified OTel setup with optional Prometheus HTTP server and lifecycle management. |
| internal/telemetry/metrics.go | New OTel instruments + recording helper for package revision resource size. |
| internal/telemetry/metrics_test.go | Unit tests for metric recording behavior and nil-instrument safety. |
| internal/otel/otel_test.go | Updates tests for new OTel behavior, but file location/package appear inconsistent with new code location. |
| internal/otel/otel.go | Removes old OTel setup implementation. |
| go.mod | Promotes OTel prometheus exporter + metric module to direct deps. |
| func/wrapper-server/main.go | Switches to new telemetry setup and ensures graceful shutdown. |
| func/server/server.go | Switches to new telemetry setup and ensures graceful shutdown. |
| controllers/main.go | Switches to new telemetry setup and ensures graceful shutdown. |
| cmd/porch/main.go | Switches to new telemetry setup and ensures graceful shutdown. |
| docs/content/en/docs/6_configuration_and_deployments/configurations/opentelemetry.md | Documents new metrics and their Prometheus names/labels. |
| deployments/porch/9-controllers.yaml | Adds a Service to expose controller metrics on 9464; trims trailing whitespace. |
| deployments/porch/3-porch-server.yaml | Exposes metrics port and adds a startup probe; adds service port for metrics. |
| deployments/porch/2-function-runner.yaml | Exposes metrics port via Service. |
| deployments/metrics/prometheus-deployment.yaml | Adds prometheus deployment + RBAC + service for monitoring stack. |
| deployments/metrics/grafana-deployment.yaml | Adds grafana deployment + provisioning config + admin secret for monitoring stack. |
| deployments/metrics-resources/prometheus-config.yaml | Prometheus scrape config for Porch components. |
| deployments/metrics-resources/grafana-package-sizes-dashboard.json | Grafana dashboard for package resource size gauge. |
| deployments/function-pods/deployment.yaml | Enables Prometheus exporter env + exposes metrics port for wrapper-server function pods. |
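
The summary for test/e2e/suiteutils/suite_utils.go mentions parsing Prometheus text output so that tests can assert specific label/value pairs. The following is a minimal sketch of that idea using only the Go standard library, not the PR's actual helper. The names `parseSample` and `findValue`, the label names `repository` and `package`, and the sample values are all hypothetical. The parser is deliberately simplified: it ignores timestamps and does not handle commas or escaped quotes inside label values.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sample is one parsed line of Prometheus text exposition output.
type sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// parseSample parses a line of the form `name{k="v",...} value`.
// Comment lines (starting with '#') and blank lines are skipped.
func parseSample(line string) (sample, bool) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") {
		return sample{}, false
	}
	// The value is taken to be the last space-separated field.
	idx := strings.LastIndex(line, " ")
	if idx < 0 {
		return sample{}, false
	}
	value, err := strconv.ParseFloat(line[idx+1:], 64)
	if err != nil {
		return sample{}, false
	}
	head := strings.TrimSpace(line[:idx])
	s := sample{Name: head, Labels: map[string]string{}, Value: value}
	open := strings.Index(head, "{")
	if open < 0 {
		return s, true // no label set
	}
	if !strings.HasSuffix(head, "}") {
		return sample{}, false
	}
	s.Name = head[:open]
	for _, pair := range strings.Split(head[open+1:len(head)-1], ",") {
		if pair == "" {
			continue // tolerate a trailing comma
		}
		k, v, ok := strings.Cut(pair, "=")
		if !ok {
			return sample{}, false
		}
		s.Labels[strings.TrimSpace(k)] = strings.Trim(strings.TrimSpace(v), "\"")
	}
	return s, true
}

// findValue returns the value of the first sample with the given name
// whose labels include every key/value pair in want.
func findValue(text, name string, want map[string]string) (float64, bool) {
	for _, line := range strings.Split(text, "\n") {
		s, ok := parseSample(line)
		if !ok || s.Name != name {
			continue
		}
		match := true
		for k, v := range want {
			if s.Labels[k] != v {
				match = false
				break
			}
		}
		if match {
			return s.Value, true
		}
	}
	return 0, false
}

func main() {
	// Hypothetical scrape output; label names and values are illustrative only.
	text := "# TYPE porch_package_size_bytes histogram\n" +
		"porch_package_size_bytes_count{repository=\"demo-repo\",package=\"demo-pkg\"} 3\n"
	v, ok := findValue(text, "porch_package_size_bytes_count",
		map[string]string{"package": "demo-pkg"})
	fmt.Println(v, ok)
}
```

Matching on a subset of labels keeps assertions stable if the instrumented code later adds further labels to the same series.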

Comment thread internal/telemetry/otel.go
Comment thread internal/telemetry/otel.go
Comment thread internal/telemetry/otel.go
Comment thread scripts/deploy-monitoring.sh
Comment thread deployments/metrics/prometheus-deployment.yaml
* Feat: Add OpenTelemetry metrics for package resources size

- plus enough Prometheus monitoring stack to make it manually testable
  - picked from changes in WIP kptdev#561
- new histogram- and gauge-type metrics
  - available in e.g. Prometheus as:
    - porch_package_size_bytes_bucket
    - porch_package_size_bytes_count
    - porch_package_size_bytes_sum
    - porch_package_size_bytes_total
  - recorded in Porch flows that update package revision resources:
    - create package revision
    - delete package revision
    - discover/sync package revisions from a registered repository
    - delete package revisions on unregistering a repository
    - direct update of PackageRevisionResources in rpkg push
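
The `_bucket`, `_count` and `_sum` names above follow the usual Prometheus histogram exposition: one cumulative counter per upper bound, plus a total observation count and a running sum. A minimal standard-library sketch of that bookkeeping follows. It is not the PR's code, which per the review summary uses OTel instruments, and the bucket boundaries and observed sizes are hypothetical.

```go
package main

import "fmt"

// histogram mirrors how a Prometheus-style histogram is exposed:
// cumulative bucket counts, a total count, and a sum of observations.
type histogram struct {
	bounds  []float64 // ascending upper bounds; an implicit +Inf bucket follows
	buckets []uint64  // cumulative counts, len(bounds)+1 entries
	count   uint64
	sum     float64
}

func newHistogram(bounds []float64) *histogram {
	return &histogram{bounds: bounds, buckets: make([]uint64, len(bounds)+1)}
}

// observe records one value. Buckets are cumulative, so every bucket
// whose upper bound is >= v is incremented, and +Inf always is.
func (h *histogram) observe(v float64) {
	for i, b := range h.bounds {
		if v <= b {
			h.buckets[i]++
		}
	}
	h.buckets[len(h.bounds)]++
	h.count++
	h.sum += v
}

func main() {
	// Hypothetical boundaries (bytes) and package sizes.
	h := newHistogram([]float64{1024, 65536})
	for _, size := range []float64{512, 4096, 100000} {
		h.observe(size)
	}
	for i, b := range h.bounds {
		fmt.Printf("porch_package_size_bytes_bucket{le=\"%g\"} %d\n", b, h.buckets[i])
	}
	fmt.Printf("porch_package_size_bytes_bucket{le=\"+Inf\"} %d\n", h.buckets[len(h.bounds)])
	fmt.Printf("porch_package_size_bytes_count %d\n", h.count)
	fmt.Printf("porch_package_size_bytes_sum %g\n", h.sum)
}
```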

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Update docs to mention new package resources size metrics

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 2

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 3

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* comment nitpick to retrigger CI

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 4

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Fix failing unit test

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 5

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Nitpick to retrigger CI

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 6

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Introduce retrigger.txt for easier CI retriggering

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address copilot review comments part 7

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 8

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 9

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* retrigger

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 10

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* retrigger

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 11

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address review comment

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 12

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

---------

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics-cherry-pick branch from 5ec6aa9 to 88af3f0 Compare June 16, 2026 15:17
Copilot AI review requested due to automatic review settings June 16, 2026 15:58

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 8 comments.

Comment thread internal/telemetry/otel.go
Comment thread scripts/deploy-monitoring.sh
Comment thread internal/telemetry/metrics.go
Comment thread internal/telemetry/metrics.go
Comment thread test/e2e/api/metrics_test.go Outdated
Comment thread test/e2e/api/metrics_test.go
Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics-cherry-pick branch from 542d4b8 to 77004bc Compare June 16, 2026 16:15
@sonarqubecloud

@dosubot dosubot Bot added the lgtm label Jun 17, 2026
@efiacor efiacor merged commit 37e6cbb into kptdev:1.5 Jun 17, 2026
21 checks passed
@efiacor efiacor deleted the feature-resources-size-metrics-cherry-pick branch June 17, 2026 09:06

Labels

lgtm, size:XL (This PR changes 500-999 lines, ignoring generated files.)

4 participants