Add OpenTelemetry metrics for package resources size #1018

Merged: efiacor merged 21 commits into kptdev:main from Nordix:feature-resources-size-metrics on Jun 15, 2026
Conversation

@JamesMcDermott JamesMcDermott commented Jun 2, 2026

Contributor

Title

Feat: Add OpenTelemetry metrics for package resources size


Description

  • What changed:
    • new OpenTelemetry histogram- and gauge-type metrics that track package revision resource sizes
    • a minimal Prometheus monitoring stack for manual testing
  • Why it's needed:
    • provides observability into package resource sizes across Porch operations, enabling monitoring and alerting on package growth
  • How it works:
    • The new metrics (porch_package_size_bytes_bucket, porch_package_size_bytes_count, porch_package_size_bytes_sum, porch_package_size_bytes_total) are recorded in the Porch flows that update package revision resources:
      • create
      • delete
      • discover/sync from a registered repository
      • delete on unregistering a repository
      • direct update of PackageRevisionResources in rpkg push
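
As an illustration of why one histogram appears under the _bucket, _count and _sum names, here is a minimal stdlib-only Go sketch of cumulative histogram bookkeeping. It does not use the OpenTelemetry SDK and is not the code from this PR: the sizeHistogram type, its methods and the bucket boundaries are assumptions chosen for the example, and the _total series is not modeled.

```go
package main

import (
	"fmt"
	"sort"
)

// sizeHistogram is a hypothetical stand-in for a histogram instrument.
// It tracks the three values a Prometheus histogram exposes:
// cumulative bucket counts, the observation count, and the sum.
type sizeHistogram struct {
	bounds  []float64 // upper bounds in bytes, ascending (illustrative values)
	buckets []uint64  // buckets[i] counts observations <= bounds[i]
	count   uint64
	sum     float64
}

// newSizeHistogram copies and sorts the bounds so callers cannot mutate them.
func newSizeHistogram(bounds []float64) *sizeHistogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &sizeHistogram{bounds: b, buckets: make([]uint64, len(b))}
}

// observe records one package revision's resource size in bytes.
func (h *sizeHistogram) observe(sizeBytes float64) {
	for i, le := range h.bounds {
		if sizeBytes <= le {
			h.buckets[i]++ // cumulative: every bucket whose bound covers the value
		}
	}
	h.count++
	h.sum += sizeBytes
}

func main() {
	h := newSizeHistogram([]float64{1024, 65536, 1048576})
	for _, s := range []float64{512, 2048, 70000} {
		h.observe(s)
	}
	// Print the series in a shape similar to the Prometheus text format.
	for i, le := range h.bounds {
		fmt.Printf("porch_package_size_bytes_bucket{le=\"%g\"} %d\n", le, h.buckets[i])
	}
	fmt.Printf("porch_package_size_bytes_bucket{le=\"+Inf\"} %d\n", h.count)
	fmt.Printf("porch_package_size_bytes_count %d\n", h.count)
	fmt.Printf("porch_package_size_bytes_sum %g\n", h.sum)
}
```

Because the buckets are cumulative, alerting on package growth can compare a bucket count against the total count rather than inspecting individual observations.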

Related Issue(s)


Type of Change

  • Bug fix
  • New feature
  • Enhancement
  • Refactor
  • Documentation
  • Tests
  • Other: ________

Checklist

  • Code follows project style guidelines
  • Self-reviewed changes
  • Tests added/updated
  • Documentation added/updated
  • All tests and gating checks pass

Testing Instructions (Optional)

  1. Deploy the monitoring stack using scripts/deploy-monitoring.sh
  2. The Prometheus GUI will then be available at http://localhost:9092
  3. Create and delete package revisions, and verify that the new porch_package_size_bytes_* metrics can be queried in Prometheus
  4. Run go test ./internal/telemetry/... ./test/e2e/api/ to execute the unit and e2e tests
  5. Verify that the porch_package_size_bytes_* metrics for the e2e tests' packages can be queried in Prometheus
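
The verification in steps 3 and 5 can also be approximated outside the GUI by scanning metrics text for the new series. The following Go sketch assumes the text is in the Prometheus text exposition format; the filterPackageSizeSeries helper and the sample data are made up for illustration and are not real Porch output.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// filterPackageSizeSeries returns the sample lines whose metric name starts
// with prefix, skipping blank lines and "# HELP" / "# TYPE" comment lines.
func filterPackageSizeSeries(exposition, prefix string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if strings.HasPrefix(line, prefix) {
			out = append(out, line)
		}
	}
	return out
}

// sampleExposition is invented data used only to exercise the filter.
const sampleExposition = `
# HELP porch_package_size_bytes Hypothetical help text
# TYPE porch_package_size_bytes histogram
porch_package_size_bytes_bucket{le="1024"} 1
porch_package_size_bytes_bucket{le="+Inf"} 2
porch_package_size_bytes_sum 2560
porch_package_size_bytes_count 2
go_goroutines 12
`

func main() {
	for _, line := range filterPackageSizeSeries(sampleExposition, "porch_package_size_bytes_") {
		fmt.Println(line)
	}
}
```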

Additional Notes (Optional)

  • Review notes: The internal/otel package has been refactored into internal/telemetry

AI Disclosure

  • I have used AI in the creation of this PR.

If so, please describe how:

  • Kiro/Amazon Q to:
    • generate the pull request description from the commit message
    • determine locations to add new metrics in code picked from PR 561
    • generate new unit tests to cover code updates
    • suggest a rearrangement of setupMetrics() in internal/telemetry/otel.go so that the Prometheus exporter setup is properly gated

Copilot AI review requested due to automatic review settings June 2, 2026 13:08
@netlify

netlify Bot commented Jun 2, 2026


Deploy Preview for kpt-porch ready!

Name Link
🔨 Latest commit cf910a7
🔍 Latest deploy log https://app.netlify.com/projects/kpt-porch/deploys/6a2fbf6ad5032c0008c90ec7
😎 Deploy Preview https://deploy-preview-1018--kpt-porch.netlify.app

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR introduces a new internal/telemetry package (replacing internal/otel) that provides a unified OpenTelemetry setup with proper lifecycle management, adds Prometheus-exporter support, and instruments package revision resource size as a new porch_package_size_bytes metric. Deployment manifests are extended with metrics ports/services and a new Prometheus/Grafana monitoring stack is added.

Changes:

  • New internal/telemetry package with SetupOpenTelemetry returning lifecycle-managed OTelResources and recording package size histograms/gauges.
  • Wiring of the new telemetry setup and deferred shutdown across cmd/porch, controllers, func/server, and func/wrapper-server, along with metric recording calls in dbcache.
  • Deployment additions: metrics ports/services on porch components, Prometheus/Grafana kpt package, and deploy/cleanup script; new e2e test for the package size metric.
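
The "setup returns lifecycle-managed resources, caller defers shutdown" pattern described above can be sketched roughly as follows. This is a stdlib-only illustration, not the PR's code: otelResources, setupTelemetry and run are hypothetical names, and the real SetupOpenTelemetry signature and OTelResources fields are not shown in this conversation.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// otelResources is a hypothetical stand-in for the value returned by the
// telemetry setup; it only collects shutdown functions.
type otelResources struct {
	shutdownFuncs []func(context.Context) error
}

// Shutdown runs the registered shutdown functions in reverse order of
// registration and joins any errors into a single error.
func (r *otelResources) Shutdown(ctx context.Context) error {
	var errs []error
	for i := len(r.shutdownFuncs) - 1; i >= 0; i-- {
		if err := r.shutdownFuncs[i](ctx); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

// setupTelemetry is a hypothetical setup function. It registers two fake
// components that append to events when they are stopped.
func setupTelemetry(events *[]string) *otelResources {
	return &otelResources{shutdownFuncs: []func(context.Context) error{
		func(context.Context) error { *events = append(*events, "meter provider stopped"); return nil },
		func(context.Context) error { *events = append(*events, "metrics server stopped"); return nil },
	}}
}

// run mimics a main function: set up telemetry, defer a bounded shutdown,
// then do the actual work.
func run(events *[]string) error {
	res := setupTelemetry(events)
	defer func() {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		defer cancel()
		if err := res.Shutdown(ctx); err != nil {
			*events = append(*events, "shutdown error: "+err.Error())
		}
	}()
	*events = append(*events, "serving")
	return nil
}

func main() {
	var events []string
	if err := run(&events); err != nil {
		fmt.Println("run failed:", err)
	}
	fmt.Println(events)
}
```

Shutting down in reverse order is a design choice of this sketch: components started last (here the metrics server) stop first, so nothing is left exporting through a provider that has already been stopped.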

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| internal/telemetry/otel.go | New unified OTel setup with explicit shutdown lifecycle and optional Prometheus HTTP server. |
| internal/telemetry/metrics.go | New porch_package_size_bytes histogram + gauge instruments and recorder. |
| internal/telemetry/metrics_test.go | Tests for RecordPackageSizeUpdate including nil-instrument guards. |
| internal/otel/otel.go | Old package removed (replaced by internal/telemetry). |
| internal/otel/otel_test.go | Tests moved to telemetry package; new Prometheus server tests added. |
| go.mod | Promote otel/exporters/prometheus from indirect to direct dependency. |
| cmd/porch/main.go | Use new telemetry package with deferred shutdown. |
| controllers/main.go | Move OTel init out of newManager; add deferred shutdown. |
| func/server/server.go | Switch to new telemetry package; add deferred shutdown. |
| func/wrapper-server/main.go | Switch to new telemetry package; add deferred shutdown. |
| pkg/cache/dbcache/dbpackage.go | Record package size on delete. |
| pkg/cache/dbcache/dbrepository.go | Record package size on close-draft and delete. |
| pkg/cache/dbcache/dbreposync.go | Record package size on external PR cache/delete. |
| deployments/porch/2-function-runner.yaml | Add metrics port to function-runner service. |
| deployments/porch/3-porch-server.yaml | Add 9464 metrics port and service entry. |
| deployments/porch/9-controllers.yaml | Add Service exposing controller metrics port. |
| deployments/porch/22-function-templates.yaml | Add metrics env/port for wrapper-server templates. |
| deployments/metrics/Kptfile | New kpt package for monitoring deployment. |
| deployments/metrics/prometheus-deployment.yaml | New Prometheus deployment, RBAC and Service. |
| deployments/metrics/grafana-deployment.yaml | New Grafana deployment, datasources and Service. |
| deployments/metrics-resources/prometheus-config.yaml | Prometheus scrape configuration for porch components. |
| scripts/deploy-monitoring.sh | New script to deploy/cleanup monitoring stack via kpt. |
| docs/.../opentelemetry.md | Documentation for new metrics. |
| test/e2e/api/metrics_test.go | Fix regex globs, switch to suite assertions, add package-size metric test. |
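
The "nil-instrument guards" mentioned for internal/telemetry/metrics_test.go suggest a recorder that becomes a no-op when metrics were never set up. A minimal Go sketch of that idea follows; int64Recorder, packageMetrics, recordPackageSize and sliceRecorder are hypothetical names, and the actual signature of RecordPackageSizeUpdate is not shown in this conversation.

```go
package main

import "fmt"

// int64Recorder is a hypothetical minimal instrument interface.
type int64Recorder interface {
	Record(value int64)
}

// packageMetrics holds the instruments; sizeHistogram may be nil when
// metrics are disabled or setup did not run.
type packageMetrics struct {
	sizeHistogram int64Recorder
}

// recordPackageSize records sizeBytes if an instrument is available and
// reports whether anything was recorded. It is safe on a nil receiver.
func (m *packageMetrics) recordPackageSize(sizeBytes int64) bool {
	if m == nil || m.sizeHistogram == nil {
		return false
	}
	m.sizeHistogram.Record(sizeBytes)
	return true
}

// sliceRecorder is a test double that keeps every recorded value.
type sliceRecorder struct {
	values []int64
}

func (s *sliceRecorder) Record(v int64) { s.values = append(s.values, v) }

func main() {
	var disabled *packageMetrics
	fmt.Println("recorded with nil metrics:", disabled.recordPackageSize(10))

	rec := &sliceRecorder{}
	enabled := &packageMetrics{sizeHistogram: rec}
	fmt.Println("recorded with instrument:", enabled.recordPackageSize(42), rec.values)
}
```

With a guard like this, call sites in the cache code would not need their own checks for whether telemetry is enabled.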

Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread internal/telemetry/metrics.go
Comment thread internal/telemetry/metrics.go
Comment thread internal/telemetry/metrics.go Outdated
Comment thread internal/telemetry/metrics.go Outdated
Comment thread test/e2e/api/metrics_test.go Outdated
Comment thread deployments/metrics-resources/prometheus-config.yaml
Comment thread pkg/cache/dbcache/dbpackage.go Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread internal/telemetry/metrics_test.go
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics branch from 47c4853 to c50c1ee Compare June 5, 2026 09:38
@dosubot dosubot Bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jun 5, 2026
Copilot AI review requested due to automatic review settings June 5, 2026 11:37
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics branch from c50c1ee to 908cfeb Compare June 5, 2026 11:37

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 10 comments.

Comment thread test/e2e/suiteutils/suite_utils.go Outdated
Comment thread test/e2e/suiteutils/suite_utils.go Outdated
Comment thread internal/telemetry/otel.go Outdated
Comment thread deployments/porch/3-porch-server.yaml
Comment thread deployments/porch/9-controllers.yaml
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread deployments/metrics/prometheus-deployment.yaml Outdated
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics branch from 908cfeb to 27fdc45 Compare June 5, 2026 12:50
Copilot AI review requested due to automatic review settings June 8, 2026 11:14
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics branch from 27fdc45 to 6ea6b41 Compare June 8, 2026 11:14

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 9 comments.

Comment thread test/e2e/api/metrics_test.go Outdated
Comment thread internal/telemetry/metrics.go
Comment thread deployments/porch/3-porch-server.yaml
Comment thread deployments/porch/3-porch-server.yaml
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread internal/telemetry/metrics_test.go Outdated
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics branch from 6ea6b41 to b963d7e Compare June 8, 2026 15:10
Copilot AI review requested due to automatic review settings June 8, 2026 15:52

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 7 comments.

Comment thread internal/telemetry/otel.go
Comment thread deployments/porch/22-function-templates.yaml
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread pkg/cache/dbcache/dbpackagerevisionresourcessql.go
Comment thread test/e2e/suiteutils/suite_utils.go Outdated
Comment thread deployments/metrics/prometheus-deployment.yaml
Copilot AI review requested due to automatic review settings June 8, 2026 17:03

Copilot AI left a comment

Pull request overview

Copilot reviewed 26 out of 26 changed files in this pull request and generated 9 comments.

Comment thread test/e2e/suiteutils/suite_utils.go Outdated
Comment thread test/e2e/suiteutils/suite_utils.go Outdated
Comment thread test/e2e/suiteutils/suite_utils.go Outdated
Comment thread internal/telemetry/otel.go Outdated
Comment thread internal/telemetry/otel.go Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread test/e2e/api/metrics_test.go Outdated
@JamesMcDermott JamesMcDermott self-assigned this Jun 9, 2026
Copilot AI review requested due to automatic review settings June 9, 2026 15:58

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 9 comments.

Comment thread internal/telemetry/otel.go
Comment thread test/e2e/suiteutils/suite_utils.go
Comment thread test/e2e/suiteutils/suite_utils.go
Comment thread test/e2e/suiteutils/suite_utils.go
Comment thread deployments/metrics/grafana-deployment.yaml Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread scripts/deploy-monitoring.sh
Copilot AI review requested due to automatic review settings June 9, 2026 20:02

Copilot AI left a comment

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 7 comments.

Comment thread internal/telemetry/otel.go Outdated
Comment thread internal/telemetry/otel.go
Comment thread internal/telemetry/metrics_test.go Outdated
Comment thread deployments/metrics/grafana-deployment.yaml Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>
Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>
@JamesMcDermott JamesMcDermott force-pushed the feature-resources-size-metrics branch from 8466559 to 5439633 Compare June 11, 2026 20:08
Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>
Copilot AI review requested due to automatic review settings June 11, 2026 20:30

Copilot AI left a comment

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 7 comments.

Comment thread internal/telemetry/otel.go
Comment thread internal/telemetry/otel.go
Comment thread test/e2e/api/metrics_test.go
Comment thread deployments/metrics-resources/grafana-package-sizes-dashboard.json Outdated
Comment thread scripts/deploy-monitoring.sh Outdated
Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>
Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>
Copilot AI review requested due to automatic review settings June 11, 2026 20:56

Copilot AI left a comment

Pull request overview

Copilot reviewed 29 out of 29 changed files in this pull request and generated 8 comments.

Comment thread test/e2e/api/metrics_test.go Outdated
Comment thread internal/telemetry/otel.go
Comment thread deployments/porch/3-porch-server.yaml
Comment thread deployments/porch/9-controllers.yaml
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread scripts/deploy-monitoring.sh
Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>
liamfallon
liamfallon previously approved these changes Jun 15, 2026
Comment thread .github/retrigger.txt Outdated

Do we need this file? Members can retrigger CI in the GitHub UI.

Contributor Author

removed

@dosubot dosubot Bot added the lgtm label Jun 15, 2026
@liamfallon liamfallon self-requested a review June 15, 2026 05:55
Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
Copilot AI review requested due to automatic review settings June 15, 2026 08:04

Copilot AI left a comment

Pull request overview

Copilot reviewed 28 out of 28 changed files in this pull request and generated 12 comments.

Comment thread internal/telemetry/otel.go
Comment thread internal/telemetry/otel.go
Comment thread internal/telemetry/otel.go Outdated
Comment thread internal/telemetry/otel.go
Comment thread test/e2e/api/metrics_test.go Outdated
Comment thread scripts/deploy-monitoring.sh
Comment thread scripts/deploy-monitoring.sh Outdated
Comment thread internal/telemetry/metrics.go Outdated
Comment thread internal/telemetry/metrics.go Outdated
Comment thread internal/telemetry/metrics.go Outdated
Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
@efiacor efiacor merged commit c3c0f90 into kptdev:main Jun 15, 2026
27 of 29 checks passed
@efiacor efiacor deleted the feature-resources-size-metrics branch June 15, 2026 12:35
JamesMcDermott added a commit to Nordix/porch that referenced this pull request Jun 16, 2026
* Feat: Add OpenTelemetry metrics for package resources size

- plus enough Prometheus monitoring stack to make it manually testable
  - picked from changes in WIP kptdev#561
- new histogram- and gauge-type metrics
  - available in e.g. Prometheus as:
    - porch_package_size_bytes_bucket
    - porch_package_size_bytes_count
    - porch_package_size_bytes_sum
    - porch_package_size_bytes_total
  - recorded in Porch flows that update package revision resources:
    - create package revision
    - delete package revision
    - discover/sync package revisions from a registered repository
    - delete package revisions on unregistering a repository
    - direct update of PackageRevisionResources in rpkg push

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Update docs to mention new package resources size metrics

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 2

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 3

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* comment nitpick to retrigger CI

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 4

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Fix failing unit test

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 5

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Nitpick to retrigger CI

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 6

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Introduce retrigger.txt for easier CI retriggering

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address copilot review comments part 7

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 8

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 9

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* retrigger

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 10

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* retrigger

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 11

Signed-off-by: ezmcdja <james.j.mcdermott@ericsson.com>

* Address review comment

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

* Address Copilot review comments part 12

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

---------

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
efiacor pushed a commit that referenced this pull request Jun 17, 2026
) (#1052)

* Add OpenTelemetry metrics for package resources size (#1018)

* Cherry-pick #1050 to let CR cache tests pass

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>

---------

Signed-off-by: James McDermott <james.j.mcdermott@ericsson.com>
Labels

lgtm
size:XL This PR changes 500-999 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

Expose package resource file size as a Prometheus metric

4 participants