Add release and identifier labels to phase_duration metric.#1009
Add release and identifier labels to phase_duration metric.#1009rene-oromtz merged 1 commit intomainfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1009 +/- ##
==========================================
+ Coverage 73.98% 74.12% +0.14%
==========================================
Files 108 109 +1
Lines 10343 10398 +55
Branches 887 895 +8
==========================================
+ Hits 7652 7708 +56
Misses 2503 2503
+ Partials 188 187 -1
*This pull request uses carry forward flags. Click here to find out more.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR adds two new labels (identifier, release) to the agent’s phase_duration_seconds Prometheus histogram to enable better segmentation of phase-duration metrics by device identifier and provisioned OS release.
Changes:
- Extend
phase_duration_secondsmetric labels to includeidentifier(from agent config) andrelease(queried from DUT/etc/os-releaseafter provisioning). - Add DUT OS release querying/parsing logic and unit tests for normalization behavior.
- Update agent metrics unit test expectations to include the new labels.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| agent/src/testflinger_agent/metrics.py | Adds identifier/release labels to the phase duration histogram and updates the reporting API. |
| agent/src/testflinger_agent/agent.py | Captures config identifier, queries DUT release after successful provision, and reports phase durations with the new labels. |
| agent/src/testflinger_agent/os_release.py | Introduces SSH-based /etc/os-release query + parsing/normalization helpers. |
| agent/tests/test_os_release.py | Adds unit tests for parsing, label derivation, and SSH query failure handling. |
| agent/tests/test_agent.py | Updates metric label assertions to include identifier and release. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
rene-oromtz
left a comment
There was a problem hiding this comment.
Thank you for taking care of this! Overall this looks good to me!
I took the liberty to test it on staging and found a bug regarding the device_ip retrieval. Once that is addressed it should be good as I can see the identifier label was properly added:
phase_duration_seconds_sum{agent_id="audino", identifier="202409-35503", ...", test_phase="provision"}
rene-oromtz
left a comment
There was a problem hiding this comment.
Sweet! Working as intended:
Prometheus:
phase_duration_seconds_sum{agent_id="audino", identifier="202409-35503", ..., release="24.04.4", test_phase="provision"}
Agent logs:
[26-04-10 14:24:54] INFO: (job.py:129)| Running setup_command: tf-setup
[26-04-10 14:24:59] INFO: (job.py:129)| Running provision_command: tf-provision
[26-04-10 14:38:42] INFO: (os_release.py:75)| DUT release detected: 24.04.4
There is just a pending discussion with @ajzobro so let's see what he thinks, besides that LGTM
|
I won't hold this up. |
|
Thanks to both of you for the review. @ajzobro thanks for highlighting the issue with licensing - it is important to keep it consistent in the long term. It is not a problem to address your concerns and adapt copyright headers in newly added files. However, this would make them inconsistent with the other source files in the I would suggest to address the copyright topic for the entire |
f9f437a to
df423fa
Compare
Description
To enable more meaningful data separation for the provisioning duration metric, this PR adds 2 labels to the
phase_durationmetric:identifierandrelease.identifieris derived from theidentifierfield intestflinger-agent.conf.releaseindicates the OS with which the DUT was provisioned.releasebecomes available after the provisioning phase and is applied to all subsequent phases, including provisioning. Implementation details:/etc/os-releaseinformation.Note: Additional rationale for the selected approach is provided in the linked ticket.
Resolved issues
CERTTF-832
Documentation
Unit tests + linked ticket.
Web service API changes
N/A
Tests
Unit tests.