Testing of e2e code coverage for cert-manager operator #79300
Testing of e2e code coverage for cert-manager operator #79300siddhibhor-56 wants to merge 1 commit into
Conversation
WalkthroughThis PR introduces E2E coverage collection for the cert-manager operator and restructures telco DAST testing infrastructure. The cert-manager changes add a coverage image build and post-test coverage upload. The telco changes consolidate DAST jobs across OpenShift versions, replacing a setup-based chainsaw workflow with a scan-based RapidAST workflow and updating approvers accordingly. ChangesCert-Manager Operator E2E Coverage
Telco DAST Workflow and Job Restructuring
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 12✅ Passed checks (12 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
ci-operator/config/openshift/cert-manager-operator/openshift-cert-manager-operator-master.yaml (2)
136-141: 💤 Low valueConsider adding a comment documenting the SIGTERM/restart assumption.
The logic relies on the container restarting in place after SIGTERM (preserving the emptyDir volume with coverage data). This works because deployments have
restartPolicy: Always, but a brief inline comment would help future maintainers understand why this approach is correct.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/config/openshift/cert-manager-operator/openshift-cert-manager-operator-master.yaml` around lines 136 - 141, Add a brief inline comment above the SIGTERM/restart block explaining the assumption that sending SIGTERM to PID 1 inside the cert-manager-operator container will cause the pod to restart in place (preserving the emptyDir volume with coverage data) because the deployment uses restartPolicy: Always; reference the kill -TERM 1 / oc exec command and the expectation that the container will be recreated rather than the pod being replaced, so future maintainers understand why we sleep and wait for the same pod to become Ready.
226-241: 💤 Low valueHardcoded array indices in JSON patches are fragile.
The patches assume the operator container is at
/containers/0and the deployment is at/deployments/0. If the CSV or deployment structure changes (e.g., init containers added, multiple deployments), these patches could target the wrong resource.Consider using
jqto find the correct index by container/deployment name, or add a comment acknowledging this assumption for the current operator structure.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/config/openshift/cert-manager-operator/openshift-cert-manager-operator-master.yaml` around lines 226 - 241, The JSON patches hardcode indices (/spec/install/spec/deployments/0 and /containers/0) which is brittle; update the script to compute the correct deployment and container indices dynamically (e.g., use jq to locate the deployment index by matching the deployment name and the container index by matching container.name == "cert-manager-operator" from the CSV or deployment JSON) and then build the patch paths using those computed indices (replace uses of deployments/0 and containers/0 in the oc patch/oc set commands with the variables holding the found indices like DEP_IDX and CT_IDX); alternatively, if you intentionally rely on a single known deployment/container, add a clear comment near the oc patch/oc set invocations (referencing the CSV/DEPLOYMENT, COVERAGE_IMAGE, GOCOVERDIR_PATH variables) documenting the assumption so future reviewers understand the constraint.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In
`@ci-operator/config/openshift/cert-manager-operator/openshift-cert-manager-operator-master.yaml`:
- Around line 136-141: Add a brief inline comment above the SIGTERM/restart
block explaining the assumption that sending SIGTERM to PID 1 inside the
cert-manager-operator container will cause the pod to restart in place
(preserving the emptyDir volume with coverage data) because the deployment uses
restartPolicy: Always; reference the kill -TERM 1 / oc exec command and the
expectation that the container will be recreated rather than the pod being
replaced, so future maintainers understand why we sleep and wait for the same
pod to become Ready.
- Around line 226-241: The JSON patches hardcode indices
(/spec/install/spec/deployments/0 and /containers/0) which is brittle; update
the script to compute the correct deployment and container indices dynamically
(e.g., use jq to locate the deployment index by matching the deployment name and
the container index by matching container.name == "cert-manager-operator" from
the CSV or deployment JSON) and then build the patch paths using those computed
indices (replace uses of deployments/0 and containers/0 in the oc patch/oc set
commands with the variables holding the found indices like DEP_IDX and CT_IDX);
alternatively, if you intentionally rely on a single known deployment/container,
add a clear comment near the oc patch/oc set invocations (referencing the
CSV/DEPLOYMENT, COVERAGE_IMAGE, GOCOVERDIR_PATH variables) documenting the
assumption so future reviewers understand the constraint.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 0c5e551e-9c3b-4009-8ce0-61ef3dee6f38
⛔ Files ignored due to path filters (1)
ci-operator/jobs/openshift/external-secrets-operator/openshift-external-secrets-operator-main-postsubmits.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (2)
ci-operator/config/openshift/cert-manager-operator/openshift-cert-manager-operator-master.yamlci-operator/config/openshift/external-secrets-operator/openshift-external-secrets-operator-main.yaml
dc2fba5 to
53ff260
Compare
|
@siddhibhor-56, Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
ci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-commands.sh (1)
72-72: 💤 Low valueConsider pinning the RapidAST image to a specific version.
Using
latesttag can lead to non-reproducible CI runs and unexpected breakages when the upstream image changes. Pin to a specific version for stability.♻️ Suggested change
- image: quay.io/redhatproductsecurity/rapidast:latest + image: quay.io/redhatproductsecurity/rapidast:2.13.0🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@ci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-commands.sh` at line 72, The CI job currently uses an unpinned container image "quay.io/redhatproductsecurity/rapidast:latest"; update the image reference in the telcov10n-functional-dast-tests-commands.sh job to a specific version tag or an immutable digest (e.g., replace the ":latest" suffix with a concrete tag or `@sha256`:... digest) so builds are reproducible and stable; modify the "image: quay.io/redhatproductsecurity/rapidast:latest" line accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@ci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-commands.sh`:
- Around line 97-101: The polling loop that waits for the sentinel file
/tmp/.done (the until oc exec -n dast "${POD_NAME}" -c rapidast -- test -f
/tmp/.done loop) must be given a timeout to avoid hanging; add a timeout
variable (e.g., SCAN_TIMEOUT_SECONDS=600) and record a start timestamp, then
inside the loop check elapsed time on each iteration and if it exceeds the
timeout print an error mentioning POD_NAME and container rapidast and exit
non‑zero; keep the existing sleep 15 between polls and ensure the failure path
breaks the loop and fails the script so CI does not hang indefinitely.
---
Nitpick comments:
In
`@ci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-commands.sh`:
- Line 72: The CI job currently uses an unpinned container image
"quay.io/redhatproductsecurity/rapidast:latest"; update the image reference in
the telcov10n-functional-dast-tests-commands.sh job to a specific version tag or
an immutable digest (e.g., replace the ":latest" suffix with a concrete tag or
`@sha256`:... digest) so builds are reproducible and stable; modify the "image:
quay.io/redhatproductsecurity/rapidast:latest" line accordingly.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 5a1902e3-b62f-45ac-8fa5-dfa79300076b
⛔ Files ignored due to path filters (1)
ci-operator/jobs/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main-periodics.yamlis excluded by!ci-operator/jobs/**
📒 Files selected for processing (18)
ci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__telco-operators-dast-4.16.yamlci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__telco-operators-dast-4.17.yamlci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__telco-operators-dast-4.18.yamlci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__telco-operators-dast-4.19.yamlci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__telco-operators-dast-4.20.yamlci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__telco-operators-dast-4.21.yamlci-operator/config/openshift/cert-manager-operator/openshift-cert-manager-operator-master.yamlci-operator/step-registry/telcov10n/functional/dast/OWNERSci-operator/step-registry/telcov10n/functional/dast/scan/OWNERSci-operator/step-registry/telcov10n/functional/dast/scan/telcov10n-functional-dast-scan-workflow.metadata.jsonci-operator/step-registry/telcov10n/functional/dast/scan/telcov10n-functional-dast-scan-workflow.yamlci-operator/step-registry/telcov10n/functional/dast/setup/telcov10n-functional-dast-setup-workflow.metadata.jsonci-operator/step-registry/telcov10n/functional/dast/talm-prereq-crds/OWNERSci-operator/step-registry/telcov10n/functional/dast/talm-prereq-crds/telcov10n-functional-dast-talm-prereq-crds-ref.metadata.jsonci-operator/step-registry/telcov10n/functional/dast/tests/OWNERSci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-commands.shci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-ref.metadata.jsonci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-ref.yaml
💤 Files with no reviewable changes (2)
- ci-operator/step-registry/telcov10n/functional/dast/setup/telcov10n-functional-dast-setup-workflow.metadata.json
- ci-operator/config/openshift-kni/eco-ci-cd/openshift-kni-eco-ci-cd-main__telco-operators-dast-4.17.yaml
✅ Files skipped from review due to trivial changes (5)
- ci-operator/step-registry/telcov10n/functional/dast/tests/OWNERS
- ci-operator/step-registry/telcov10n/functional/dast/scan/OWNERS
- ci-operator/step-registry/telcov10n/functional/dast/scan/telcov10n-functional-dast-scan-workflow.metadata.json
- ci-operator/step-registry/telcov10n/functional/dast/OWNERS
- ci-operator/step-registry/telcov10n/functional/dast/talm-prereq-crds/telcov10n-functional-dast-talm-prereq-crds-ref.metadata.json
🚧 Files skipped from review as they are similar to previous changes (1)
- ci-operator/config/openshift/cert-manager-operator/openshift-cert-manager-operator-master.yaml
| # Wait for sentinel file | ||
| echo "Waiting for scan to complete..." | ||
| until oc exec -n dast "${POD_NAME}" -c rapidast -- test -f /tmp/.done 2>/dev/null; do | ||
| sleep 15 | ||
| done |
There was a problem hiding this comment.
Add a timeout to the polling loop to prevent indefinite hangs.
The until loop polling for /tmp/.done has no timeout. If the scan crashes or hangs without creating the sentinel file, this CI job will run indefinitely. The pod startup already enforces a 600s timeout, but the scan completion polling does not.
🛡️ Proposed fix to add timeout
# Wait for sentinel file
echo "Waiting for scan to complete..."
+ TIMEOUT=3600 # 1 hour max for scan
+ ELAPSED=0
- until oc exec -n dast "${POD_NAME}" -c rapidast -- test -f /tmp/.done 2>/dev/null; do
+ until oc exec -n dast "${POD_NAME}" -c rapidast -- test -f /tmp/.done 2>/dev/null; do
+ if [[ ${ELAPSED} -ge ${TIMEOUT} ]]; then
+ echo "ERROR: Scan timed out after ${TIMEOUT}s for ${OPERATOR_NAME}"
+ OVERALL_RC=1
+ kill "${LOGS_PID}" 2>/dev/null || true
+ oc delete pod "${POD_NAME}" -n dast --ignore-not-found
+ continue 2
+ fi
sleep 15
+ ELAPSED=$((ELAPSED + 15))
done🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@ci-operator/step-registry/telcov10n/functional/dast/tests/telcov10n-functional-dast-tests-commands.sh`
around lines 97 - 101, The polling loop that waits for the sentinel file
/tmp/.done (the until oc exec -n dast "${POD_NAME}" -c rapidast -- test -f
/tmp/.done loop) must be given a timeout to avoid hanging; add a timeout
variable (e.g., SCAN_TIMEOUT_SECONDS=600) and record a start timestamp, then
inside the loop check elapsed time on each iteration and if it exceeds the
timeout print an error mentioning POD_NAME and container rapidast and exit
non‑zero; keep the existing sleep 15 between polls and ensure the failure path
breaks the loop and fails the script so CI does not hang indefinitely.
53ff260 to
4611763
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: siddhibhor-56 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
4611763 to
99044fe
Compare
|
[REHEARSALNOTIFIER]
Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
1 similar comment
|
[REHEARSALNOTIFIER]
Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
@siddhibhor-56: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
/pj-rehearse max |
|
@siddhibhor-56: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
Overview
This PR extends e2e test coverage collection for the cert-manager operator and refactors the telco operators' DAST (Dynamic Application Security Testing) workflow infrastructure in OpenShift CI.
Cert-Manager Operator: E2E Coverage Instrumentation
The cert-manager operator CI configuration gains end-to-end code coverage collection and automated upload to Codecov:
Coverage Image Build: Introduces a new multi-stage
cert-manager-operator-coverageimage that compiles the operator binary with coverage instrumentation in a UBI9 minimal runtime, with coverage data collected to/tmp/e2e-cover.Setup Phase: A new
setup-coveragestep patches the deployed operator pod to use the coverage-instrumented image, sets theGOCOVERDIRenvironment variable, and mounts anemptyDirvolume for coverage data collection.Collection and Upload: A
poststep namedcollect-coverageperforms the following:go tool covdataTelco Operators: DAST Workflow Modernization
Multiple OCP release version configurations (4.16–4.21) for telco operators DAST testing are refactored:
Test Job Consolidation: The
telco-dast-citest job is replaced across all configurations with a newtelco-dast-operators-cijob that:00 13 * * 1to Friday mornings)telcov10n-functional-dast-setuptotelcov10n-functional-dast-scanADDITIONAL_WORKER_ARCHITECTURE,ADDITIONAL_WORKERS*,DISCONNECTED) to simplify test configurationRemoved Resources: The
distributed-tracing-tests-runner-telcotest runner is removed from all affected configurations.Workflow Update: The new
telcov10n-functional-dast-scanworkflow is introduced with updated metadata and includes thetelcov10n-functional-dast-testsstep in its pre-execution phase.Test Step Registry Updates
DAST Test Scripts: The
telcov10n-functional-dast-tests-commands.shscript is rewritten to orchestrate RapidAST DAST security scans instead of chainsaw compliance tests, automating vulnerability scanning across operators with result collection and artifact archiving.Approver Changes: The OWNERS lists for dast-related test registry steps are updated to reflect team membership changes (removing
natifridmanandobochan-rh, addingoblau).Configuration Cleanup: The old
telcov10n-functional-dast-setupworkflow metadata is removed as it is superseded by the new scan workflow.Impact
These changes enable automated code coverage tracking for cert-manager operator releases and modernize the telco operators' security testing infrastructure to use RapidAST DAST scanning on a more consistent schedule across OCP versions.