GCP-441: tolerate 1 restart for GCP CCM token-minter race condition #7865

Merged
openshift-merge-bot[bot] merged 1 commit into openshift:main from cristianoveiga:GCP-441-ccm-crash-toleration
Mar 6, 2026

@cristianoveiga
Contributor

@cristianoveiga cristianoveiga commented Mar 5, 2026

What this PR does / why we need it:

The GCP CCM intermittently crashes on startup when the token-minter sidecar hasn't written the token file yet. Allow 1 restart in EnsureNoCrashingPods to unblock CI. Proper fix tracked in GCP-447.

Which issue(s) this PR fixes:

Fixes
GCP-441

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • Tests
    • Improved test robustness by increasing crash tolerance for GCP cloud controller manager component, allowing for graceful handling of a known race condition during test execution.

…(GCP-441)

The GCP CCM intermittently crashes on startup when the token-minter
sidecar hasn't written the token file yet. Allow 1 restart in
EnsureNoCrashingPods to unblock CI. Proper fix tracked in GCP-447.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 5, 2026
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 5, 2026
@openshift-ci-robot

openshift-ci-robot commented Mar 5, 2026

@cristianoveiga: This pull request references GCP-441 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

Details

In response to this:

What this PR does / why we need it:

The GCP CCM intermittently crashes on startup when the token-minter sidecar hasn't written the token file yet. Allow 1 restart in EnsureNoCrashingPods to unblock CI. Proper fix tracked in GCP-447.

Which issue(s) this PR fixes:

Fixes
GCP-441

Special notes for your reviewer:

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci bot commented Mar 5, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci bot added do-not-merge/needs-area area/testing Indicates the PR includes changes for e2e testing and removed do-not-merge/needs-area labels Mar 5, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 20d2738f-b080-4741-9370-4be62397c815

📥 Commits

Reviewing files that changed from the base of the PR and between 563412d and 5df7901.

📒 Files selected for processing (1)
  • test/e2e/util/util.go

Walkthrough

Adds a new entry to the podCrashTolerations map in test utilities, allowing the "gcp-cloud-controller-manager" component to tolerate 1 restart due to a token-minter race condition. No functional logic changes; configuration update only.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Test Configuration (test/e2e/util/util.go) | Added "gcp-cloud-controller-manager" with toleration count of 1 to podCrashTolerations map to handle token-minter race condition restarts. |
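The change summarized above can be sketched roughly as follows. Only the map name `podCrashTolerations`, the key `"gcp-cloud-controller-manager"`, and the value 1 come from this PR. The value type, the helper `exceedsCrashTolerance`, and the way the map is consulted are assumptions made for illustration, not the actual contents of test/e2e/util/util.go.

```go
package main

import "fmt"

// podCrashTolerations maps a component name to the number of container
// restarts tolerated before the pod counts as crashing.
// The int32 value type is an assumption.
var podCrashTolerations = map[string]int32{
	// Tolerate 1 restart caused by the token-minter race (GCP-441).
	"gcp-cloud-controller-manager": 1,
}

// exceedsCrashTolerance is a hypothetical helper showing how a check such as
// EnsureNoCrashingPods might consult the map. Components without an entry
// get the zero value, so any restart is reported for them.
func exceedsCrashTolerance(component string, restartCount int32) bool {
	return restartCount > podCrashTolerations[component]
}

func main() {
	fmt.Println(exceedsCrashTolerance("gcp-cloud-controller-manager", 1)) // false
	fmt.Println(exceedsCrashTolerance("gcp-cloud-controller-manager", 2)) // true
	fmt.Println(exceedsCrashTolerance("some-other-component", 1))         // true
}
```

Keying the tolerance by component keeps the relaxation narrow: a second restart of the GCP CCM, or any restart of an unlisted component, would still fail the check in this sketch.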

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding toleration for 1 restart of the GCP Cloud Controller Manager due to a known token-minter race condition. It is concise, directly related to the changeset, and references the Jira ticket. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Stable And Deterministic Test Names | ✅ Passed | The pull request modifies a utility configuration map in test/e2e/util/util.go with a static, descriptive string key and no dynamic information, timestamps, or generated identifiers. |
| Test Structure And Quality | ✅ Passed | The PR only modifies a configuration variable in a utility module, not test code itself, making the custom test quality check not applicable. |

@cristianoveiga cristianoveiga marked this pull request as ready for review March 5, 2026 17:20
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 5, 2026
@openshift-ci openshift-ci bot requested review from csrwng and jparrill March 5, 2026 17:21
@billmvt

billmvt commented Mar 5, 2026

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 5, 2026
@openshift-ci-robot

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-21
/test e2e-aws-4-21
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws

@cwbotbot

cwbotbot commented Mar 5, 2026

Test Results

e2e-aws

e2e-aks

@cristianoveiga
Contributor Author

/verified later @cristianoveiga

@openshift-ci-robot openshift-ci-robot added verified-later verified Signifies that the PR passed pre-merge verification criteria labels Mar 5, 2026
@openshift-ci-robot

@cristianoveiga: This PR has been marked to be verified later by @cristianoveiga.


@csrwng
Contributor

csrwng commented Mar 5, 2026

/approve

@openshift-ci
Contributor

openshift-ci bot commented Mar 5, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cristianoveiga, csrwng

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 5, 2026
@cristianoveiga
Contributor Author

/retest-required

1 similar comment
@cristianoveiga
Contributor Author

/retest-required

@openshift-ci
Contributor

openshift-ci bot commented Mar 6, 2026

@cristianoveiga: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit cca8038 into openshift:main Mar 6, 2026
23 checks passed

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/testing: Indicates the PR includes changes for e2e testing
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria
  • verified-later

5 participants