
CNTRLPLANE-1137: Add support for external CNI provider Cilium#7423

Merged
openshift-merge-bot[bot] merged 4 commits into openshift:main from wewang58:cntr_1137 on Jan 30, 2026

Conversation

@wewang58
Contributor

What this PR does / why we need it:

This PR implements Cilium installation and adds tests for basic network policy scenarios in an AKS environment. The previous PR was closed in favor of this one for clarity.

Which issue(s) this PR fixes:

Fixes https://issues.redhat.com/browse/CNTRLPLANE-1137

Special notes for your reviewer:

  • Added ExternalCNIProvider flag and option to the e2e framework.
  • Implemented logic to install Cilium and skip default network validation when enabled.
  • Added utility functions ApplyYAMLFromURL and wait helpers for Deployments/DaemonSets.
  • Included Cilium-specific installation logic and network policy tests.
  • Passing test run log included.

Checklist:

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 31, 2025
@openshift-ci-robot

openshift-ci-robot commented Dec 31, 2025

@wewang58: This pull request references CNTRLPLANE-1137 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

In response to this: the PR description above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci Bot commented Dec 31, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 31, 2025
@coderabbitai
Contributor

coderabbitai Bot commented Dec 31, 2025

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Walkthrough

Adds Cilium v1.14.5 manifests and E2E support: CRD and deployment manifests, connectivity workloads, test utilities for installing/validating Cilium, framework wiring for an external CNI provider and asset reader, and a new Cilium connectivity test plus small test harness updates.

Changes

  • Cilium manifests — test/e2e/assets/cilium/v1.14.5/*.yaml (13 cluster-network manifests plus connectivity-check.yaml; full list under "Files selected for processing" below): Adds Cilium v1.14.5 Kubernetes manifests: a namespaced CiliumConfig CRD (v1alpha1), namespace, ServiceAccount, Deployment, Service, multiple RBAC Role/ClusterRole and bindings, and a large connectivity-check multi-resource manifest with workloads, services, and CiliumNetworkPolicy resources.
  • E2E utilities (Cilium) — test/e2e/util/cilium.go: New test utility implementing InstallCilium, EnsureCiliumConnectivityTestResources, and CleanupCiliumConnectivityTestResources; manages manifest apply, namespace/SCC creation, CiliumConfig lifecycle, daemonset/pod readiness polling, and connectivity-test orchestration.
  • General test utilities — test/e2e/util/util.go: Adds exported helpers WaitForDeploymentAvailable and ApplyYAMLBytes (multi-document YAML decode, REST-mapping namespace inference, Server-Side Apply patch); adds imports for YAML decoding and io.
  • Framework & options — test/e2e/util/hypershift_framework.go, test/e2e/util/options.go, test/e2e/e2e_test.go: Adds an ExternalCNIProvider field to Options and PlatformAgnosticOptions; adds an assetReader assets.AssetReader field and WithAssetReader method to hypershiftTest; installs Cilium before tests when an external CNI is set; adds the CLI flag --e2e.external-cni-provider.
  • Tests & harness wiring — test/e2e/cilium_network_policy_test.go, test/e2e/autoscaling_test.go, test/e2e/azure_scheduler_test.go, test/e2e/control_plane_upgrade_test.go, test/e2e/create_cluster_test.go: Adds TestCiliumConnectivity (new e2e test file); updates multiple tests to pass an asset reader into the test harness via .WithAssetReader(content.ReadFile) before .Execute.
  • Small API change in helper — test/e2e/util/reqserving/vpa.go: Changes the EnsureVPAOperatorInstalled signature from a generic helper interface to func EnsureVPAOperatorInstalled(ctx context.Context, t *testing.T, c crclient.Client) error and replaces internal wait logic with e2eutil.WaitForDeploymentAvailable.
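The ApplyYAMLBytes helper described above first splits a multi-document YAML stream before decoding and applying each object. A minimal, stdlib-only sketch of just that splitting step (illustrative only — the real helper also performs REST mapping and Server-Side Apply through controller-runtime, and a production splitter should use a YAML decoder, since `---` can legally appear inside block scalars):

```go
package main

import (
	"fmt"
	"strings"
)

// splitYAMLDocs splits a multi-document YAML stream on "---" separator
// lines and drops empty documents — the first step an apply-helper like
// ApplyYAMLBytes must perform before decoding each object.
func splitYAMLDocs(data string) []string {
	var docs []string
	for _, doc := range strings.Split(data, "\n---") {
		if trimmed := strings.TrimSpace(doc); trimmed != "" {
			docs = append(docs, trimmed)
		}
	}
	return docs
}

func main() {
	manifest := `apiVersion: v1
kind: Namespace
metadata:
  name: cilium
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cilium-olm
  namespace: cilium
`
	for i, doc := range splitYAMLDocs(manifest) {
		fmt.Printf("document %d: %d bytes\n", i, len(doc))
	}
}
```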

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes



@openshift-ci openshift-ci Bot added do-not-merge/needs-area area/testing Indicates the PR includes changes for e2e testing and removed do-not-merge/needs-area labels Dec 31, 2025
@wewang58
Contributor Author

@heliubj18 I added all templates to assets; could you help review it? Thanks.

@wewang58 wewang58 marked this pull request as ready for review January 5, 2026 03:30
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 5, 2026
@openshift-ci openshift-ci Bot requested review from csrwng and jparrill January 5, 2026 03:30
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (6)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml (1)

61-64: Consider adding resource limits for completeness.

The deployment specifies resource requests but no limits. While acceptable for e2e test infrastructure, adding limits would prevent unbounded resource consumption if the operator misbehaves.

🔎 Proposed enhancement
           resources:
             requests:
               cpu: 100m
               memory: 512Mi
+            limits:
+              cpu: 500m
+              memory: 1Gi
test/e2e/util/util.go (1)

4198-4211: Consider adding a timeout for HTTP requests.

http.Get uses the default HTTP client which has no timeout. In flaky network conditions, this could cause tests to hang indefinitely.

🔎 Proposed fix to add HTTP timeout
+	httpClient := &http.Client{
+		Timeout: 30 * time.Second,
+	}
+
 	if strings.HasPrefix(pathOrURL, "http://") || strings.HasPrefix(pathOrURL, "https://") {
 		// Download YAML content
-		resp, err := http.Get(pathOrURL)
+		resp, err := httpClient.Get(pathOrURL)
 		if err != nil {
 			return fmt.Errorf("failed to download manifest from %s: %w", pathOrURL, err)
 		}
test/e2e/util/cilium.go (4)

161-161: Use CiliumNamespace constant for consistency.

The constant CiliumNamespace is defined but a hardcoded string "cilium" is used here.

🔎 Proposed fix
-		err := guestClient.Get(ctx, crclient.ObjectKey{Name: "cilium"}, ciliumNs)
+		err := guestClient.Get(ctx, crclient.ObjectKey{Name: CiliumNamespace}, ciliumNs)

320-333: Consider adding parentheses for clarity.

The condition logic is correct due to Go's operator precedence (&& before ||), but explicit parentheses would improve readability.

🔎 Proposed fix
-				if ds.Name == "cilium" || strings.HasPrefix(ds.Name, "cilium-") && !strings.Contains(ds.Name, "operator") {
+				if ds.Name == "cilium" || (strings.HasPrefix(ds.Name, "cilium-") && !strings.Contains(ds.Name, "operator")) {

470-493: Fixed sleep duration may be fragile.

Using time.Sleep with a fixed duration doesn't adapt to varying cluster conditions. Consider polling for specific success indicators if the connectivity test provides them, or document why a fixed duration is necessary.


556-558: Redundant helper function - use ptr.To instead.

The ptrBool helper is redundant since k8s.io/utils/ptr is already imported (line 28) and ptr.To[bool](false) or simply ptr.To(false) can be used.

🔎 Proposed fix

Remove ptrBool function and update the call site:

-		AllowPrivilegeEscalation: ptrBool(false),
+		AllowPrivilegeEscalation: ptr.To(false),

And remove lines 556-558:

-func ptrBool(b bool) *bool {
-	return &b
-}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between c1d28e3 and 0fa952c.

📒 Files selected for processing (20)
  • test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
  • test/e2e/cilium_network_policy_test.go
  • test/e2e/e2e_test.go
  • test/e2e/util/cilium.go
  • test/e2e/util/hypershift_framework.go
  • test/e2e/util/options.go
  • test/e2e/util/util.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
  • test/e2e/util/options.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
  • test/e2e/util/hypershift_framework.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
  • test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
  • test/e2e/e2e_test.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
  • test/e2e/cilium_network_policy_test.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
  • test/e2e/util/util.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
  • test/e2e/util/cilium.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
🧬 Code graph analysis (4)
test/e2e/util/options.go (1)
api/hypershift/v1beta1/hostedcluster_types.go (6)
  • NonePlatform (1191-1191)
  • AWSPlatform (1188-1188)
  • KubevirtPlatform (1200-1200)
  • AzurePlatform (1203-1203)
  • PowerVSPlatform (1206-1206)
  • OpenStackPlatform (1209-1209)
test/e2e/util/hypershift_framework.go (5)
test/e2e/util/external_oidc.go (1)
  • ExtOIDCConfig (53-70)
api/hypershift/v1beta1/hostedcluster_types.go (3)
  • NonePlatform (1191-1191)
  • NetworkType (1166-1166)
  • Other (1179-1179)
support/util/visibility.go (1)
  • IsPrivateHC (34-44)
test/e2e/util/util.go (1)
  • WaitForGuestClient (325-373)
test/e2e/util/cilium.go (1)
  • InstallCilium (151-342)
test/e2e/cilium_network_policy_test.go (5)
api/hypershift/v1beta1/hostedcluster_types.go (1)
  • AzurePlatform (1203-1203)
test/e2e/util/hypershift_framework.go (1)
  • NewHypershiftTest (66-78)
support/azureutil/azureutil.go (1)
  • IsAroHCP (217-219)
test/e2e/util/util.go (1)
  • WaitForGuestClient (325-373)
test/e2e/util/cilium.go (1)
  • EnsureCiliumConnectivityTestResources (346-500)
test/e2e/util/cilium.go (3)
support/azureutil/azureutil.go (1)
  • IsAroHCP (217-219)
test/e2e/util/util.go (3)
  • ApplyYAML (4194-4264)
  • WaitForDeploymentAvailable (4163-4175)
  • WaitForDaemonSetReady (4178-4190)
support/util/util.go (1)
  • DeleteIfNeeded (142-144)
🪛 Checkov (3.2.334)

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml

  • [medium] lines 1-75: CKV_K8S_20 (containers should not run with allowPrivilegeEscalation), CKV_K8S_23 (minimize the admission of root containers), CKV_K8S_19 (containers should not share the host network namespace)

test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml

  • [medium] CKV_K8S_20 and CKV_K8S_23 on every workload block: lines 3-58, 59-115, 116-180, 181-236, 237-292, 293-338, 339-394, 395-450, 451-516, 517-582, 583-649, 650-716, 717-782, 783-848
  • [medium] CKV_K8S_19 (containers should not share the host network namespace) additionally on lines 116-180, 583-649, 650-716

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml

  • [medium] lines 1-82: CKV_K8S_49 (minimize wildcard use in Roles and ClusterRoles)

🪛 golangci-lint (2.5.0)
test/e2e/util/cilium.go

[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn

(typecheck)

🔇 Additional comments (26)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml (1)

1-13: LGTM!

The RoleBinding correctly binds the cilium-olm ServiceAccount to the cilium-olm Role within the cilium namespace. The manifest structure is valid and follows Kubernetes RBAC conventions.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml (1)

1-13: LGTM!

The RoleBinding correctly enables leader election for the cilium-olm ServiceAccount by binding it to the leader-election Role within the cilium namespace. The manifest structure is valid.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml (1)

1-12: LGTM! Note AI summary discrepancy.

The ClusterRoleBinding correctly references the ClusterRole cilium-cilium-olm (defined in cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml) and binds it to the cilium-olm ServiceAccount. The manifest structure is valid.

Note: The AI summary incorrectly states this binds ClusterRole "cilium-cilium", but the code correctly references "cilium-cilium-olm".

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml (1)

1-11: LGTM!

The Namespace configuration is appropriate for a system-level CNI namespace:

  • Empty node-selector allows flexible pod scheduling
  • Cluster logging and monitoring labels enable observability
  • Run-level "0" indicates this is an early-lifecycle system namespace

The manifest structure is valid and follows OpenShift conventions.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml (1)

14-28: This ClusterRole is sourced from the upstream isovalent/olm-for-cilium project and follows the expected pattern for Helm operators, which require broad RBAC permissions to orchestrate application deployment and lifecycle management. The broad CRUD permissions on RBAC resources (roles, rolebindings, clusterroles, clusterrolebindings) are necessary for cilium-olm to create and manage Cilium's own RBAC requirements during cluster setup. Since this is a test fixture derived from upstream, the scope and permissions are already vetted by the Cilium project.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml (1)

1-7: LGTM!

The ServiceAccount manifest follows standard Kubernetes conventions and is correctly structured for use by the cilium-olm deployment.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml (1)

1-14: LGTM!

The Service manifest is correctly structured with appropriate port mapping (443→9443) for the cilium-olm component. The selector properly targets pods with the name: cilium-olm label.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml (1)

1-24: LGTM!

The leader election Role follows standard Kubernetes patterns with appropriate permissions for managing ConfigMap-based leader election leases and creating events.

test/e2e/util/options.go (2)

94-95: LGTM!

The ExternalCNIProvider field is properly documented and follows the existing pattern for optional feature flags in the Options struct.


206-212: LGTM!

The ExternalCNIProvider field is correctly propagated to PlatformAgnosticOptions, maintaining consistency with how other platform-specific options are handled in DefaultClusterOptions.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml (1)

1-82: LGTM - Standard Cilium operator RBAC.

The wildcard permissions on cilium.io resources and CRDs are expected for a CNI operator that needs to manage its own custom resources. The remaining rules are appropriately scoped. This aligns with upstream Cilium installation patterns and is acceptable for e2e test infrastructure.

test/e2e/cilium_network_policy_test.go (1)

27-52: LGTM - Well-structured test with proper skip conditions and cleanup.

The test correctly:

  • Runs in parallel
  • Skips appropriately when not on Azure platform or when Cilium isn't the external CNI provider
  • Performs ARO HCP environment check inside the test harness
  • Defers cleanup of connectivity test resources
test/e2e/util/hypershift_framework.go (1)

51-52: LGTM - New fields extend platform options appropriately.

The ExternalCNIProvider field follows the existing pattern established by ExtOIDCConfig.

test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml (1)

1-41: LGTM - Standard CiliumConfig CRD definition.

The CRD follows proper structure with appropriate use of x-kubernetes-preserve-unknown-fields for the operator-managed spec and status fields.

test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml (2)

1-57: Note: Unusual YAML document ordering is valid but unconventional.

The apiVersion and kind fields appear at the end of each document rather than the beginning. While valid YAML (field order doesn't matter), this is atypical and may confuse readers. Since this is marked as auto-generated from Cilium upstream, modifying it may complicate future updates.


116-180: Host networking is intentional for connectivity test scenarios.

The hostNetwork: true in echo-b-host and similar deployments is required for testing host-to-pod and cross-node connectivity scenarios. The static analysis warnings about hostNetwork and security contexts can be safely ignored for these e2e test assets.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml (1)

1-12: LGTM - Properly binds ClusterRole to ServiceAccount.

The binding correctly references the cilium-cilium ClusterRole and the cilium-olm ServiceAccount in the cilium namespace, consistent with other manifests in this PR.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml (1)

68-72: Host networking and broad tolerations are appropriate for CNI operators.

The hostNetwork: true is required for CNI operators to configure the node's network stack. The broad toleration (operator: Exists) ensures the operator can run on control plane nodes if needed. Static analysis warnings can be safely ignored for this use case.

test/e2e/e2e_test.go (1)

93-93: LGTM!

The new CLI flag follows the established pattern for e2e test options and integrates cleanly with the existing flag definitions.

test/e2e/util/util.go (3)

4162-4175: LGTM!

The helper function follows the established gomega Eventually pattern used elsewhere in this file and provides a clean interface for waiting on deployment readiness.


4177-4190: LGTM!

The helper function correctly implements the wait pattern for DaemonSet readiness, consistent with WaitForDeploymentAvailable.


4220-4264: LGTM!

The YAML decoding and Server-Side Apply logic is well-implemented with proper handling of multi-document YAMLs, namespace scoping via RESTMapper, and clear error messages.

test/e2e/util/cilium.go (4)

33-63: LGTM!

Constants are well-organized with clear naming and documentation. Reading CiliumVersion from environment allows flexibility in CI configurations.


65-147: LGTM!

The manifest helper functions are well-structured with appropriate PodSecurity labels and SCC capabilities for Cilium. The local/remote manifest fallback pattern provides good flexibility for testing.


502-528: LGTM!

Good defensive coding with sensible defaults and graceful error handling. The IPv4 detection logic via strings.Contains(clusterNet.CIDR, ".") is simple and effective.
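The strings.Contains(clusterNet.CIDR, ".") heuristic praised above works for well-formed CIDRs; a slightly more defensive variant parses the CIDR first, which also rejects malformed input instead of misclassifying it. A sketch of that alternative (not the code under review):

```go
package main

import (
	"fmt"
	"net"
)

// isIPv4CIDR parses the CIDR and reports whether its network address is
// IPv4. Parsing first means malformed strings produce an error rather
// than being misclassified by a substring check.
func isIPv4CIDR(cidr string) (bool, error) {
	ip, _, err := net.ParseCIDR(cidr)
	if err != nil {
		return false, err
	}
	return ip.To4() != nil, nil
}

func main() {
	for _, c := range []string{"10.128.0.0/14", "fd01::/48", "not-a-cidr"} {
		ok, err := isIPv4CIDR(c)
		fmt.Println(c, ok, err)
	}
}
```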


560-629: LGTM!

The CreateCiliumConfig function creates a comprehensive Cilium configuration with appropriate settings for OpenShift integration. Debug mode being enabled is reasonable for e2e test observability.

Comment on lines +39 to +66
- apiGroups:
- ""
resources:
- secrets
verbs:
- '*'
- apiGroups:
- ""
resources:
- serviceaccounts
- configmaps
- secrets
- services
verbs:
- '*'
- apiGroups:
- apps
resources:
- deployments
- daemonsets
verbs:
- '*'
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
verbs:
- '*'
Contributor


⚠️ Potential issue | 🟠 Major

Review the use of wildcard RBAC permissions.

Multiple rules use wildcard verbs (*), granting full control over secrets, serviceaccounts, configmaps, services, deployments, daemonsets, and servicemonitors within the cilium namespace. While operators typically require elevated permissions, wildcards are overly permissive and violate the principle of least privilege.

Consider explicitly listing only the required verbs (e.g., get, list, watch, create, update, patch, delete) rather than using *. This reduces the attack surface if the operator is compromised.

Note: secrets appears in two separate rules (lines 42 and 50), which is redundant.

🤖 Prompt for AI Agents
In
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
around lines 39–66, several RBAC rules use wildcard verbs ("*") and secrets is
duplicated; replace each "*" with the minimal explicit verbs the operator
actually needs (for example get, list, watch, create, update, patch, delete as
applicable per resource) and remove the duplicate secrets entry by consolidating
permissions into a single rule for secrets; ensure each apiGroup/resources block
only lists the precise verbs required for those resources to follow
least-privilege.
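As a concrete illustration of the suggested least-privilege rewrite, the wildcard rules could be consolidated and narrowed along these lines. The verb set below is an assumption for illustration — the exact verbs cilium-olm needs would have to be confirmed against its actual behavior:

```yaml
# Consolidated rule for core resources with explicit verbs instead of '*'.
# NOTE: assumed verb list; trim to what cilium-olm actually uses. A single
# rule covering secrets also removes the duplicate secrets entry.
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  - configmaps
  - secrets
  - services
  verbs:
  - get
  - list
  - watch
  - create
  - update
  - patch
  - delete
```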

Comment on lines +150 to +155
guestClient := WaitForGuestClient(t, context.Background(), h.client, hostedCluster)
InstallCilium(t, context.Background(), guestClient, hostedCluster)
// wait hosted cluster ready
WaitForNReadyNodes(t, context.Background(), guestClient, opts.NodePoolReplicas, platform)
WaitForImageRollout(t, context.Background(), h.client, hostedCluster)
ValidateHostedClusterConditions(t, context.Background(), h.client, hostedCluster, true, 10*time.Minute)
Contributor


⚠️ Potential issue | 🟠 Major

Use h.ctx instead of context.Background() for proper cancellation propagation.

The test framework already has h.ctx which is properly configured with cancellation. Using context.Background() here means these operations won't respect test cancellation or timeouts, potentially leaving resources orphaned if the test is aborted.

🔎 Proposed fix
 			if opts.NodePoolReplicas == 0 {
 				t.Fatal("NodePool replicas must be positive for Cilium to install.")
 			}
-			guestClient := WaitForGuestClient(t, context.Background(), h.client, hostedCluster)
-			InstallCilium(t, context.Background(), guestClient, hostedCluster)
+			guestClient := WaitForGuestClient(t, h.ctx, h.client, hostedCluster)
+			InstallCilium(t, h.ctx, guestClient, hostedCluster)
 			// wait hosted cluster ready
-			WaitForNReadyNodes(t, context.Background(), guestClient, opts.NodePoolReplicas, platform)
-			WaitForImageRollout(t, context.Background(), h.client, hostedCluster)
-			ValidateHostedClusterConditions(t, context.Background(), h.client, hostedCluster, true, 10*time.Minute)
+			WaitForNReadyNodes(t, h.ctx, guestClient, opts.NodePoolReplicas, platform)
+			WaitForImageRollout(t, h.ctx, h.client, hostedCluster)
+			ValidateHostedClusterConditions(t, h.ctx, h.client, hostedCluster, true, 10*time.Minute)
 		}
🤖 Prompt for AI Agents
In test/e2e/util/hypershift_framework.go around lines 150 to 155, several calls
use context.Background() which bypasses the framework's cancellation; replace
each context.Background() with h.ctx (i.e., use h.ctx for WaitForGuestClient,
InstallCilium, WaitForNReadyNodes, WaitForImageRollout, and
ValidateHostedClusterConditions) so the operations inherit the test's context
and proper cancellation/timeouts.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Fix all issues with AI Agents 🤖
In @test/e2e/util/util.go:
- Around line 4198-4211: The HTTP download uses the default client (http.Get)
without a timeout which can hang; replace the http.Get(pathOrURL) call with a
dedicated http.Client having a reasonable Timeout (e.g., 10s) and call
client.Get(pathOrURL) instead, then keep the existing resp.Body.Close, status
code check and io.ReadAll handling; locate the download block around the
http.Get usage (the pathOrURL/http.Get/yamlContent section) and swap in the
timeout-capable client to ensure the request fails fast on unresponsive hosts.
♻️ Duplicate comments (1)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml (1)

39-66: Wildcard RBAC and duplicate secrets entry already flagged.

This concern was raised in a previous review. The wildcard verbs and duplicate secrets resource (Lines 42 and 50) should be addressed per the earlier feedback.

🧹 Nitpick comments (7)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml (1)

17-72: Consider adding a security context to restrict privilege escalation.

The deployment lacks a securityContext to explicitly disable privilege escalation. While hostNetwork: true and the broad toleration are necessary for CNI operators, you can still harden the container:

      containers:
        - name: operator
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true

Since this is test infrastructure for e2e validation, this is optional but would align with security best practices flagged by static analysis.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml (1)

5-17: Wildcard permissions on CRDs are broader than necessary.

The wildcard verbs (*) on cilium.io/* resources are acceptable for the Cilium operator. However, the wildcard on customresourcedefinitions (Lines 12-17) grants ability to create/delete any CRD in the cluster, not just Cilium's.

For test infrastructure this is acceptable, but in production environments consider restricting to specific CRD names or limiting verbs to what's actually needed (e.g., get, list, watch, create, update, patch).

test/e2e/util/util.go (2)

4162-4174: Consider strengthening the readiness check.

The current check only verifies that ReadyReplicas > 0, which may not ensure the Deployment is fully ready. If the Deployment has multiple replicas, this could return success prematurely.

🔎 Proposed improvement
 func WaitForDeploymentAvailable(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
 	g := NewWithT(t)
 	t.Logf("Waiting for deployment %s/%s to be ready", namespace, name)
 	g.Eventually(func() bool {
 		deployment := &appsv1.Deployment{}
 		err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, deployment)
 		if err != nil {
 			t.Logf("Failed to get deployment %s/%s: %v", namespace, name, err)
 			return false
 		}
-		return deployment.Status.ReadyReplicas > 0
+		// Check that all desired replicas are ready
+		return deployment.Status.ReadyReplicas > 0 && 
+		       deployment.Status.ReadyReplicas == deployment.Status.Replicas &&
+		       deployment.Status.ObservedGeneration >= deployment.Generation
 	}, timeout, interval).Should(BeTrue(), fmt.Sprintf("deployment %s/%s should be ready", namespace, name))
 }

4177-4189: Consider strengthening the DaemonSet readiness check.

The current check only verifies that NumberReady > 0, which doesn't ensure the DaemonSet has pods ready on all desired nodes.

🔎 Proposed improvement
 func WaitForDaemonSetReady(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
 	g := NewWithT(t)
 	t.Logf("Waiting for DaemonSet %s/%s to be ready", namespace, name)
 	g.Eventually(func() bool {
 		ds := &appsv1.DaemonSet{}
 		err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, ds)
 		if err != nil {
 			t.Logf("Failed to get DaemonSet %s/%s: %v", namespace, name, err)
 			return false
 		}
-		return ds.Status.NumberReady > 0
+		// Check that all desired pods are ready and the rollout is complete
+		return ds.Status.NumberReady > 0 &&
+		       ds.Status.NumberReady == ds.Status.DesiredNumberScheduled &&
+		       ds.Status.ObservedGeneration >= ds.Generation
 	}, timeout, interval).Should(BeTrue(), fmt.Sprintf("DaemonSet %s/%s should be ready", namespace, name))
 }
test/e2e/util/cilium.go (3)

259-307: Complex Eventually block with nested updates - consider simplifying.

The Eventually block handles the case where the Cilium operator might overwrite the CiliumConfig, but the nested update logic makes it harder to debug failures. Consider extracting this to a helper function or adding more structured logging.

This is not blocking, but could improve maintainability for debugging test failures.


470-493: Replace fixed sleep with polling for test completion.

The current implementation waits a fixed 60 seconds (line 474) and then only checks if pods are still running. This approach has limitations:

  1. Fixed waits are brittle - tests might pass/fail based on timing
  2. Checking only pod phase misses container-level failures (CrashLoopBackOff, errors)
  3. Could result in false positives if connectivity tests are still running

Consider polling for test completion status instead (e.g., checking container termination status or using a Job-based approach) and examining container logs for test results.


556-558: Use existing pointer helper instead of custom function.

The repository already uses k8s.io/utils/ptr (imported as ptr in other files). Consider using ptr.To[bool](b) instead of maintaining a custom ptrBool function.

🔎 Proposed change
-func ptrBool(b bool) *bool {
-	return &b
-}

And update the usage at line 386:

-			AllowPrivilegeEscalation: ptrBool(false),
+			AllowPrivilegeEscalation: ptr.To(false),
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 0fa952c and 609cbef.

📒 Files selected for processing (20)
  • test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
  • test/e2e/cilium_network_policy_test.go
  • test/e2e/e2e_test.go
  • test/e2e/util/cilium.go
  • test/e2e/util/hypershift_framework.go
  • test/e2e/util/options.go
  • test/e2e/util/util.go
🚧 Files skipped from review as they are similar to previous changes (10)
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
  • test/e2e/e2e_test.go
  • test/e2e/util/options.go
  • test/e2e/util/hypershift_framework.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/cilium_network_policy_test.go
  • test/e2e/util/util.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
  • test/e2e/util/cilium.go
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
  • test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
  • test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
🧬 Code graph analysis (1)
test/e2e/util/cilium.go (2)
support/azureutil/azureutil.go (1)
  • IsAroHCP (217-219)
test/e2e/util/util.go (3)
  • ApplyYAML (4194-4264)
  • WaitForDeploymentAvailable (4163-4175)
  • WaitForDaemonSetReady (4178-4190)
🪛 Checkov (3.2.334)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml

[medium] 1-75: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 1-75: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 1-75: Containers should not share the host network namespace

(CKV_K8S_19)

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml

[medium] 1-82: Minimize wildcard use in Roles and ClusterRoles

(CKV_K8S_49)

test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml

[medium] 3-58: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 3-58: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 59-115: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 59-115: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 116-180: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 116-180: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 116-180: Containers should not share the host network namespace

(CKV_K8S_19)


[medium] 181-236: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 181-236: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 237-292: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 237-292: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 293-338: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 293-338: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 339-394: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 339-394: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 395-450: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 395-450: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 451-516: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 451-516: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 517-582: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 517-582: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 583-649: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 583-649: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 583-649: Containers should not share the host network namespace

(CKV_K8S_19)


[medium] 650-716: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 650-716: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 650-716: Containers should not share the host network namespace

(CKV_K8S_19)


[medium] 717-782: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 717-782: Minimize the admission of root containers

(CKV_K8S_23)


[medium] 783-848: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 783-848: Minimize the admission of root containers

(CKV_K8S_23)

🪛 golangci-lint (2.5.0)
test/e2e/util/cilium.go

[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn

(typecheck)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Red Hat Konflux / hypershift-release-mce-211-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-operator-main-on-pull-request
  • GitHub Check: Red Hat Konflux / hypershift-cli-mce-211-on-pull-request
🔇 Additional comments (7)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml (1)

1-13: LGTM!

Standard RoleBinding correctly binding the cilium-olm Role to the cilium-olm ServiceAccount in the cilium namespace.

test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml (1)

1-41: LGTM!

Well-structured CRD definition for CiliumConfig with appropriate x-kubernetes-preserve-unknown-fields for flexible operator configuration and status subresource enabled.

test/e2e/cilium_network_policy_test.go (2)

43-51: LGTM!

Good test structure with proper resource cleanup via defer. The nested ARO HCP check correctly differentiates between generic Azure and ARO HCP-specific requirements.


17-25: No changes needed. The comment accurately describes the test implementation. EnsureCiliumConnectivityTestResources internally checks Cilium pod readiness as its first step via the CheckCiliumPodsRunning subtest before proceeding with subsequent setup operations. The documented steps align with the actual execution flow.

test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml (1)

1-2: Auto-generated test manifest - static analysis findings are expected.

This is a standard Cilium connectivity test suite auto-generated by Makefile. The hostNetwork: true configurations and lack of explicit security contexts are intentional for testing various networking scenarios (host-to-pod, cross-node, nodeport access). The static analysis warnings about privilege escalation and host networking are false positives in this context.

test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml (1)

1-12: LGTM!

Standard ClusterRoleBinding manifest that correctly binds the cilium-cilium ClusterRole to the cilium-olm ServiceAccount in the cilium namespace.

test/e2e/util/cilium.go (1)

81-114: LGTM - Appropriately permissive SCC for CNI workload.

The SecurityContextConstraints are correctly configured for a CNI provider like Cilium, which requires elevated privileges (host network, privileged containers, etc.). The permissions are scoped to specific service accounts in the cilium namespace.

Comment thread test/e2e/util/util.go Outdated
@wewang58
Contributor Author

wewang58 commented Jan 6, 2026

@jparrill @mgencur @heliubj18 I submitted a new PR to make it easier to review. It is almost the same as the old PR; I just moved some templates to the assets directory. Can you review it again? Thanks.

@mgencur
Contributor

mgencur commented Jan 8, 2026

The AKS test fails with:

failed to wait for DaemonSet ovnkube-node to be ready: context deadline exceeded

(waits for OVN but this is Cilium)

Contributor

@mgencur mgencur left a comment

Hi @wewang58 , a few comments here and there. Thanks

Comment thread test/e2e/cilium_network_policy_test.go Outdated
t.Skip("Skipping test because it requires Azure platform")
}

if globalOpts.ExternalCNIProvider != "cilium" {
Contributor

Can you extract "cilium" as a const and reuse it in the other places? It's used a few times in this PR.

Contributor Author

@mgencur Thanks for reviewing. I will update the other places and test it.

Contributor Author

@mgencur Could you take another look at my updated feature code? If anything else needs updating, feel free to add comments. Thanks a lot for reviewing.

Comment thread test/e2e/e2e_test.go Outdated
flag.BoolVar(&globalOpts.HOInstallationOptions.EnableCIDebugOutput, "e2e.ho-enable-ci-debug-output", false, "Install the HyperShift Operator with extra CI debug output enabled. This is a HyperShift Operator installation option")
flag.StringVar(&globalOpts.HOInstallationOptions.PlatformMonitoring, "e2e.platform-monitoring", "All", "The option for enabling platform cluster monitoring when installing the HyperShift Operator. Valid values are: None, OperatorOnly, All. This is a HyperShift Operator installation option")
flag.BoolVar(&globalOpts.RunUpgradeTest, "upgrade.run-tests", false, "Run HyperShift Operator upgrade test")
flag.StringVar(&globalOpts.ExternalCNIProvider, "e2e.external-cni-provider", "", "the option supported third cni provider: cilium, calico")
Contributor

Does it make sense to mention "calico" here? Maybe something like this would be better: The option supports the following CNI providers: cilium
...and later we could add more if necessary.

Comment thread test/e2e/util/cilium.go Outdated
CiliumDefaultTimeout = 10 * time.Minute
CiliumLongTimeout = 20 * time.Minute
CiliumShortTimeout = 2 * time.Minute
CiliumDefaultPollInterval = 10 * time.Second
Contributor

On the other PR, @jparrill was requesting those variables to be package private (AFAIK). #7077 (comment)
I'm just bringing this up here to speed up the review loop.

Comment thread test/e2e/util/cilium.go Outdated
if _, err := os.Stat(localPath); err == nil {
manifests = append(manifests, localPath)
} else {
manifests = append(manifests, fmt.Sprintf("%s/%s", remoteBase, f))
Contributor

I suppose you included the Cilium YAML files in this PR to avoid downloading them every time (e.g. for disconnected envs). Since we now have the files locally, why keep this fallback that downloads them?
It is unlikely that files would be accidentally removed from the repo.

func (h *hypershiftTest) before(hostedCluster *hyperv1.HostedCluster, opts *PlatformAgnosticOptions, platform hyperv1.PlatformType) {
h.Run("ValidateHostedCluster", func(t *testing.T) {
if platform != hyperv1.NonePlatform {
if platform != hyperv1.NonePlatform && hostedCluster.Spec.Networking.NetworkType != hyperv1.Other {
Contributor

This is a related comment on the PR that will trigger the test suite: https://github.com/openshift/release/pull/71383/files#r2672253127 . PTAL. I think it's important to set --network-type=Other there.

Contributor Author

You mean that if EXTERNAL_CNI_PROVIDER is not null, we should add --e2e.external-cni-provider and --network-type=Other in https://github.com/openshift/release/pull/71383/files#diff-36e084ed3de5a17840678199ccb286a3ff6b2e47b9b6ebbbf011a26e43932f5cL106, right? I will update that PR.

Contributor

Yeah. This is what I mean.

@wewang58
Contributor Author

wewang58 commented Jan 9, 2026

The AKS test fails with:

failed to wait for DaemonSet ovnkube-node to be ready: context deadline exceeded

(waits for OVN but this is Cilium)

It should not be related to my PR. Although I added the Cilium code, it did not actually install the Cilium CNI; you can see in the test log that --e2e.external-cni-provider="cilium" was not passed in the test run.

My CI PR openshift/release#71383 depends on this PR.

@openshift-ci openshift-ci Bot added the area/platform/azure PR/issue for Azure (AzurePlatform) platform label Jan 13, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
test/e2e/util/hypershift_framework.go (1)

132-168: Cilium pre-install hook: fix context usage + avoid exact-node-count hang + don’t Fatal on NodePoolReplicas==0.

  • context.Background() drops cancellation/timeouts from the test run.
  • WaitForNReadyNodes requires an exact node count; using opts.NodePoolReplicas can hang if actual node count differs.
  • Comment says NodePoolReplicas can be 0 (e.g., TestNodePool), but current code hard-fails.
Proposed fix
@@
 func (h *hypershiftTest) before(hostedCluster *hyperv1.HostedCluster, opts *PlatformAgnosticOptions, platform hyperv1.PlatformType) {
 	h.Run("ValidateHostedCluster", func(t *testing.T) {
+		ctx := t.Context()
 		if platform != hyperv1.NonePlatform && hostedCluster.Spec.Networking.NetworkType != hyperv1.Other {
 			if opts.AWSPlatform.EndpointAccess == string(hyperv1.Private) {
 				ValidatePrivateCluster(t, h.ctx, h.client, hostedCluster, opts)
 			} else {
 				ValidatePublicCluster(t, h.ctx, h.client, hostedCluster, opts)
 			}
@@
 		if opts.ExternalCNIProvider == CiliumCNIProvider {
@@
 			if !util.IsPrivateHC(hostedCluster) {
-				if opts.NodePoolReplicas == 0 {
-					t.Fatal("NodePool replicas must be positive for Cilium to install.")
-				}
+				if opts.NodePoolReplicas == 0 {
+					t.Log("Skipping Cilium install during initial validation: no worker nodes configured yet (NodePoolReplicas=0).")
+					return
+				}
 				if h.assetReader == nil {
 					t.Fatal("AssetReader is required for Cilium installation. Call WithAssetReader() on the test instance.")
 				}
-				guestClient := WaitForGuestClient(t, context.Background(), h.client, hostedCluster)
-				InstallCilium(t, context.Background(), guestClient, hostedCluster, h.assetReader)
+				guestClient := WaitForGuestClient(t, ctx, h.client, hostedCluster)
+				InstallCilium(t, ctx, guestClient, hostedCluster, h.assetReader)
+
+				// Match node counting behavior used elsewhere (avoid hanging on "got X, want Y").
+				expectedNodes := opts.NodePoolReplicas
+				if platform == hyperv1.AWSPlatform && len(opts.AWSPlatform.Zones) > 0 {
+					expectedNodes = expectedNodes * int32(len(opts.AWSPlatform.Zones))
+				}
 				// wait hosted cluster ready
-				WaitForNReadyNodes(t, context.Background(), guestClient, opts.NodePoolReplicas, platform)
-				WaitForImageRollout(t, context.Background(), h.client, hostedCluster)
-				ValidateHostedClusterConditions(t, context.Background(), h.client, hostedCluster, true, 10*time.Minute)
+				WaitForNReadyNodes(t, ctx, guestClient, expectedNodes, platform)
+				WaitForImageRollout(t, ctx, h.client, hostedCluster)
+				ValidateHostedClusterConditions(t, ctx, h.client, hostedCluster, true, 10*time.Minute)
 			}
 
 		}
 	})
 }
test/e2e/azure_scheduler_test.go (1)

42-59: Verify import availability for content.ReadFile call.
Ensure the content package is imported in the file; if not, add the appropriate import used by other e2e tests calling WithAssetReader(content.ReadFile).

🤖 Fix all issues with AI agents
In @test/e2e/util/cilium.go:
- Around line 33-48: The code uses CiliumVersion (env) to build asset paths via
MustAsset which will panic if the asset path is invalid; update places that
construct assets (any code calling MustAsset or building paths with
CiliumVersion, e.g., the functions/blocks around the current top-level
CiliumVersion var and the usages later at lines ~137-205) to first guard that
CiliumVersion != "" and fail fast or skip tests when empty; specifically, before
any call to MustAsset or any asset path construction that interpolates
CiliumVersion, add a check that returns an error/skips the test (or uses a safe
lookup that returns an error) instead of calling MustAsset directly so you avoid
a panic.
- Around line 334-486: In EnsureCiliumConnectivityTestResources inside the
"WaitForConnectivityTestCompletion" block update the pod phase validation (the
loop iterating over podList.Items) so it does not treat completed Job pods as
failures: change the failure condition that currently checks pod.Status.Phase !=
corev1.PodRunning to accept corev1.PodSucceeded as a valid terminal success
(e.g., treat pods with Phase Running OR Succeeded as OK), or alternatively skip
pods owned by Job resources; adjust the construction of failedPods accordingly
so only pods not in Running or Succeeded are reported as failures.

In @test/e2e/util/util.go:
- Around line 4162-4190: The current wait helpers only check for >0 readiness
which can hide partial rollouts; update WaitForDeploymentAvailable to fetch the
Deployment, compute desiredReplicas := int32(1) or use *deployment.Spec.Replicas
if set, and assert deployment.Status.ObservedGeneration >= deployment.Generation
AND deployment.Status.AvailableReplicas == desiredReplicas (or ReadyReplicas ==
desiredReplicas), returning true only when fully available; similarly update
WaitForDaemonSetReady to assert ds.Status.ObservedGeneration >= ds.Generation
AND ds.Status.NumberReady == ds.Status.DesiredNumberScheduled (or
NumberAvailable == DesiredNumberScheduled) so partial readiness fails; include
the error and status details in the t.Logf messages for easier debugging and
keep the NewWithT(...).Eventually checks but replace the >0 checks with these
full-availability conditions.
- Around line 4192-4221: The ApplyYAML function uses http.Get which ignores the
provided ctx and has no timeout; change ApplyYAML to create an HTTP request
bound to the passed ctx (e.g., http.NewRequestWithContext(ctx, "GET", pathOrURL,
nil)) and use an http.Client with a reasonable Timeout (or rely on ctx
cancellation) to perform the request, then read resp.Body as before and still
defer resp.Body.Close(); keep the existing error handling for non-200 responses
and ensure the request uses ctx so CI cannot hang indefinitely.
🧹 Nitpick comments (3)
test/e2e/util/hypershift_framework.go (1)

42-54: New ExternalCNIProvider option is fine, but please document/enforce supported values.
Right now anything other than "cilium" is silently treated as “off”; consider validating/normalizing input at option parsing time to avoid typos creating confusing runs.

test/e2e/util/util.go (1)

4223-4269: Namespace defaulting can use client.IsObjectNamespaced(...) (less RESTMapper plumbing).
This reduces per-object RESTMapping calls and keeps the code aligned with controller-runtime’s client API.

test/e2e/util/cilium.go (1)

542-615: ptrBool helper is redundant with k8s.io/utils/ptr.
Optional cleanup: replace ptrBool(false) with ptr.To(false) and drop the helper.
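Validating the flag value at parse time, as suggested for `ExternalCNIProvider` above, could look roughly like this; the helper name and the allowed set are assumptions for illustration, not the PR's actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// validateExternalCNIProvider normalizes the flag value and rejects anything
// outside the supported set, so a typo fails fast instead of silently
// disabling the Cilium code path.
func validateExternalCNIProvider(v string) (string, error) {
	v = strings.ToLower(strings.TrimSpace(v))
	switch v {
	case "", "cilium":
		return v, nil
	default:
		return "", fmt.Errorf("unsupported external CNI provider %q (supported: cilium)", v)
	}
}

func main() {
	p, err := validateExternalCNIProvider(" Cilium ")
	fmt.Println(p, err == nil)
	_, err = validateExternalCNIProvider("cillium") // typo is caught, not ignored
	fmt.Println(err != nil)
}
```

Running the check right after flag parsing turns a misspelled provider into an immediate, descriptive failure rather than a run where the Cilium path quietly never executes.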

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 609cbef and 239ca93.

📒 Files selected for processing (9)
  • test/e2e/autoscaling_test.go
  • test/e2e/azure_scheduler_test.go
  • test/e2e/cilium_network_policy_test.go
  • test/e2e/control_plane_upgrade_test.go
  • test/e2e/create_cluster_test.go
  • test/e2e/e2e_test.go
  • test/e2e/util/cilium.go
  • test/e2e/util/hypershift_framework.go
  • test/e2e/util/util.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/e2e_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/create_cluster_test.go
  • test/e2e/cilium_network_policy_test.go
  • test/e2e/azure_scheduler_test.go
  • test/e2e/control_plane_upgrade_test.go
  • test/e2e/util/hypershift_framework.go
  • test/e2e/util/util.go
  • test/e2e/autoscaling_test.go
  • test/e2e/util/cilium.go
🧬 Code graph analysis (7)
test/e2e/create_cluster_test.go (1)
karpenter-operator/controllers/karpenter/assets/assets.go (1)
  • ReadFile (23-25)
test/e2e/cilium_network_policy_test.go (3)
test/e2e/util/cilium.go (2)
  • CiliumCNIProvider (40-40)
  • EnsureCiliumConnectivityTestResources (336-486)
test/e2e/util/hypershift_framework.go (1)
  • NewHypershiftTest (68-80)
support/azureutil/azureutil.go (1)
  • IsAroHCP (217-219)
test/e2e/azure_scheduler_test.go (2)
karpenter-operator/controllers/karpenter/assets/assets.go (1)
  • ReadFile (23-25)
contrib/gomaxprocs-webhook/cmd/root.go (1)
  • Execute (29-36)
test/e2e/control_plane_upgrade_test.go (2)
karpenter-operator/controllers/karpenter/assets/assets.go (1)
  • ReadFile (23-25)
contrib/gomaxprocs-webhook/cmd/root.go (1)
  • Execute (29-36)
test/e2e/util/hypershift_framework.go (3)
support/assets/readasset.go (1)
  • AssetReader (17-17)
test/e2e/util/cilium.go (2)
  • CiliumCNIProvider (40-40)
  • InstallCilium (139-332)
test/e2e/util/util.go (1)
  • WaitForGuestClient (325-373)
test/e2e/autoscaling_test.go (1)
karpenter-operator/controllers/karpenter/assets/assets.go (1)
  • ReadFile (23-25)
test/e2e/util/cilium.go (1)
support/azureutil/azureutil.go (1)
  • IsAroHCP (217-219)
🪛 golangci-lint (2.5.0)
test/e2e/util/cilium.go

[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn

(typecheck)

🔇 Additional comments (7)
test/e2e/util/hypershift_framework.go (1)

57-85: WithAssetReader(...) is a clean injection point.
This keeps asset loading out of global state and makes tests easier to compose.

test/e2e/util/cilium.go (1)

217-330: Likely an incorrect or invalid review comment.

test/e2e/cilium_network_policy_test.go (1)

5-52: The referenced file and packages cannot be verified from the available sources; the developer should verify against the actual branch/repository to confirm the build break and validate the suggested fixes.

test/e2e/autoscaling_test.go (1)

79-79: LGTM!

The asset reader is correctly wired into the test builder chain, consistent with the broader refactoring pattern across e2e tests.

test/e2e/create_cluster_test.go (2)

2393-2394: LGTM!

The asset reader integration follows the established pattern. The line break improves readability.


2623-2623: LGTM!

Consistent with the asset reader pattern applied across the e2e test suite.

test/e2e/control_plane_upgrade_test.go (1)

76-76: LGTM!

The asset reader is properly integrated into the control plane upgrade test, maintaining consistency with other e2e tests in this PR.

Comment thread test/e2e/util/cilium.go
Comment on lines +33 to +48
var (
// CiliumVersion is read from CILIUM_VERSION at runtime. When empty, the Cilium tests should skip.
CiliumVersion = os.Getenv("CILIUM_VERSION")
)

const (
// CiliumCNIProvider is the name of the Cilium CNI provider.
CiliumCNIProvider = "cilium"
// Generic timeouts and intervals for Cilium tests
CiliumDefaultTimeout = 10 * time.Minute
CiliumLongTimeout = 20 * time.Minute
CiliumShortTimeout = 2 * time.Minute
CiliumDefaultPollInterval = 10 * time.Second
CiliumLongPollInterval = 15 * time.Second
CiliumConnectivityWaitDuration = 60 * time.Second
)

⚠️ Potential issue | 🔴 Critical

Guard CiliumVersion (env) before building asset paths; MustAsset will panic otherwise.

Proposed fix
 func InstallCilium(t *testing.T, ctx context.Context, guestClient crclient.Client, hostedCluster *hyperv1.HostedCluster, reader assets.AssetReader) {
 	t.Run("InstallCilium", func(t *testing.T) {
+		if CiliumVersion == "" {
+			t.Skip("CILIUM_VERSION is not set; skipping Cilium installation")
+		}
+		if reader == nil {
+			t.Fatalf("asset reader is required for Cilium installation")
+		}
 		if !azureutil.IsAroHCP() {
 			t.Skip("test only supported on ARO HCP clusters")
 		}
@@
 			for _, filename := range manifestFiles {
 				manifestPath := fmt.Sprintf("assets/cilium/v%s/%s", CiliumVersion, filename)
 				t.Logf("Applying manifest from %s", manifestPath)
 				yamlContent := assets.MustAsset(reader, manifestPath)
 				err := ApplyYAMLBytes(ctx, guestClient, yamlContent)
 				g.Expect(err).ToNot(HaveOccurred(), "failed to apply manifest from %s", manifestPath)
 			}

Also applies to: 137-205

🤖 Prompt for AI Agents
In @test/e2e/util/cilium.go around lines 33 - 48, The code uses CiliumVersion
(env) to build asset paths via MustAsset which will panic if the asset path is
invalid; update places that construct assets (any code calling MustAsset or
building paths with CiliumVersion, e.g., the functions/blocks around the current
top-level CiliumVersion var and the usages later at lines ~137-205) to first
guard that CiliumVersion != "" and fail fast or skip tests when empty;
specifically, before any call to MustAsset or any asset path construction that
interpolates CiliumVersion, add a check that returns an error/skips the test (or
uses a safe lookup that returns an error) instead of calling MustAsset directly
so you avoid a panic.

Comment thread test/e2e/util/cilium.go
Comment on lines +334 to +486
// EnsureCiliumConnectivityTestResources performs the Cilium connectivity tests.
// It returns a cleanup function.
func EnsureCiliumConnectivityTestResources(t *testing.T, ctx context.Context, guestClient crclient.Client, reader assets.AssetReader) func() {
t.Run("CheckCiliumPodsRunning", func(t *testing.T) {
g := NewWithT(t)

t.Log("Check cilium pods to be running")
g.Eventually(func(g Gomega) {
podList := &corev1.PodList{}
err := guestClient.List(ctx, podList, crclient.InNamespace(ciliumNamespace))
g.Expect(err).NotTo(HaveOccurred(), "failed to list Cilium pods")

g.Expect(podList.Items).NotTo(BeEmpty(), "no Cilium pods found yet")
t.Logf("Found %d Cilium pods", len(podList.Items))

// Check all pods are Ready
g.Expect(podList.Items).To(HaveEach(
HaveField("Status.Conditions", ContainElement(
And(
HaveField("Type", corev1.PodReady),
HaveField("Status", corev1.ConditionTrue),
),
)),
), "not all Cilium pods are ready")
}, CiliumLongTimeout, CiliumDefaultPollInterval).Should(Succeed(), "all Cilium pods should be running")

t.Log("All Cilium pods are running")
})

t.Run("CreateSecurityContextConstraints", func(t *testing.T) {
g := NewWithT(t)

t.Log("Creating SecurityContextConstraints for Cilium connectivity test")
scc := &securityv1.SecurityContextConstraints{
ObjectMeta: metav1.ObjectMeta{
Name: "cilium-test",
},
AllowHostPorts: true,
AllowHostNetwork: true,
AllowHostDirVolumePlugin: false,
AllowHostIPC: false,
AllowHostPID: false,
AllowPrivilegeEscalation: ptrBool(false),
AllowPrivilegedContainer: false,
ReadOnlyRootFilesystem: false,
RequiredDropCapabilities: []corev1.Capability{},
RunAsUser: securityv1.RunAsUserStrategyOptions{Type: securityv1.RunAsUserStrategyMustRunAsRange},
SELinuxContext: securityv1.SELinuxContextStrategyOptions{Type: securityv1.SELinuxStrategyMustRunAs},
Users: []string{fmt.Sprintf("system:serviceaccount:%s:%s", ciliumTestNamespace, ciliumTestServiceAccount)},
}

err := guestClient.Create(ctx, scc)
if err != nil && !apierrors.IsAlreadyExists(err) {
g.Expect(err).ToNot(HaveOccurred(), "failed to create SCC")
}

t.Log("SecurityContextConstraints created successfully")
})

t.Run("CreateTestNamespace", func(t *testing.T) {
g := NewWithT(t)

t.Log("Creating cilium-test namespace")
ns := &corev1.Namespace{
ObjectMeta: metav1.ObjectMeta{
Name: ciliumTestNamespace,
Labels: map[string]string{"security.openshift.io/scc.podSecurityLabelSync": "false",
"pod-security.kubernetes.io/enforce": "privileged",
"pod-security.kubernetes.io/audit": "privileged",
"pod-security.kubernetes.io/warn": "privileged",
},
},
}

err := guestClient.Create(ctx, ns)
if err != nil && !apierrors.IsAlreadyExists(err) {
g.Expect(err).ToNot(HaveOccurred(), "failed to create namespace")
}

t.Log("Test namespace created successfully")
})

t.Run("DeployCiliumConnectivityTest", func(t *testing.T) {
g := NewWithT(t)

t.Logf("Deploying Cilium connectivity test from version %s", CiliumVersion)

manifestPath := fmt.Sprintf("assets/cilium/v%s/connectivity-check.yaml", CiliumVersion)
yamlContent := assets.MustAsset(reader, manifestPath)

err := ApplyYAMLBytes(ctx, guestClient, yamlContent, ciliumTestNamespace)
g.Expect(err).ToNot(HaveOccurred(), "failed to apply connectivity test manifest")

t.Log("Connectivity test manifests applied successfully")
})

t.Run("WaitForConnectivityTestPodsReady", func(t *testing.T) {
g := NewWithT(t)

t.Log("Waiting for connectivity test pods to be ready")
g.Eventually(func(g Gomega) {
podList := &corev1.PodList{}
err := guestClient.List(ctx, podList, crclient.InNamespace(ciliumTestNamespace))
g.Expect(err).NotTo(HaveOccurred(), "failed to list pods")
g.Expect(podList.Items).NotTo(BeEmpty(), "no pods found in cilium-test namespace yet")

t.Logf("Found %d pods in cilium-test namespace", len(podList.Items))

// Check all pods are Ready
g.Expect(podList.Items).To(HaveEach(
HaveField("Status.Conditions", ContainElement(
And(
HaveField("Type", corev1.PodReady),
HaveField("Status", corev1.ConditionTrue),
),
)),
), "some pods are not ready")
}, CiliumDefaultTimeout, CiliumLongPollInterval).Should(Succeed(), "all connectivity test pods should be ready")

t.Log("All connectivity test pods are ready")
})

t.Run("WaitForConnectivityTestCompletion", func(t *testing.T) {
g := NewWithT(t)

t.Logf("Waiting %v for connectivity tests to run", CiliumConnectivityWaitDuration)
time.Sleep(CiliumConnectivityWaitDuration)

t.Log("Verifying all test pods are still running")
podList := &corev1.PodList{}
err := guestClient.List(ctx, podList, crclient.InNamespace(ciliumTestNamespace))
g.Expect(err).ToNot(HaveOccurred(), "should be able to list test pods")

failedPods := []string{}
for _, pod := range podList.Items {
if pod.Status.Phase != corev1.PodRunning {
failedPods = append(failedPods, fmt.Sprintf("%s (phase: %s)", pod.Name, pod.Status.Phase))
}
}

if len(failedPods) > 0 {
t.Errorf("Found %d pods not in Running phase: %v", len(failedPods), failedPods)
} else {
t.Logf("All %d connectivity test pods are running successfully", len(podList.Items))
}
}) // Closes the anonymous function for "WaitForConnectivityTestCompletion"

t.Log("Cilium connectivity test completed successfully")

return func() {
CleanupCiliumConnectivityTestResources(ctx, t, guestClient)
}
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find the connectivity-check.yaml file referenced in the code
find . -name "connectivity-check.yaml" -type f 2>/dev/null

Repository: openshift/hypershift

Length of output: 184


🌐 Web query:

Cilium connectivity-check.yaml manifest Jobs Pods expected behavior

💡 Result:

Summary of what connectivity-check.yaml does and the expected behavior

  • What the manifest deploys

    • A dedicated test namespace (commonly cilium-test) plus multiple Pods/Deployments that act as servers, clients, and host-netns helpers, Services for those servers, and CiliumNetworkPolicy objects used by the tests. It also creates Kubernetes Jobs (or test controller Pods) that orchestrate the connectivity scenarios and verify results. [1][2]
  • Expected runtime behavior

    • Server Pods start and become Running and Ready (1/1). Services become addressable. Client Pods are created and attempt connections (curl, ping) to the servers across the different scenarios (pod-to-pod, pod-to-service, inter-node, encryption, etc.). Host-netns/helper Pods run tcpdump/host checks where needed. The test Jobs run the sequence of checks and complete with success if all checks pass; their Pods exit successfully (Job completes). If all tests pass, you should see all test Pods Running/Ready during checks and Jobs showing Completed. [1][2][3]
  • How to interpret results

    • Jobs Completed (exit code 0) = connectivity scenarios passed.
    • Failed Jobs or Pods in CrashLoop/Failed = specific scenario failures (look at Job/Pod logs for failing curl/ping/tcpdump actions).
    • If Pods never become Ready or tcpdump/host-netns fails, the test will fail. Typical causes: network policies present in the namespace, Cilium not managing pod networking (pods started before Cilium), missing node-to-node ports/firewall rules, or cluster-wide policies interfering. [1][2][4]
  • Useful debugging steps

    • Run in an isolated namespace with no other NetworkPolicy/Cilium policies.
    • kubectl -n get pods, get jobs, kubectl logs and kubectl describe on failing pods/jobs.
    • Check Cilium agent logs and that node-to-node ports (e.g., VXLAN/Geneve port) and ICMP are allowed if required. Use cilium connectivity test (CLI) for higher-level orchestration which also collects debugging info. [1][3][4]

Sources
[1] Cilium docs — Connectivity tests / troubleshooting.
[2] Cilium example connectivity-check/connectivity-check.yaml (examples & description).
[3] cilium connectivity test CLI docs.
[4] Troubleshooting notes about common failures (Cilium docs).


Connectivity test check incorrectly fails on completed Jobs.
The connectivity-check.yaml manifest deploys test Jobs that complete successfully (Succeeded phase) after all checks pass. However, the code at lines 434–443 reports any pod not in Running phase as a failure, which will incorrectly flag successfully completed test Jobs as failures. Update the pod-phase check to accept Running, Succeeded, or completed states as valid outcomes, or exclude completed Jobs from the failure check.

🤖 Prompt for AI Agents
In @test/e2e/util/cilium.go around lines 334 - 486, In
EnsureCiliumConnectivityTestResources inside the
"WaitForConnectivityTestCompletion" block update the pod phase validation (the
loop iterating over podList.Items) so it does not treat completed Job pods as
failures: change the failure condition that currently checks pod.Status.Phase !=
corev1.PodRunning to accept corev1.PodSucceeded as a valid terminal success
(e.g., treat pods with Phase Running OR Succeeded as OK), or alternatively skip
pods owned by Job resources; adjust the construction of failedPods accordingly
so only pods not in Running or Succeeded are reported as failures.

Comment thread test/e2e/util/util.go Outdated
Comment on lines +4162 to +4190
// WaitForDeploymentAvailable waits for a deployment to be ready.
func WaitForDeploymentAvailable(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
g := NewWithT(t)
t.Logf("Waiting for deployment %s/%s to be ready", namespace, name)
g.Eventually(func() bool {
deployment := &appsv1.Deployment{}
err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, deployment)
if err != nil {
t.Logf("Failed to get deployment %s/%s: %v", namespace, name, err)
return false
}
return deployment.Status.ReadyReplicas > 0
}, timeout, interval).Should(BeTrue(), fmt.Sprintf("deployment %s/%s should be ready", namespace, name))
}

// WaitForDaemonSetReady waits for a DaemonSet to be ready.
func WaitForDaemonSetReady(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
g := NewWithT(t)
t.Logf("Waiting for DaemonSet %s/%s to be ready", namespace, name)
g.Eventually(func() bool {
ds := &appsv1.DaemonSet{}
err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, ds)
if err != nil {
t.Logf("Failed to get DaemonSet %s/%s: %v", namespace, name, err)
return false
}
return ds.Status.NumberReady > 0
}, timeout, interval).Should(BeTrue(), fmt.Sprintf("DaemonSet %s/%s should be ready", namespace, name))
}

⚠️ Potential issue | 🟠 Major

Wait helpers are too weak (> 0 readiness) and can mask partial rollouts.
These helpers are used as “readiness gates”; they should typically assert full availability (observed generation, desired vs ready).

Proposed fix
 func WaitForDeploymentAvailable(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
 	g := NewWithT(t)
 	t.Logf("Waiting for deployment %s/%s to be ready", namespace, name)
 	g.Eventually(func() bool {
 		deployment := &appsv1.Deployment{}
 		err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, deployment)
 		if err != nil {
 			t.Logf("Failed to get deployment %s/%s: %v", namespace, name, err)
 			return false
 		}
-		return deployment.Status.ReadyReplicas > 0
+		desired := int32(1)
+		if deployment.Spec.Replicas != nil {
+			desired = *deployment.Spec.Replicas
+		}
+		return deployment.Status.ObservedGeneration >= deployment.Generation &&
+			deployment.Status.AvailableReplicas >= desired
 	}, timeout, interval).Should(BeTrue(), fmt.Sprintf("deployment %s/%s should be ready", namespace, name))
 }
@@
 func WaitForDaemonSetReady(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
 	g := NewWithT(t)
 	t.Logf("Waiting for DaemonSet %s/%s to be ready", namespace, name)
 	g.Eventually(func() bool {
 		ds := &appsv1.DaemonSet{}
 		err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, ds)
 		if err != nil {
 			t.Logf("Failed to get DaemonSet %s/%s: %v", namespace, name, err)
 			return false
 		}
-		return ds.Status.NumberReady > 0
+		return ds.Status.ObservedGeneration >= ds.Generation &&
+			ds.Status.UpdatedNumberScheduled == ds.Status.DesiredNumberScheduled &&
+			ds.Status.NumberReady == ds.Status.DesiredNumberScheduled
 	}, timeout, interval).Should(BeTrue(), fmt.Sprintf("DaemonSet %s/%s should be ready", namespace, name))
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// WaitForDeploymentAvailable waits for a deployment to be ready.
func WaitForDeploymentAvailable(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
g := NewWithT(t)
t.Logf("Waiting for deployment %s/%s to be ready", namespace, name)
g.Eventually(func() bool {
deployment := &appsv1.Deployment{}
err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, deployment)
if err != nil {
t.Logf("Failed to get deployment %s/%s: %v", namespace, name, err)
return false
}
return deployment.Status.ReadyReplicas > 0
}, timeout, interval).Should(BeTrue(), fmt.Sprintf("deployment %s/%s should be ready", namespace, name))
}
// WaitForDaemonSetReady waits for a DaemonSet to be ready.
func WaitForDaemonSetReady(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
g := NewWithT(t)
t.Logf("Waiting for DaemonSet %s/%s to be ready", namespace, name)
g.Eventually(func() bool {
ds := &appsv1.DaemonSet{}
err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, ds)
if err != nil {
t.Logf("Failed to get DaemonSet %s/%s: %v", namespace, name, err)
return false
}
return ds.Status.NumberReady > 0
}, timeout, interval).Should(BeTrue(), fmt.Sprintf("DaemonSet %s/%s should be ready", namespace, name))
}
// WaitForDeploymentAvailable waits for a deployment to be ready.
func WaitForDeploymentAvailable(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
g := NewWithT(t)
t.Logf("Waiting for deployment %s/%s to be ready", namespace, name)
g.Eventually(func() bool {
deployment := &appsv1.Deployment{}
err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, deployment)
if err != nil {
t.Logf("Failed to get deployment %s/%s: %v", namespace, name, err)
return false
}
desired := int32(1)
if deployment.Spec.Replicas != nil {
desired = *deployment.Spec.Replicas
}
return deployment.Status.ObservedGeneration >= deployment.Generation &&
deployment.Status.AvailableReplicas >= desired
}, timeout, interval).Should(BeTrue(), fmt.Sprintf("deployment %s/%s should be ready", namespace, name))
}
// WaitForDaemonSetReady waits for a DaemonSet to be ready.
func WaitForDaemonSetReady(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
g := NewWithT(t)
t.Logf("Waiting for DaemonSet %s/%s to be ready", namespace, name)
g.Eventually(func() bool {
ds := &appsv1.DaemonSet{}
err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, ds)
if err != nil {
t.Logf("Failed to get DaemonSet %s/%s: %v", namespace, name, err)
return false
}
return ds.Status.ObservedGeneration >= ds.Generation &&
ds.Status.UpdatedNumberScheduled == ds.Status.DesiredNumberScheduled &&
ds.Status.NumberReady == ds.Status.DesiredNumberScheduled
}, timeout, interval).Should(BeTrue(), fmt.Sprintf("DaemonSet %s/%s should be ready", namespace, name))
}
🤖 Prompt for AI Agents
In @test/e2e/util/util.go around lines 4162 - 4190, The current wait helpers
only check for >0 readiness which can hide partial rollouts; update
WaitForDeploymentAvailable to fetch the Deployment, compute desiredReplicas :=
int32(1) or use *deployment.Spec.Replicas if set, and assert
deployment.Status.ObservedGeneration >= deployment.Generation AND
deployment.Status.AvailableReplicas == desiredReplicas (or ReadyReplicas ==
desiredReplicas), returning true only when fully available; similarly update
WaitForDaemonSetReady to assert ds.Status.ObservedGeneration >= ds.Generation
AND ds.Status.NumberReady == ds.Status.DesiredNumberScheduled (or
NumberAvailable == DesiredNumberScheduled) so partial readiness fails; include
the error and status details in the t.Logf messages for easier debugging and
keep the NewWithT(...).Eventually checks but replace the >0 checks with these
full-availability conditions.

Comment thread test/e2e/util/util.go Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
test/e2e/util/hypershift_framework.go (1)

160-165: Use h.ctx instead of context.Background() for proper cancellation propagation.

These calls use context.Background() which bypasses the test framework's cancellation. This means operations won't respect test timeouts or abort signals, potentially leaving resources orphaned if the test is cancelled.

🔧 Proposed fix
-			guestClient := WaitForGuestClient(t, context.Background(), h.client, hostedCluster)
-			InstallCilium(t, context.Background(), guestClient, hostedCluster, h.assetReader)
+			guestClient := WaitForGuestClient(t, h.ctx, h.client, hostedCluster)
+			InstallCilium(t, h.ctx, guestClient, hostedCluster, h.assetReader)
 			// wait hosted cluster ready
-			WaitForNReadyNodes(t, context.Background(), guestClient, opts.NodePoolReplicas, platform)
-			WaitForImageRollout(t, context.Background(), h.client, hostedCluster)
-			ValidateHostedClusterConditions(t, context.Background(), h.client, hostedCluster, true, 10*time.Minute)
+			WaitForNReadyNodes(t, h.ctx, guestClient, opts.NodePoolReplicas, platform)
+			WaitForImageRollout(t, h.ctx, h.client, hostedCluster)
+			ValidateHostedClusterConditions(t, h.ctx, h.client, hostedCluster, true, 10*time.Minute)
test/e2e/util/cilium.go (2)

33-36: Guard ciliumVersion before use to prevent panics.

ciliumVersion is read from environment at package init time. If CILIUM_VERSION is unset, assets.MustAsset will panic when constructing paths like assets/cilium/v/.... The skip at line 141-143 only checks IsAroHCP(), not the version.

🔧 Proposed fix - add guard in InstallCilium
 func InstallCilium(t *testing.T, ctx context.Context, guestClient crclient.Client, hostedCluster *hyperv1.HostedCluster, reader assets.AssetReader) {
 	t.Run("InstallCilium", func(t *testing.T) {
+		if ciliumVersion == "" {
+			t.Skip("CILIUM_VERSION is not set; skipping Cilium installation")
+		}
 		if !azureutil.IsAroHCP() {
 			t.Skip("test only supported on ARO HCP clusters")
 		}

467-478: Pod phase check incorrectly fails completed Jobs.

The connectivity-check.yaml manifest includes Kubernetes Jobs that complete successfully with Succeeded phase. The current check at line 469 reports any pod not in Running phase as a failure, which incorrectly flags successfully completed test Jobs.

🔧 Proposed fix - accept Succeeded phase for Job pods
 		failedPods := []string{}
 		for _, pod := range podList.Items {
-			if pod.Status.Phase != corev1.PodRunning {
+			if pod.Status.Phase != corev1.PodRunning && pod.Status.Phase != corev1.PodSucceeded {
 				failedPods = append(failedPods, fmt.Sprintf("%s (phase: %s)", pod.Name, pod.Status.Phase))
 			}
 		}
🧹 Nitpick comments (3)
test/e2e/util/cilium.go (1)

542-544: Use k8s.io/utils/ptr instead of custom ptrBool helper.

The codebase already imports k8s.io/utils/ptr (line 28) and uses ptr.To(true) at line 93. Replace this custom helper with ptr.To[bool] for consistency.

♻️ Proposed fix
-func ptrBool(b bool) *bool {
-	return &b
-}

Then update line 376:

-			AllowPrivilegeEscalation: ptrBool(false),
+			AllowPrivilegeEscalation: ptr.To(false),
test/e2e/cilium_network_policy_test.go (2)

39-41: Use the ciliumCNIProvider constant instead of string literal.

The constant ciliumCNIProvider is defined in cilium.go (line 40) but is unexported. Either export it as CiliumCNIProvider and use it here, or ensure consistency across the codebase.

♻️ Suggested approach

In cilium.go, export the constant:

-	ciliumCNIProvider = "cilium"
+	CiliumCNIProvider = "cilium"

Then in this file:

-	if globalOpts.ExternalCNIProvider != "cilium" {
-		t.Skipf("skip cilium connection test if e2e.external-cni-provider is not %s", "cilium")
+	if globalOpts.ExternalCNIProvider != e2eutil.CiliumCNIProvider {
+		t.Skipf("skip cilium connection test if e2e.external-cni-provider is not %s", e2eutil.CiliumCNIProvider)
 	}

43-46: Redundant platform check - already verified at line 35.

The IsAroHCP() check at line 44 is performed after the Azure platform check at line 35. If the intent is to also verify ARO HCP specifically, consider combining these checks or clarifying why both are needed.
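Combining the two guards into a single predicate makes the intent testable in isolation; the function and parameter names below are illustrative, not the test file's actual variables.

```go
package main

// shouldRunCiliumTest collapses the platform check and the ARO HCP check into
// one predicate. ARO HCP clusters are always Azure, so the platform check is
// arguably implied, but keeping both documents the requirement explicitly.
func shouldRunCiliumTest(platformType string, isAroHCP bool) bool {
	return platformType == "Azure" && isAroHCP
}
```

The test body would then contain a single `t.Skip` branch driven by this predicate instead of two sequential checks.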

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 54307b7 and fbbd763.

📒 Files selected for processing (8)
  • .tekton/hypershift-cli-mce-211-pull-request.yaml
  • .tekton/hypershift-cli-mce-211-push.yaml
  • .tekton/hypershift-release-mce-211-pull-request.yaml
  • .tekton/hypershift-release-mce-211-push.yaml
  • test/e2e/cilium_network_policy_test.go
  • test/e2e/e2e_test.go
  • test/e2e/util/cilium.go
  • test/e2e/util/hypershift_framework.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/cilium_network_policy_test.go
  • test/e2e/util/hypershift_framework.go
  • test/e2e/e2e_test.go
  • test/e2e/util/cilium.go
🧬 Code graph analysis (1)
test/e2e/util/hypershift_framework.go (3)
test/e2e/util/external_oidc.go (1)
  • ExtOIDCConfig (53-70)
test/e2e/util/util.go (3)
  • WaitForGuestClient (324-372)
  • WaitForNReadyNodes (444-446)
  • WaitForImageRollout (590-631)
test/e2e/util/cilium.go (1)
  • InstallCilium (139-332)
🪛 golangci-lint (2.5.0)
test/e2e/util/cilium.go

[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn

(typecheck)

🔇 Additional comments (8)
.tekton/hypershift-cli-mce-211-push.yaml (1)

40-48: LGTM — consistent with other pipeline updates in this PR.

.tekton/hypershift-release-mce-211-push.yaml (1)

43-51: LGTM — consistent with other pipeline updates in this PR.

.tekton/hypershift-release-mce-211-pull-request.yaml (1)

40-48: LGTM — consistent with other pipeline updates in this PR.

All four MCE 2.11 Tekton pipeline configurations now consistently reference pipelines/common.yaml.

.tekton/hypershift-cli-mce-211-pull-request.yaml (1)

40-48: LGTM — migration to maintained common pipeline.

The switch from pipelines/common_mce_2.11.yaml to pipelines/common.yaml is valid. The external pipeline reference (https://github.com/stolostron/konflux-build-catalog/main/pipelines/common.yaml) exists and is a well-maintained Tekton pipeline for multi-arch container builds.

test/e2e/util/hypershift_framework.go (1)

82-85: LGTM on the fluent builder pattern.

The WithAssetReader method follows idiomatic Go patterns for optional configuration and integrates cleanly with the existing test harness.

test/e2e/util/cilium.go (1)

67-116: LGTM on namespace and SCC manifest definitions.

The PodSecurity labels and SCC configuration follow OpenShift best practices for privileged workloads like Cilium.

test/e2e/e2e_test.go (1)

93-93: LGTM on the new CLI flag.

The flag binding is consistent with other platform-specific flags and the description clearly indicates supported providers.

test/e2e/cilium_network_policy_test.go (1)

49-51: No issues found. The code correctly uses content.ReadFile, which is a package-level variable declared in test/e2e/assets.go and accessible throughout the e2e package without an explicit import.


Comment thread test/e2e/util/cilium.go
Comment on lines +416 to +428
	t.Run("DeployCiliumConnectivityTest", func(t *testing.T) {
		g := NewWithT(t)

		t.Logf("Deploying Cilium connectivity test from version %s", ciliumVersion)

		manifestPath := fmt.Sprintf("assets/cilium/v%s/connectivity-check.yaml", ciliumVersion)
		yamlContent := assets.MustAsset(reader, manifestPath)

		err := ApplyYAMLBytes(ctx, guestClient, yamlContent, ciliumTestNamespace)
		g.Expect(err).ToNot(HaveOccurred(), "failed to apply connectivity test manifest")

		t.Log("Connectivity test manifests applied successfully")
	})
Contributor


⚠️ Potential issue | 🟠 Major

Add ciliumVersion guard before accessing assets in connectivity test.

Similar to InstallCilium, this function uses ciliumVersion at line 421 without validation, which will cause a panic if the environment variable is unset.

🔧 Proposed fix
 	t.Run("DeployCiliumConnectivityTest", func(t *testing.T) {
 		g := NewWithT(t)
 
+		if ciliumVersion == "" {
+			t.Skip("CILIUM_VERSION is not set; skipping connectivity test deployment")
+		}
+
 		t.Logf("Deploying Cilium connectivity test from version %s", ciliumVersion)
 
 		manifestPath := fmt.Sprintf("assets/cilium/v%s/connectivity-check.yaml", ciliumVersion)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
	t.Run("DeployCiliumConnectivityTest", func(t *testing.T) {
		g := NewWithT(t)

		if ciliumVersion == "" {
			t.Skip("CILIUM_VERSION is not set; skipping connectivity test deployment")
		}

		t.Logf("Deploying Cilium connectivity test from version %s", ciliumVersion)

		manifestPath := fmt.Sprintf("assets/cilium/v%s/connectivity-check.yaml", ciliumVersion)
		yamlContent := assets.MustAsset(reader, manifestPath)

		err := ApplyYAMLBytes(ctx, guestClient, yamlContent, ciliumTestNamespace)
		g.Expect(err).ToNot(HaveOccurred(), "failed to apply connectivity test manifest")

		t.Log("Connectivity test manifests applied successfully")
	})

@wewang58 wewang58 force-pushed the cntr_1137 branch 2 times, most recently from 4770c6d to bba7a39 Compare January 14, 2026 14:48
@mgencur
Contributor

mgencur commented Jan 14, 2026

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jan 14, 2026
@mgencur
Contributor

mgencur commented Jan 14, 2026

@wewang58 Thanks for the updates!

@openshift-ci-robot

@wewang58: This PR has been marked as verified by @wewang58 with debug pr(https://github.com/openshift/release/pull/72642).

Details

In response to this:

/verified by @wewang58 with debug pr(openshift/release#72642)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD dec3e5e and 2 for PR HEAD 664cd2e in total

@wewang58
Contributor Author

wewang58 commented Jan 28, 2026

/override "Red Hat Konflux / hypershift-operator-main-enterprise-contract / hypershift-operator-main"
/override "Red Hat Konflux / hypershift-operator-enterprise-contract / hypershift-operator-main"
/override "Red Hat Konflux / enterprise-contract-mce-211 / hypershift-cli-mce-211"

@openshift-ci
Contributor

openshift-ci Bot commented Jan 28, 2026

@wewang58: wewang58 unauthorized: /override is restricted to Repo administrators, approvers in top level OWNERS file, and the following github teams: openshift/openshift-release-oversight, openshift/openshift-staff-engineers, openshift/openshift-sustaining-engineers.

Details

In response to this:

/override "Red Hat Konflux / hypershift-operator-main-enterprise-contract / hypershift-operator-main"
/override "Red Hat Konflux / hypershift-operator-enterprise-contract / hypershift-operator-main"
/override "Red Hat Konflux / enterprise-contract-mce-211 / hypershift-cli-mce-211"

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@wewang58
Contributor Author

wewang58 commented Jan 28, 2026

@jparrill @mgencur It seems the failed jobs need to be overridden before this PR can be merged. Do either of you have permission to help with that? Thanks!

@wewang58
Contributor Author

/retest

@mgencur
Contributor

mgencur commented Jan 29, 2026

@wewang58 I don't have permissions. @celebdor has.

@wewang58
Contributor Author

/retest-required

@bryan-cox
Member

@wewang58 if you rebase, those konflux jobs should pass

@wewang58
Contributor Author

@wewang58 if you rebase, those konflux jobs should pass

Rebasing will drop the approve and lgtm labels, right? I'll ask for them again.

@mgencur
Contributor

mgencur commented Jan 29, 2026

Let's rebase. I will put lgtm after that.

@wewang58
Contributor Author

Let's rebase. I will put lgtm after that.

Thanks, I will rebase it.

Why:
To enable e2e testing of HyperShift clusters using Cilium as the CNI provider.

How:
- Added `ExternalCNIProvider` flag and option to the e2e framework.
- Implemented logic to install Cilium and skip default network validation when enabled.
- Added utility functions `ApplyYAMLFromURL` and wait helpers for Deployments/DaemonSets.
- Included Cilium-specific installation logic and network policy tests.
…utilities

Refactors Cilium test utilities to use support/assets.AssetReader pattern
instead of direct file path manipulation. Makes internal constants and
helper functions private, and adds WithAssetReader() method to the test
framework for proper asset injection.
The previous pipeline 'pipelines/common_mce_2.11.yaml' uses tasks
that are now expired/unsupported. Switched to
'pipelines/common.yaml' which is up-to-date.
…plication

Consolidate wait functions and use Eventually patterns for more
reliable test polling.

Signed-off-by: wen wang <wewang@redhat.com>
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Jan 29, 2026
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Jan 29, 2026
@mgencur
Contributor

mgencur commented Jan 29, 2026

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jan 29, 2026
@bryan-cox
Member

/retest-required

@bryan-cox
Member

/test e2e-aws-4-21

@openshift-ci
Contributor

openshift-ci Bot commented Jan 30, 2026

@wewang58: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@wewang58
Contributor Author

/verified by @wewang58

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jan 30, 2026
@openshift-ci-robot

@wewang58: This PR has been marked as verified by @wewang58.

Details

In response to this:

/verified by @wewang58

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-bot openshift-merge-bot Bot merged commit 100d202 into openshift:main Jan 30, 2026
20 checks passed
@mgencur
Contributor

mgencur commented Jan 30, 2026

Yay!!


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/platform/azure PR/issue for Azure (AzurePlatform) platform area/testing Indicates the PR includes changes for e2e testing jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants