CNTRLPLANE-2888: Add guest cluster client to v2 TestContext#7877

Draft
csrwng wants to merge 1 commit into openshift:main from csrwng:guest-cluster-client

Conversation

@csrwng csrwng commented Mar 6, 2026

What this PR does / why we need it:

Adds a lazy-loaded guest cluster client (GetGuestClient()) to the v2 TestContext so that tests requiring access to the guest (hosted) cluster API can obtain a crclient.Client without each test reimplementing the kubeconfig retrieval logic.

Also adds a GetClientFromConfig() helper to e2eutil for creating a client from an explicit REST config, and refactors GetClient() to use it.

Key design decisions:

  • sync.Once lazy-loading: Follows the existing GetHostedCluster() pattern
  • Short timeout (2 min): Unlike v1 (10 min), the cluster is already running; we only need to handle DNS propagation
  • QPS: -1, Burst: -1: Matches the v1 pattern; the test is the only client, so no throttling is needed
  • Panic on failure: Matches the GetHostedCluster() pattern; a test cannot proceed without a guest client
  • Discovery API connectivity check: Simple ServerVersion() call verifies the API server is reachable
  • Last error preservation: Timeout panic includes the last concrete error for easier CI triage
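The lazy-loading and panic-on-failure decisions above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the PR's code: `guestClient` stands in for `crclient.Client`, and the `connect` hook is a hypothetical placeholder for the kubeconfig retrieval and connectivity check.

```go
package main

import (
	"fmt"
	"sync"
)

// guestClient is a hypothetical stand-in for crclient.Client.
type guestClient struct{ endpoint string }

// testContext caches the guest client and guards its creation with sync.Once.
type testContext struct {
	guestClient     *guestClient
	guestClientOnce sync.Once
	// connect is a hypothetical hook that builds the client; assumed for this sketch.
	connect func() (*guestClient, error)
}

// GetGuestClient builds the client on first use and returns the cached
// value on every later call. It panics if the client cannot be built.
func (tc *testContext) GetGuestClient() *guestClient {
	tc.guestClientOnce.Do(func() {
		c, err := tc.connect()
		if err != nil {
			panic(fmt.Sprintf("failed to connect to guest cluster: %v", err))
		}
		tc.guestClient = c
	})
	return tc.guestClient
}
```

One consequence of this pattern worth keeping in mind: if the function passed to `Do` panics, `sync.Once` still treats it as having run, so a later call would return the zero-value (nil) client rather than retrying.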

Which issue(s) this PR fixes:

Fixes CNTRLPLANE-2888

Special notes for your reviewer:

  • GetClientFromConfig() creates a client from a REST config while reusing the standard test scheme, with nil-config validation
  • GetClient() was refactored to delegate to GetClientFromConfig() to reduce duplication
  • This is a prerequisite for CNTRLPLANE-2864, CNTRLPLANE-2865, and Epic 4 lifecycle tests
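The relationship between the two helpers might look roughly like the sketch below. It deliberately avoids the real controller-runtime types: `restConfig` and `client` are hypothetical stand-ins for `*rest.Config` and `crclient.Client`, and the `loadDefault` parameter is an assumption made so the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// restConfig and client are hypothetical stand-ins for *rest.Config
// and crclient.Client.
type restConfig struct{ Host string }
type client struct{ host string }

// getClientFromConfig validates its input before building a client,
// so callers get an immediate, descriptive failure for a nil config.
func getClientFromConfig(cfg *restConfig) (*client, error) {
	if cfg == nil {
		return nil, errors.New("getClientFromConfig: config is nil")
	}
	return &client{host: cfg.Host}, nil
}

// getClient resolves a default config and then delegates, so the
// construction logic lives in one place.
func getClient(loadDefault func() (*restConfig, error)) (*client, error) {
	cfg, err := loadDefault()
	if err != nil {
		return nil, fmt.Errorf("load default config: %w", err)
	}
	return getClientFromConfig(cfg)
}
```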

Checklist:

  • Subject and description added to both commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • Tests
    • Enhanced test infrastructure for cluster client initialization, improving reliability of end-to-end testing for multi-cluster scenarios.

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 6, 2026
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 6, 2026

openshift-ci-robot commented Mar 6, 2026

@csrwng: This pull request references CNTRLPLANE-2888 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

What this PR does / why we need it:

Adds a lazy-loaded guest cluster client (GetGuestClient()) to the v2 TestContext so that tests requiring access to the guest (hosted) cluster API can obtain a crclient.Client without each test reimplementing the kubeconfig retrieval logic.

Key design decisions:

  • sync.Once lazy-loading: Follows the existing GetHostedCluster() pattern
  • Short timeout (2 min): Unlike v1 (10 min), the cluster is already running; we only need to handle DNS propagation
  • QPS: -1, Burst: -1: Matches v1 pattern — we are the only client in test, so no throttling needed
  • Panic on failure: Matches GetHostedCluster() pattern — a test cannot proceed without a guest client
  • Discovery API connectivity check: Simple ServerVersion() call verifies the API server is reachable

Which issue(s) this PR fixes:

Fixes CNTRLPLANE-2888

Special notes for your reviewer:

  • The GetClientFromConfig() helper was added to e2eutil to create a client from a REST config while reusing the standard test scheme
  • A smoke test (guest_cluster_test.go) validates the guest client by listing namespaces on the guest cluster
  • This is a prerequisite for CNTRLPLANE-2864, CNTRLPLANE-2865, and Epic 4 lifecycle tests

Checklist:

  • Subject and description added to both commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

🤖 Generated with Claude Code via /jira:solve [CNTRLPLANE-2888](https://issues.redhat.com/browse/CNTRLPLANE-2888)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.


openshift-ci bot commented Mar 6, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


coderabbitai bot commented Mar 6, 2026

Walkthrough

The changes add Kubernetes client initialization helpers for end-to-end testing. A new GetClientFromConfig function extracts client creation logic in the util package, while a GetGuestClient method on TestContext implements lazy-initialized guest cluster client retrieval with DNS readiness checks and retry logic.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Client Helper Extraction (test/e2e/util/client.go) | Adds GetClientFromConfig(config *rest.Config) function to construct a controller-runtime client from REST config with nil validation. Existing GetClient function refactored to delegate to this new helper. |
| Guest Cluster Client Initialization (test/e2e/v2/internal/test_context.go) | Adds GetGuestClient() method to TestContext with lazy initialization (sync.Once). Retrieves HostedCluster kubeconfig secret, builds REST config, and creates controller-runtime client with DNS readiness polling and retry logic. Introduces new dependencies and struct fields (guestClient, guestClientOnce). |

Sequence Diagram(s)

sequenceDiagram
    participant Test as Test Code
    participant TC as TestContext
    participant Secret as Kubernetes API
    participant Config as REST Config Builder
    participant DNS as DNS Check
    participant Client as Client Initializer

    Test->>TC: GetGuestClient()
    activate TC
    
    rect rgba(100, 200, 255, 0.5)
        Note over TC: Lazy Init (sync.Once)
    end
    
    TC->>Secret: Retrieve HostedCluster kubeconfig secret
    activate Secret
    Secret-->>TC: kubeconfig data
    deactivate Secret
    
    TC->>Config: Build REST config from kubeconfig
    activate Config
    Config-->>TC: *rest.Config
    deactivate Config
    
    rect rgba(255, 200, 100, 0.5)
        Note over TC,DNS: Retry loop with polling
    end
    
    TC->>DNS: Check DNS readiness
    activate DNS
    DNS-->>TC: ready/not ready
    deactivate DNS
    
    TC->>Client: Create controller-runtime client
    activate Client
    Client-->>TC: crclient.Client
    deactivate Client
    
    TC-->>Test: Return guest cluster client
    deactivate TC

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes


Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (2 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Stable And Deterministic Test Names | ❓ Inconclusive | PR objectives mention a smoke test (guest_cluster_test.go) but this test file does not exist in the repository; only utility files are modified. | Clarify whether guest_cluster_test.go is included in this PR or planned for a separate PR, then verify test names follow stability requirements. |
| Test Structure And Quality | ❓ Inconclusive | PR summary indicates guest_cluster_test.go should be added, but this test file does not exist in the repository for assessment. | Verify if guest_cluster_test.go is part of this PR or if the summary was generated from outdated information. Once the test file is available, assess against quality criteria. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and specifically describes the main change: adding a guest cluster client method to the v2 TestContext, which aligns with the primary objective and the core additions in both modified files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added area/testing Indicates the PR includes changes for e2e testing and removed do-not-merge/needs-area labels Mar 6, 2026

openshift-ci bot commented Mar 6, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: csrwng

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 6, 2026

openshift-ci-robot commented Mar 6, 2026

@csrwng: This pull request references CNTRLPLANE-2888 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Tests
  • Added support for guest cluster connectivity testing with lazy initialization and retry logic.
  • Introduced helper utilities for constructing Kubernetes clients from REST configurations.
  • Added end-to-end test for guest cluster namespace listing functionality.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
test/e2e/v2/internal/test_context.go (1)

95-99: Use tc.Context for secret retrieval instead of context.Background().

Line 95 should honor suite cancellation/timeouts consistently with the rest of GetGuestClient.

Proposed change
-		err := tc.MgmtClient.Get(context.Background(), crclient.ObjectKey{
+		err := tc.MgmtClient.Get(tc.Context, crclient.ObjectKey{
 			Namespace: hc.Namespace,
 			Name:      hc.Status.KubeConfig.Name,
 		}, &secret)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/v2/internal/test_context.go` around lines 95 - 99, Replace the use
of context.Background() when calling tc.MgmtClient.Get for the secret with the
test context tc.Context so the secret retrieval honors suite
cancellation/timeouts; update the call inside GetGuestClient (the
tc.MgmtClient.Get invocation that populates the local `secret` variable using
hc.Namespace and hc.Status.KubeConfig.Name) to pass tc.Context instead of
context.Background().
test/e2e/v2/tests/guest_cluster_test.go (1)

51-52: Use test context instead of context.Background() for the guest-cluster List call.

Line 51 bypasses suite cancellation/deadline propagation. Using testCtx.Context makes the call terminate with the spec lifecycle.

Proposed change
-			err := guestClient.List(context.Background(), namespaceList, &crclient.ListOptions{})
+			err := guestClient.List(testCtx.Context, namespaceList, &crclient.ListOptions{})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/v2/tests/guest_cluster_test.go` around lines 51 - 52, Replace the
use of context.Background() in the guest cluster list call with the test suite
context so the request observes spec cancellation/deadlines: update the call to
guestClient.List(context.Background(), namespaceList, &crclient.ListOptions{})
to use testCtx.Context (i.e., guestClient.List(testCtx.Context, namespaceList,
&crclient.ListOptions{}) or testCtx.Context() depending on the test helper) so
the List invocation uses the suite's context and terminates with the spec
lifecycle; ensure you reference the same namespaceList and crclient.ListOptions
arguments unchanged and run tests to confirm proper cancellation behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/util/client.go`:
- Around line 42-47: Add an explicit nil check for the incoming config at the
top of GetClientFromConfig: if config == nil return a clear error (e.g.,
fmt.Errorf("config is nil")) before calling crclient.New so callers get an
immediate, descriptive failure instead of a deeper construction error; update
error message to mention GetClientFromConfig and reference the config parameter
to aid debugging.

In `@test/e2e/v2/internal/test_context.go`:
- Around line 116-128: The retry loop using wait.PollUntilContextTimeout
currently swallows concrete errors from GetClientFromConfig and ServerVersion;
modify the anonymous polling func (the closure passed to
wait.PollUntilContextTimeout) to record the last non-nil error into a
surrounding variable (e.g., lastErr) whenever err or apiErr is non-nil, return
false with the error (or at least record it) so it’s preserved, and after
PollUntilContextTimeout returns include that lastErr in the panic message
(replace the current panic(fmt.Sprintf("failed to connect to guest cluster: %v",
err)) with one that prints the timeout plus lastErr). Ensure you reference the
existing symbols: wait.PollUntilContextTimeout, GetClientFromConfig,
discovery.NewDiscoveryClientForConfigOrDie(...).ServerVersion, guestClient, and
err/lastErr.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d9b80c80-c824-491d-9552-80795ac9a0ec

📥 Commits

Reviewing files that changed from the base of the PR and between cca8038 and 35c58e8.

📒 Files selected for processing (3)
  • test/e2e/util/client.go
  • test/e2e/v2/internal/test_context.go
  • test/e2e/v2/tests/guest_cluster_test.go


openshift-ci-robot commented Mar 6, 2026

@csrwng: This pull request references CNTRLPLANE-2888 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • Chores
  • Added a public helper to construct Kubernetes clients from REST configs with explicit input validation and clear error reporting.
  • Tests
  • Added support for guest-cluster connectivity testing with lazy initialization, retry and discovery checks.
  • Introduced an end-to-end test that verifies namespace listing against the guest cluster.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
test/e2e/v2/internal/test_context.go (2)

95-99: Use the test context instead of context.Background() for secret fetch.

At Line 95, using context.Background() bypasses cancellation/deadline from the suite context and can prolong stuck API calls. Since TestContext embeds context.Context, use tc directly instead.

Suggested patch
-		err := tc.MgmtClient.Get(context.Background(), crclient.ObjectKey{
+		err := tc.MgmtClient.Get(tc, crclient.ObjectKey{
 			Namespace: hc.Namespace,
 			Name:      hc.Status.KubeConfig.Name,
 		}, &secret)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/v2/internal/test_context.go` around lines 95 - 99, The call to
tc.MgmtClient.Get uses context.Background(), which ignores the test suite's
cancellation/deadline; replace context.Background() with the test context (tc)
so the call uses the suite's embedded context and will honour
cancellation/timeouts — update the invocation of MgmtClient.Get (where the
secret is fetched) to pass tc instead of context.Background().

123-126: Avoid OrDie in the retry path.

At line 123, NewDiscoveryClientForConfigOrDie can panic and escape the retry loop; use NewDiscoveryClientForConfig instead to properly report the error via lastErr and continue retrying.

Suggested patch
-			_, apiErr := discovery.NewDiscoveryClientForConfigOrDie(restConfig).ServerVersion()
+			discoveryClient, discErr := discovery.NewDiscoveryClientForConfig(restConfig)
+			if discErr != nil {
+				lastErr = fmt.Errorf("build discovery client: %w", discErr)
+				return false, nil
+			}
+			_, apiErr := discoveryClient.ServerVersion()
 			if apiErr != nil {
 				lastErr = fmt.Errorf("discover guest API server version: %w", apiErr)
 				return false, nil
 			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/v2/internal/test_context.go` around lines 123 - 126, The retry loop
uses discovery.NewDiscoveryClientForConfigOrDie which can panic and break
retries; replace that call with
discovery.NewDiscoveryClientForConfig(restConfig), check its returned (client,
err) and on error assign lastErr (e.g. fmt.Errorf("create discovery client: %w",
err)) and return false, nil so the loop continues; if client creation succeeds,
call client.ServerVersion() and keep the existing apiErr handling (assign
lastErr and return false, nil) so all errors are reported via lastErr instead of
aborting with a panic.
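Why an `OrDie` constructor is awkward inside a retry closure can be seen in a small sketch. All names here are hypothetical stand-ins written for illustration (`newDiscoveryClient` for `discovery.NewDiscoveryClientForConfig`, `newDiscoveryClientOrDie` for its panicking variant); the real constructors take a REST config rather than a host string.

```go
package main

import "fmt"

// discoveryClient is a hypothetical stand-in for a discovery client.
type discoveryClient struct{ host string }

// newDiscoveryClient returns an error instead of panicking.
func newDiscoveryClient(host string) (*discoveryClient, error) {
	if host == "" {
		return nil, fmt.Errorf("host must not be empty")
	}
	return &discoveryClient{host: host}, nil
}

// newDiscoveryClientOrDie mirrors the OrDie convention: any failure panics.
func newDiscoveryClientOrDie(host string) *discoveryClient {
	c, err := newDiscoveryClient(host)
	if err != nil {
		panic(err)
	}
	return c
}

// serverVersion is a placeholder for the connectivity probe.
func (c *discoveryClient) serverVersion() (string, error) {
	return "sketch-version for " + c.host, nil
}

// attempt is one iteration of a retry closure. Every failure is returned
// as a value, so the caller can store it and try again.
func attempt(host string) (bool, error) {
	c, err := newDiscoveryClient(host)
	if err != nil {
		return false, fmt.Errorf("build discovery client: %w", err)
	}
	if _, err := c.serverVersion(); err != nil {
		return false, fmt.Errorf("discover guest API server version: %w", err)
	}
	return true, nil
}
```

With the error-returning constructor, a construction failure is just another retryable outcome; with the `OrDie` variant, the same failure would unwind straight through the polling loop.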
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e73aae54-824e-4dc6-8d18-87eea3a87cbb

📥 Commits

Reviewing files that changed from the base of the PR and between 35c58e8 and d82d806.

📒 Files selected for processing (2)
  • test/e2e/util/client.go
  • test/e2e/v2/internal/test_context.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/util/client.go

Add GetGuestClient() to TestContext with sync.Once lazy-loading,
kubeconfig retrieval from HostedCluster status, and retry-based
connectivity checks. Add GetClientFromConfig() helper to e2eutil
and refactor GetClient() to use it.

Fixes: CNTRLPLANE-2888

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@csrwng csrwng force-pushed the guest-cluster-client branch from a3cac11 to 55b04dc Compare March 6, 2026 19:24
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/e2e/v2/internal/test_context.go`:
- Around line 90-98: The precondition check in test_context.go conflates a nil
KubeConfig with a missing secret name; update the guard around hc and
hc.Status.KubeConfig (in the code surrounding variable hc and the call to
tc.MgmtClient.Get) to explicitly check and fail fast with distinct messages:
first panic if hc == nil, then panic if hc.Status.KubeConfig == nil, then panic
if hc.Status.KubeConfig.Name == "" (empty string) before attempting the secret
lookup via tc.MgmtClient.Get; ensure the panic/log messages clearly mention
which condition failed (nil HostedCluster, nil KubeConfig, or empty
KubeConfig.Name) to make triage easier.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e459a2ef-0167-4da2-bae7-ea1cf726ca52

📥 Commits

Reviewing files that changed from the base of the PR and between d82d806 and 55b04dc.

📒 Files selected for processing (2)
  • test/e2e/util/client.go
  • test/e2e/v2/internal/test_context.go

Comment on lines +90 to +98
		if hc == nil || hc.Status.KubeConfig == nil {
			panic("HostedCluster has no kubeconfig in status")
		}

		var secret corev1.Secret
		err := tc.MgmtClient.Get(context.Background(), crclient.ObjectKey{
			Namespace: hc.Namespace,
			Name:      hc.Status.KubeConfig.Name,
		}, &secret)

⚠️ Potential issue | 🟡 Minor

Tighten precondition checks before secret lookup.

The current guard conflates different failures and does not validate an empty kubeconfig secret name, which makes triage harder when this fails.

Suggested change
 		hc := tc.GetHostedCluster()
-		if hc == nil || hc.Status.KubeConfig == nil {
-			panic("HostedCluster has no kubeconfig in status")
+		if hc == nil {
+			panic("HostedCluster is not available in test context")
+		}
+		if hc.Status.KubeConfig == nil || hc.Status.KubeConfig.Name == "" {
+			panic("HostedCluster status kubeconfig secret reference is missing")
 		}
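One way to picture the separated precondition checks is the sketch below. The types are hypothetical, trimmed-down stand-ins for the HostedCluster API types, the messages are illustrative, and the function returns an error rather than panicking so that each branch is easy to exercise.

```go
package main

import "errors"

// Hypothetical, minimal stand-ins for the HostedCluster API types.
type secretRef struct{ Name string }
type hostedClusterStatus struct{ KubeConfig *secretRef }
type hostedCluster struct {
	Namespace string
	Status    hostedClusterStatus
}

// validateKubeconfigRef reports which precondition failed, with a distinct
// message per case, before any secret lookup is attempted.
func validateKubeconfigRef(hc *hostedCluster) error {
	switch {
	case hc == nil:
		return errors.New("HostedCluster is not available in test context")
	case hc.Status.KubeConfig == nil:
		return errors.New("HostedCluster status has no kubeconfig secret reference")
	case hc.Status.KubeConfig.Name == "":
		return errors.New("HostedCluster kubeconfig secret reference has an empty name")
	}
	return nil
}
```

The cases are ordered so that each one is safe to evaluate only after the previous one has passed, which also avoids a nil dereference on `hc.Status.KubeConfig.Name`.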

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/testing: Indicates the PR includes changes for e2e testing
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
