Skip to content

MG-153: add e2e test cases for Upload Target, PVC and Timeout#316

Merged
openshift-merge-bot[bot] merged 10 commits intoopenshift:masterfrom
praveencodes:MG-153
Jan 15, 2026
Merged

MG-153: add e2e test cases for Upload Target, PVC and Timeout#316
openshift-merge-bot[bot] merged 10 commits intoopenshift:masterfrom
praveencodes:MG-153

Conversation

@praveencodes
Copy link
Copy Markdown
Contributor

No description provided.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Jan 5, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 5, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 5, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Jan 5, 2026

@praveencodes: This pull request references MG-153 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 5, 2026

Walkthrough

Adds public test configuration types and helper functions to MustGather e2e tests, extends test flows to support SFTP upload targets, PersistentVolume-backed storage, non-admin RBAC paths, credential retrieval, and operator-image/log verification; also adds three test manifest files (secret, PVC, ClusterRole).

Changes

Cohort / File(s) Summary
Test: must gather core
test/e2e/must_gather_operator_test.go
Added public types MustGatherCROptions, UploadTargetOptions, PersistentVolumeOptions; changed createMustGatherCR(..., opts *MustGatherCROptions) signature; added helpers getOperatorImage, getContainerLogs, getCaseCredsFromVault, verifySFTPUpload; introduced UploadTarget (SFTP), PV, timeout, and non-admin test flows and verification logic.
Test data: secrets & PVC
test/e2e/testdata/case-management-secret-invalid.yaml, test/e2e/testdata/must-gather-pvc.yaml
Added an invalid case-management-creds-invalid Secret manifest and a must-gather-pvc PersistentVolumeClaim manifest used by PV-based tests.
RBAC: non-admin role
test/e2e/testdata/nonadmin-clusterrole.yaml
Expanded ClusterRole rules: added create/delete verbs for pods and new rules for secrets and persistentvolumeclaims with get,list,watch,create,delete.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

✨ Finishing touches
  • 📝 Generate docstrings

Comment @coderabbitai help to get the list of available commands and usage tips.

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test all

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jan 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.98%. Comparing base (130bc8a) to head (7a5b7bc).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #316   +/-   ##
=======================================
  Coverage   70.98%   70.98%           
=======================================
  Files           7        7           
  Lines         479      479           
=======================================
  Hits          340      340           
  Misses        136      136           
  Partials        3        3           
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test all

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test all

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test all

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test all

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test all

@praveencodes praveencodes marked this pull request as ready for review January 7, 2026 10:28
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 7, 2026
@praveencodes
Copy link
Copy Markdown
Contributor Author

/cc @swghosh @shivprakashmuley

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test ci/prow/e2e-gcp-operator

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Jan 7, 2026

@praveencodes: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

/test ci-bundle-must-gather-operator-bundle
/test coverage
/test e2e-gcp-operator
/test images
/test lint
/test test
/test verify-deps

The following commands are available to trigger optional jobs:

/test validate-boilerplate

Use /test all to run all jobs.

Details

In response to this:

/test ci/prow/e2e-gcp-operator

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test e2e-gcp-operator

ginkgo.GinkgoWriter.Printf("MustGather with UploadTarget completed - Status: %s, Reason: %s\n",
fetchedMG.Status.Status, fetchedMG.Status.Reason)

// Note: Uploaded test files are automatically purged by the SFTP server daily,
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of now, we are not able to clean up the uploaded files to the stage SFTP server manually/programmatically. Hence, didn't add the logic to cleanup the uploaded files. The uploaded files will automatically get deleted in the stage SFTP server after a day as per the retention policy.

@praveencodes
Copy link
Copy Markdown
Contributor Author

/test e2e-gcp-operator

@praveencodes
Copy link
Copy Markdown
Contributor Author

/retest

1 similar comment
@praveencodes
Copy link
Copy Markdown
Contributor Author

/retest

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In @test/e2e/must_gather_operator_test.go:
- Around line 13-14: generateTestCaseID currently uses math/rand.Intn without
seeding, causing deterministic IDs across runs; seed the global math/rand RNG
(e.g., call rand.Seed(time.Now().UnixNano())) early in test startup (TestMain or
an init in this package) and add the time import; update any other places noted
(around lines 1442-1447) where generateTestCaseID or math/rand is used to rely
on the seeded RNG.
- Around line 1449-1466: getContainerLogs opens a potentially blocking stream
with req.Stream(testCtx) using the long-lived testCtx; replace that call to use
a per-call timeout by creating a short-lived context from testCtx (e.g., ctx,
cancel := context.WithTimeout(testCtx, <reasonable duration>); defer cancel())
and pass ctx to req.Stream so the log fetch will time out instead of hanging;
ensure context and time packages are imported where getContainerLogs is defined.
- Around line 1384-1412: In getCaseCredsFromVault, fix the filename/key mismatch
and path construction: use filepath.Join(configDir, vaultUsernameKey) and
filepath.Join(configDir, vaultPasswordKey) instead of string concatenation, and
update the final error message to list the actual constant names/filenames
(vaultUsernameKey and vaultPasswordKey, e.g. "sftp-username-e2e" and
"sftp-password-e2e") rather than the underscore variants; reference the function
getCaseCredsFromVault and constants caseCredsConfigDirEnvVar, vaultUsernameKey,
vaultPasswordKey when making these changes.
🧹 Nitpick comments (1)
test/e2e/must_gather_operator_test.go (1)

449-515: Timeout assertion is brittle (“timeout 60” string match)

This hard-codes the operator’s command formatting; if the operator switches to --timeout=1m / 60s / arg reordering, the test will fail even if behavior is correct. Consider asserting against the CR spec only, or allow multiple accepted formats.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 2da7e7d and 630ce47.

📒 Files selected for processing (1)
  • test/e2e/must_gather_operator_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/must_gather_operator_test.go
🧬 Code graph analysis (1)
test/e2e/must_gather_operator_test.go (1)
api/v1alpha1/mustgather_types.go (8)
  • MustGather (193-199)
  • Storage (126-134)
  • StorageTypePersistentVolume (122-122)
  • UploadTargetSpec (102-114)
  • UploadTypeSFTP (96-96)
  • SFTPSpec (70-89)
  • PersistentVolumeConfig (137-144)
  • PersistentVolumeClaimReference (147-154)
🔇 Additional comments (3)
test/e2e/must_gather_operator_test.go (3)

1332-1383: createMustGatherCR(..., opts *MustGatherCROptions) option wiring looks consistent with API types

UploadTarget SFTP / Storage PersistentVolume / MustGatherTimeout mapping matches the API shapes and keeps the default path unchanged when opts == nil.


1200-1286: > Likely an incorrect or invalid review comment.


127-131: The getOperatorImage() fallback handles the primary failure case; however, consider explicitly handling ValueFrom for robustness

The code correctly checks the OPERATOR_IMAGE literal value in the deployment spec and has a fallback to the container image (line 1435-1436), which ensures the function won't return empty even if the env var is missing. Current deployments in this repo use literal value fields, not valueFrom, so the concern about unpopulated env vars is not an active issue in the current setup. That said, the function only reads env.Value and ignores ValueFrom patterns—a realistic scenario in Kubernetes deployments using secrets or config maps. The existing fallback mitigates this risk, but the code could be more explicit if deployments ever use valueFrom for OPERATOR_IMAGE.

Comment on lines +13 to +14
"math/rand"
"os"
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Jan 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Seed math/rand (current case IDs are deterministic across runs)

generateTestCaseID() uses math/rand.Intn but nothing seeds the RNG, so case IDs repeat between executions, defeating the “unique caseID” intent and risking false positives/negatives on the shared SFTP server.

Proposed fix
@@
 func init() {
 	testScheme = k8sruntime.NewScheme()
 	utilruntime.Must(mustgatherv1alpha1.AddToScheme(testScheme))
 	utilruntime.Must(appsv1.AddToScheme(testScheme))
 	utilruntime.Must(corev1.AddToScheme(testScheme))
 	utilruntime.Must(rbacv1.AddToScheme(testScheme))
 	utilruntime.Must(batchv1.AddToScheme(testScheme))
+
+	// Ensure test-generated IDs are not deterministic across runs.
+	rand.Seed(time.Now().UnixNano())
 }

Also applies to: 1442-1447

🤖 Prompt for AI Agents
In @test/e2e/must_gather_operator_test.go around lines 13 - 14,
generateTestCaseID currently uses math/rand.Intn without seeding, causing
deterministic IDs across runs; seed the global math/rand RNG (e.g., call
rand.Seed(time.Now().UnixNano())) early in test startup (TestMain or an init in
this package) and add the time import; update any other places noted (around
lines 1442-1447) where generateTestCaseID or math/rand is used to rely on the
seeded RNG.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Go 1.20, the global random generator is automatically seeded, so manual seeding is unnecessary and deprecated.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check Go version specified in go.mod
if [ -f "go.mod" ]; then
    echo "=== go.mod content ==="
    cat go.mod | head -20
fi

# Check if there's a .go-version or similar file
if [ -f ".go-version" ]; then
    echo "=== .go-version ==="
    cat .go-version
fi

# Check GitHub workflows for Go version
if [ -d ".github/workflows" ]; then
    echo "=== GitHub workflows Go versions ==="
    grep -r "go-version" .github/workflows/ || echo "No go-version found in workflows"
fi

Length of output: 658


@praveencodes You're absolutely right, and thank you for the correction!

With Go 1.24.0 (as specified in your go.mod), the global random generator is automatically seeded, and manual seeding via rand.Seed() is indeed deprecated. The automatic seeding behavior was introduced in Go 1.20, so your current implementation of generateTestCaseID() is correct as-is and will produce non-deterministic case IDs across test runs.

I apologize for the outdated suggestion. Your implementation is fine and follows current Go best practices. 👍


🧬 Code Graph Analysis Results

[test/e2e/must_gather_operator_test.go]
Summary snippet: generateTestCaseID

Docstring-style description:

  • Purpose: Produce a unique 8-digit case ID for each test run.
  • Signature: func generateTestCaseID() string
  • Parameters: none
  • Returns: a string representing an 8-digit number, zero-padded (e.g., "00012345")
  • Implementation details:
    • Uses the math/rand package: random := rand.Intn(10000000)
    • Returns fmt.Sprintf("%08d", random)
  • Notes:
    • The function relies on the global math/rand source. Given the user comment about Go 1.20 seeding behavior, manual seeding may be unnecessary, depending on the Go version used to run these tests.

Comment on lines +812 to +1057
ginkgo.Context("UploadTarget SFTP Configuration Tests", func() {
var mustGatherName string
var mustGatherCR *mustgatherv1alpha1.MustGather

ginkgo.BeforeEach(func() {
mustGatherName = fmt.Sprintf("mg-upload-target-e2e-test-%d", time.Now().UnixNano())
})

ginkgo.AfterEach(func() {
if mustGatherCR != nil {
ginkgo.By("Cleaning up MustGather CR")
_ = nonAdminClient.Delete(testCtx, mustGatherCR)

Eventually(func() bool {
err := nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherName,
Namespace: ns.Name,
}, &mustgatherv1alpha1.MustGather{})
return apierrors.IsNotFound(err)
}).WithTimeout(2 * time.Minute).WithPolling(5 * time.Second).Should(BeTrue())

mustGatherCR = nil
}
})

ginkgo.It("should successfully upload must-gather data to SFTP server for external user", func() {
ginkgo.By("Getting SFTP credentials from Vault")

sftpUsername, sftpPassword, err := getCaseCredsFromVault()
Expect(err).NotTo(HaveOccurred(), "Failed to get SFTP credentials from Vault")

ginkgo.By("Creating case-management-creds-valid secret")
secret := &corev1.Secret{
ObjectMeta: metav1.ObjectMeta{
Name: caseManagementSecretNameValid,
Namespace: ns.Name,
Labels: map[string]string{
"test": nonAdminLabel,
},
},
Type: corev1.SecretTypeOpaque,
StringData: map[string]string{
"username": sftpUsername,
"password": sftpPassword,
},
}
err = nonAdminClient.Create(testCtx, secret)
if err != nil && !apierrors.IsAlreadyExists(err) {
Expect(err).NotTo(HaveOccurred(), "Failed to create case management secret")
}

ginkgo.By("Creating MustGather CR with UploadTarget and internalUser=false")
// Generate unique caseID to avoid false positives from previous test runs
caseID := generateTestCaseID()
ginkgo.GinkgoWriter.Printf("Using unique caseID: %s\n", caseID)

mustGatherCR = createMustGatherCR(mustGatherName, ns.Name, serviceAccount, true, &MustGatherCROptions{
UploadTarget: &UploadTargetOptions{CaseID: caseID, SecretName: caseManagementSecretNameValid, InternalUser: false, Host: stageHostName},
})

ginkgo.By("Verifying MustGather CR has internalUser set to false")
fetchedMG := &mustgatherv1alpha1.MustGather{}
err = nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherName,
Namespace: ns.Name,
}, fetchedMG)
Expect(err).NotTo(HaveOccurred())
Expect(fetchedMG.Spec.UploadTarget.SFTP.InternalUser).To(BeFalse(),
"InternalUser flag should be false for external user")

ginkgo.By("Waiting for Job to be created")
job := &batchv1.Job{}
Eventually(func() error {
return nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherName,
Namespace: ns.Name,
}, job)
}).WithTimeout(2*time.Minute).WithPolling(5*time.Second).Should(Succeed(),
"Job should be created for MustGather with UploadTarget")

ginkgo.By("Verifying Job has upload container with correct environment variables")
var uploadContainer *corev1.Container
for i := range job.Spec.Template.Spec.Containers {
if job.Spec.Template.Spec.Containers[i].Name == uploadContainerName {
uploadContainer = &job.Spec.Template.Spec.Containers[i]
break
}
}
Expect(uploadContainer).NotTo(BeNil(), "Job should have upload container")

envVars := make(map[string]string)
for _, env := range uploadContainer.Env {
envVars[env.Name] = env.Value
}

Expect(envVars).To(HaveKey("caseid"), "Upload container should have caseid env var")
Expect(envVars).To(HaveKey("username"), "Upload container should have username env var")
Expect(envVars).To(HaveKey("password"), "Upload container should have password env var")
Expect(envVars).To(HaveKey("host"), "Upload container should have host env var")
Expect(envVars).To(HaveKey("internal_user"), "Upload container should have internal_user env var")
Expect(envVars["internal_user"]).To(Equal("false"), "internal_user should be 'false' for external user")
Expect(envVars["caseid"]).To(Equal(caseID), "caseid should match configured case ID")

ginkgo.By("Waiting for Pod to be created and start running")
var mustGatherPod *corev1.Pod
Eventually(func(g Gomega) {
podList := &corev1.PodList{}
g.Expect(nonAdminClient.List(testCtx, podList,
client.InNamespace(ns.Name),
client.MatchingLabels{jobNameLabelKey: mustGatherName})).To(Succeed())
g.Expect(podList.Items).NotTo(BeEmpty(), "Pod should be created by Job")
mustGatherPod = &podList.Items[0]

// Verify Pod has both gather and upload containers
containerNames := make(map[string]bool)
for _, c := range mustGatherPod.Spec.Containers {
containerNames[c.Name] = true
}
g.Expect(containerNames).To(HaveKey(gatherContainerName), "Pod should have gather container")
g.Expect(containerNames).To(HaveKey(uploadContainerName), "Pod should have upload container when UploadTarget is specified")

g.Expect(mustGatherPod.Status.Phase).To(
Or(Equal(corev1.PodRunning), Equal(corev1.PodSucceeded), Equal(corev1.PodFailed)),
"Pod should reach Running, Succeeded, or Failed state")
}).WithTimeout(2 * time.Minute).WithPolling(5 * time.Second).Should(Succeed())

ginkgo.By("Verifying both containers have started")
Eventually(func(g Gomega) {
g.Expect(nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherPod.Name,
Namespace: ns.Name,
}, mustGatherPod)).To(Succeed())

containerStatuses := make(map[string]bool)
for _, cs := range mustGatherPod.Status.ContainerStatuses {
started := cs.State.Running != nil || cs.State.Terminated != nil
containerStatuses[cs.Name] = started
}
g.Expect(containerStatuses[gatherContainerName]).To(BeTrue(), "Gather container should have started")
g.Expect(containerStatuses[uploadContainerName]).To(BeTrue(), "Upload container should have started")
}).WithTimeout(3 * time.Minute).WithPolling(5 * time.Second).Should(Succeed())

ginkgo.By("Waiting for Job to complete successfully (gather and upload)")
Eventually(func() bool {
if err := nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherName,
Namespace: ns.Name,
}, job); err != nil {
return false
}

// Job must succeed for upload verification to pass
if job.Status.Succeeded > 0 {
ginkgo.GinkgoWriter.Println("Job completed successfully")
return true
}
if job.Status.Failed > 0 {
ginkgo.Fail("Job failed - gather or upload container failed")
}
return false
}).WithTimeout(5*time.Minute).WithPolling(10*time.Second).Should(BeTrue(),
"Job should complete successfully")

ginkgo.By("Verifying file was uploaded to SFTP server at the correct path for external user")

// For external users (internal_user=false), the file should be uploaded directly to: <caseid>_<filename>.tar.gz
found, sftpLogs, err := verifySFTPUpload(ns.Name, caseManagementSecretNameValid, stageHostName, caseID, false)
if err != nil {
ginkgo.GinkgoWriter.Printf("SFTP verification error: %v\n", err)
}
ginkgo.GinkgoWriter.Printf("SFTP directory listing:\n%s\n", sftpLogs)

Expect(found).To(BeTrue(),
"File with caseID %s should exist on SFTP server in external user path (<caseid>_must-gather-*.tar.gz)", caseID)

ginkgo.GinkgoWriter.Println("SFTP upload functionality verified for external user (internal_user=false)")
ginkgo.GinkgoWriter.Printf("Verified upload path format: %s_<filename>.tar.gz (no username prefix)\n", caseID)

ginkgo.By("Verifying MustGather CR status is updated after Job completion")
err = nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherName,
Namespace: ns.Name,
}, fetchedMG)
Expect(err).NotTo(HaveOccurred())
Expect(fetchedMG.Status.Completed).To(BeTrue(), "MustGather should be marked as completed")
Expect(fetchedMG.Status.Status).To(Or(Equal("Completed"), Equal("Failed")),
"Status should be Completed or Failed")
Expect(fetchedMG.Status.Reason).NotTo(BeEmpty(), "Reason should be set")

ginkgo.GinkgoWriter.Printf("MustGather with UploadTarget completed - Status: %s, Reason: %s\n",
fetchedMG.Status.Status, fetchedMG.Status.Reason)

// Note: Uploaded test files are automatically purged by the SFTP server daily,
// so manual cleanup is not required.
})

ginkgo.It("should fail upload with invalid SFTP credentials", func() {
ginkgo.By("Creating invalid case management secret")
loader.CreateFromFile(testassets.ReadFile, filepath.Join("testdata", "case-management-secret-invalid.yaml"), ns.Name)

ginkgo.By("Creating MustGather CR with invalid credentials and short timeout")
shortTimeout := 10 * time.Second
mustGatherCR = createMustGatherCR(mustGatherName, ns.Name, serviceAccount, true, &MustGatherCROptions{
Timeout: &shortTimeout, // Short timeout to speed up test
UploadTarget: &UploadTargetOptions{
CaseID: "00000",
SecretName: caseManagementSecretNameInvalid,
InternalUser: false,
Host: stageHostName,
},
})

ginkgo.By("Waiting for Job to be created")
job := &batchv1.Job{}
Eventually(func() error {
return nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherName,
Namespace: ns.Name,
}, job)
}).WithTimeout(2*time.Minute).WithPolling(5*time.Second).Should(Succeed(),
"Job should be created for MustGather with invalid credentials")

ginkgo.By("Verifying Job has upload container")
containerNames := make([]string, len(job.Spec.Template.Spec.Containers))
for i, c := range job.Spec.Template.Spec.Containers {
containerNames[i] = c.Name
}
Expect(containerNames).To(ContainElement(uploadContainerName), "Job should have upload container")

ginkgo.By("Waiting for Job to fail due to invalid credentials")
Eventually(func() bool {
if err := nonAdminClient.Get(testCtx, client.ObjectKey{
Name: mustGatherName,
Namespace: ns.Name,
}, job); err != nil {
return false
}
// Job fails when SFTP authentication fails with invalid credentials
return job.Status.Failed > 0
}).WithTimeout(5*time.Minute).WithPolling(10*time.Second).Should(BeTrue(),
"Job should fail due to invalid SFTP credentials")

ginkgo.GinkgoWriter.Printf("Job failed due to invalid SFTP credentials (Failed count: %d)\n",
job.Status.Failed)
})
})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

SFTP verification pod assumes operator image contains /bin/bash, sshpass, and sftp

verifySFTPUpload() runs sshpass -e sftp ... via "/bin/bash", using operatorImage. If the operator image doesn’t include those binaries (or bash), the e2e will fail/flap independent of operator behavior. Consider using a known “toolbox” image for the verification pod (or explicitly document/validate the tooling requirement).

#!/bin/bash
set -euo pipefail

# Look for any existing test image/tooling conventions (sshpass/sftp usage).
rg -n --hidden --glob '!**/vendor/**' '\bsshpass\b|\bsftp\b|sftp-verify|verifySFTPUpload' -C2

# Check if other tests use /bin/bash vs /bin/sh in Pods.
rg -n --hidden --glob '!**/vendor/**' '"/bin/bash"|/bin/bash' -C2

Also applies to: 1468-1606

Comment thread test/e2e/must_gather_operator_test.go
Comment thread test/e2e/must_gather_operator_test.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @test/e2e/must_gather_operator_test.go:
- Around line 1008-1056: The gather timeout is too short (shortTimeout = 10 *
time.Second) when constructing mustGatherCR via createMustGatherCR and
MustGatherCROptions.Timeout; increase that timeout to 30 seconds (e.g., set
shortTimeout = 30 * time.Second) so the gather container has enough time to
complete before the upload container runs, while leaving the rest of the test
(UploadTargetOptions, caseManagementSecretNameInvalid, etc.) unchanged.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 630ce47 and 072979e.

📒 Files selected for processing (1)
  • test/e2e/must_gather_operator_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/must_gather_operator_test.go
🧬 Code graph analysis (1)
test/e2e/must_gather_operator_test.go (1)
api/v1alpha1/mustgather_types.go (6)
  • Storage (126-134)
  • UploadTargetSpec (102-114)
  • UploadTypeSFTP (96-96)
  • SFTPSpec (70-89)
  • PersistentVolumeConfig (137-144)
  • PersistentVolumeClaimReference (147-154)
🔇 Additional comments (11)
test/e2e/must_gather_operator_test.go (11)

37-55: LGTM!

The configuration types are well-structured and align with the API types (SFTPSpec, PersistentVolumeConfig) from the relevant code snippets.


1442-1447: LGTM!

The function generates unique test case IDs appropriately. Go 1.20+ automatically seeds math/rand, so explicit seeding is not required for test purposes.


1384-1412: LGTM!

The credential retrieval function properly handles both CI/CD (Vault-mounted files) and local testing scenarios with clear error messages.


1471-1609: LGTM!

The SFTP verification function is well-implemented with proper security context configuration, cleanup handling, and date-aware file matching logic to handle edge cases near midnight.


449-515: LGTM!

The timeout test comprehensively verifies that the timeout configuration is properly propagated to the MustGather CR spec and the Job's gather container command.


1059-1098: LGTM!

The test setup/teardown properly creates and cleans up the PVC resource, ensuring test isolation.


1200-1286: LGTM!

The verification pod approach effectively confirms that must-gather data persists on the PVC after Job completion. The security context configuration follows best practices.


1332-1383: LGTM!

The updated createMustGatherCR function cleanly handles optional configuration through the MustGatherCROptions parameter, properly mapping to the API types.


1449-1469: LGTM!

The log retrieval function properly handles resource cleanup with defer and uses an appropriate timeout for the operation.


1414-1440: LGTM!

The function provides a robust approach to retrieve the operator image with a sensible fallback strategy.


837-1006: Verify whether test coverage exists for internal user SFTP upload path before adding test case.

The review suggests adding a test for the internalUser=true scenario. Before implementing this suggestion, confirm that no such test currently exists in the test file, and verify the actual implementation of the verifySFTPUpload function to ensure it supports both internal and external user paths as claimed.

Comment thread test/e2e/must_gather_operator_test.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
test/e2e/must_gather_operator_test.go (2)

1442-1447: Seed the random number generator to avoid potential test collisions.

math/rand uses a deterministic default seed in Go versions prior to 1.20. While Go 1.20+ auto-seeds at startup, explicitly seeding or using crypto/rand ensures unique case IDs across test runs, preventing false positives from stale files on the SFTP server.

♻️ Proposed fix
+import (
+	"crypto/rand"
+	"math/big"
+)

-// generateTestCaseID creates a unique 8-digit case ID for each test run.
-func generateTestCaseID() string {
-	// Generate a random 8-digit number starting with 0 (00000000-09999999)
-	random := rand.Intn(10000000)
-	return fmt.Sprintf("%08d", random)
-}
+// generateTestCaseID creates a unique 8-digit case ID for each test run.
+func generateTestCaseID() string {
+	// Use crypto/rand for better uniqueness across test runs
+	n, err := rand.Int(rand.Reader, big.NewInt(10000000))
+	if err != nil {
+		// Fallback to timestamp-based ID if crypto/rand fails
+		return fmt.Sprintf("%08d", time.Now().UnixNano()%10000000)
+	}
+	return fmt.Sprintf("%08d", n.Int64())
+}

655-658: Consider replacing time.Sleep with an event-based wait.

Using time.Sleep in tests can cause flakiness and unnecessarily slow down test execution. Consider polling for an observable condition that indicates the operator has processed the completion.

♻️ Suggested improvement
-		// Wait a bit to ensure the operator has had time to process the completion
-		time.Sleep(10 * time.Second)
+		// Verify status is updated (indicates operator processed completion)
+		Eventually(func() bool {
+			mg := &mustgatherv1alpha1.MustGather{}
+			if err := nonAdminClient.Get(testCtx, client.ObjectKey{
+				Name:      mustGatherName,
+				Namespace: ns.Name,
+			}, mg); err != nil {
+				return false
+			}
+			return mg.Status.Completed
+		}).WithTimeout(30 * time.Second).WithPolling(2 * time.Second).Should(BeTrue())
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 072979e and a6e9677.

📒 Files selected for processing (1)
  • test/e2e/must_gather_operator_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/must_gather_operator_test.go
🧬 Code graph analysis (1)
test/e2e/must_gather_operator_test.go (1)
api/v1alpha1/mustgather_types.go (7)
  • MustGather (193-199)
  • Storage (126-134)
  • StorageTypePersistentVolume (122-122)
  • UploadTargetSpec (102-114)
  • SFTPSpec (70-89)
  • PersistentVolumeConfig (137-144)
  • PersistentVolumeClaimReference (147-154)
🔇 Additional comments (10)
test/e2e/must_gather_operator_test.go (10)

37-55: LGTM!

The new configuration types are well-structured and clearly documented, providing clean abstractions for configuring MustGather CR options in tests.


74-84: LGTM!

Test constants are well-organized and appropriately scoped for the new e2e test scenarios.


1471-1598: LGTM!

The SFTP verification function is well-implemented with proper security context, credential handling via secrets, and cleanup in all code paths.


1384-1412: LGTM!

Good fallback mechanism with clear error messaging for different credential sources (Vault files for CI, environment variables for local testing).


449-515: LGTM!

The timeout configuration test thoroughly verifies the timeout value propagation from CR spec to the gather container command.


837-1006: LGTM!

Comprehensive SFTP upload test with proper credential handling, thorough verification of Job configuration, and meaningful assertions for upload success.


1200-1253: LGTM!

The verification pod for PVC data persistence is well-configured with proper security context and appropriate checks for must-gather output files.


1332-1383: LGTM!

Clean implementation of the optional configuration pattern with proper nil checks and clear mapping to the MustGather CR spec.


1449-1469: LGTM!

Standard and correct implementation for retrieving container logs with proper timeout and resource cleanup.


1008-1056: LGTM!

Good negative test case that verifies proper failure handling with invalid SFTP credentials, using a short timeout to keep test execution efficient.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @test/e2e/must_gather_operator_test.go:
- Around line 1445-1450: The generateTestCaseID function uses math/rand without
seeding which yields deterministic IDs; fix by either seeding the global RNG
once (e.g., call rand.Seed(time.Now().UnixNano()) from init()) so
generateTestCaseID continues to use rand.Intn, or replace generateTestCaseID to
derive an 8-digit value from crypto/rand to guarantee non-deterministic values;
update the init function or the generateTestCaseID implementation accordingly
and ensure imports include time or crypto/rand as needed.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between a6e9677 and 7a5b7bc.

📒 Files selected for processing (1)
  • test/e2e/must_gather_operator_test.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

  • test/e2e/must_gather_operator_test.go
🧬 Code graph analysis (1)
test/e2e/must_gather_operator_test.go (1)
api/v1alpha1/mustgather_types.go (8)
  • MustGather (193-199)
  • Storage (126-134)
  • StorageTypePersistentVolume (122-122)
  • UploadTargetSpec (102-114)
  • UploadTypeSFTP (96-96)
  • SFTPSpec (70-89)
  • PersistentVolumeConfig (137-144)
  • PersistentVolumeClaimReference (147-154)
🔇 Additional comments (10)
test/e2e/must_gather_operator_test.go (10)

37-55: LGTM!

The option structs are well-designed with appropriate pointer usage for optional fields, and align correctly with the API types (SFTPSpec, PersistentVolumeConfig) from api/v1alpha1/mustgather_types.go.


57-99: LGTM!

Constants are well-organized and self-documenting. Package-level variables are appropriate for Ginkgo's ordered test execution pattern.


110-153: LGTM!

The setup flow is well-structured with clear step annotations. The addition of operator image retrieval enables proper image usage in verification pods.


449-515: LGTM!

The timeout test properly verifies the end-to-end timeout configuration flow from the MustGather CR spec through to the gather container command.


837-1009: LGTM!

Comprehensive SFTP upload test that verifies the complete flow from credential retrieval through CR creation to actual upload verification. The unique caseID generation and environment variable verification are particularly well done.


1011-1059: LGTM!

Good negative test case that validates proper failure handling when invalid SFTP credentials are provided.


1062-1290: LGTM!

The PersistentVolume storage tests are thorough, covering both subPath configuration and data persistence verification. The verification pod approach with proper security context is well-implemented.


1387-1415: LGTM!

The dual-source credential retrieval (Vault files + environment variables) provides good flexibility for both CI/CD and local testing. Error messages are informative.


1474-1601: LGTM!

The SFTP verification function is well-structured with proper pod security context, comprehensive debugging output for troubleshooting, and correct cleanup in all code paths.


1335-1386: LGTM!

The helper function cleanly maps the test option structs to the API types, with proper handling of optional fields.

Comment on lines +1445 to +1450
// generateTestCaseID creates a unique 8-digit case ID for each test run.
func generateTestCaseID() string {
// Generate a random 8-digit number starting with 0 (00000000-09999999)
random := rand.Intn(10000000)
return fmt.Sprintf("%08d", random)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Unseeded random number generator may cause test flakiness.

math/rand without explicit seeding uses a deterministic seed, potentially generating identical case IDs across test runs. This could cause false positives if previous test artifacts remain on the SFTP server.

🔧 Proposed fix: Seed the RNG or use crypto/rand
 // generateTestCaseID creates a unique 8-digit case ID for each test run.
 func generateTestCaseID() string {
-	// Generate a random 8-digit number starting with 0 (00000000-09999999)
-	random := rand.Intn(10000000)
+	// Generate a random 8-digit number with time-based uniqueness
+	random := rand.Intn(10000000) + int(time.Now().UnixNano()%10000000)
+	random = random % 100000000 // Ensure 8 digits max
 	return fmt.Sprintf("%08d", random)
 }

Alternatively, seed once in init():

func init() {
    rand.Seed(time.Now().UnixNano())
    // ... existing scheme setup
}
🤖 Prompt for AI Agents
In @test/e2e/must_gather_operator_test.go around lines 1445 - 1450, The
generateTestCaseID function uses math/rand without seeding which yields
deterministic IDs; fix by either seeding the global RNG once (e.g., call
rand.Seed(time.Now().UnixNano()) from init()) so generateTestCaseID continues to
use rand.Intn, or replace generateTestCaseID to derive an 8-digit value from
crypto/rand to guarantee non-deterministic values; update the init function or
the generateTestCaseID implementation accordingly and ensure imports include
time or crypto/rand as needed.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Jan 13, 2026

@praveencodes: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/validate-boilerplate 7a5b7bc link false /test validate-boilerplate

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@shivprakashmuley
Copy link
Copy Markdown
Contributor

looks good.
/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jan 15, 2026
@shivprakashmuley
Copy link
Copy Markdown
Contributor

/label no-qe

@openshift-ci openshift-ci Bot added the no-qe Indicates that QE signoff is not required. label Jan 15, 2026
@shivprakashmuley
Copy link
Copy Markdown
Contributor

/verified bypass

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jan 15, 2026
@openshift-ci-robot
Copy link
Copy Markdown

@shivprakashmuley: The verified label has been added.

Details

In response to this:

/verified bypass

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@shivprakashmuley
Copy link
Copy Markdown
Contributor

/approve

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Jan 15, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: praveencodes, shivprakashmuley

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 15, 2026
@openshift-merge-bot openshift-merge-bot Bot merged commit 12bf88e into openshift:master Jan 15, 2026
9 of 10 checks passed
KeenonLee pushed a commit to KeenonLee/must-gather-operator that referenced this pull request Mar 27, 2026
MG-153: add e2e test cases for Upload Target, PVC and Timeout
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. no-qe Indicates that QE signoff is not required. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants