@zetaab
Member

@zetaab zetaab commented Oct 6, 2025

What this PR does / why we need it:

Which issue this PR fixes(if applicable):
fixes #

Special notes for reviewers:

Release note:

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 6, 2025
@k8s-ci-robot k8s-ci-robot requested review from dulek and kayrus October 6, 2025 17:41
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Oct 6, 2025
@zetaab
Member Author

zetaab commented Oct 6, 2025

cc @stephenfin

@zetaab
Member Author

zetaab commented Oct 7, 2025

/retest

@stephenfin
Member

/test openstack-cloud-csi-manila-e2e-test

@stephenfin
Member

stephenfin commented Oct 7, 2025

/lgtm
/approve
/hold

I can't see anything obviously wrong with this, but the Manila job had been passing pretty consistently recently so I wonder if we've broken something in Gophercloud v2.8.0? I'm going to propose a bump of just that and see if we can reproduce the issue.

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 7, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 7, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: stephenfin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 7, 2025
@stephenfin
Member

stephenfin commented Oct 7, 2025

I have proposed two PRs, #3005 and #3006. Depending on the outcome of the Manila job on those PRs, we should learn whether (a) the job is permafailing, (b) we've broken something with gophercloud v2.8.0, or (c) we've broken the job with other changes in this PR.

EDIT: So those other PRs both passed, and it failed again here, which suggests something is genuinely wrong here.

Do we want to merge those two PRs and then iterate on this one? Would I be correct in saying that a bump in k8s will also result in a bump in the e2e tests? If so, I wonder whether something got stricter.

@kayrus
Contributor

kayrus commented Oct 7, 2025

/test openstack-cloud-csi-manila-e2e-test

k8s.io/legacy-cloud-providers => k8s.io/legacy-cloud-providers v0.33.3
k8s.io/sample-apiserver => k8s.io/sample-apiserver v0.33.3
)

Member

Do we need to bump these replaces also?

Contributor

good catch, yes
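
A minimal sketch of how leftover `replace` pins like these could be spotted after a bump. It assumes the target version for the `k8s.io/*` staging modules is `v0.34.1` (inferred from the `k8s.io/kubernetes@v1.34.1` paths in the test logs, not stated in this thread). The `staleReplaces` helper is hypothetical, handles only `old => new version` lines rather than the full go.mod grammar, and the second sample line is made up for contrast:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// staleReplaces returns the k8s.io module paths in a go.mod replace block
// whose pinned version differs from want. It only understands lines of the
// form "old => new version" and is not a full go.mod parser.
func staleReplaces(gomod, want string) []string {
	var stale []string
	sc := bufio.NewScanner(strings.NewReader(gomod))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		// Drop a leading "replace" keyword used by single-line directives.
		if len(fields) > 0 && fields[0] == "replace" {
			fields = fields[1:]
		}
		// Expect exactly: <old path> => <new path> <version>
		if len(fields) != 4 || fields[1] != "=>" {
			continue
		}
		if strings.HasPrefix(fields[0], "k8s.io/") && fields[3] != want {
			stale = append(stale, fields[0])
		}
	}
	return stale
}

func main() {
	// The first pin mirrors the diff context above; the second is illustrative.
	gomod := `replace (
	k8s.io/legacy-cloud-providers => k8s.io/legacy-cloud-providers v0.33.3
	k8s.io/sample-apiserver => k8s.io/sample-apiserver v0.34.1
)`
	for _, m := range staleReplaces(gomod, "v0.34.1") {
		fmt.Println("stale replace:", m)
	}
}
```

Run against the sample, this reports only `k8s.io/legacy-cloud-providers`, since that is the one pin still at the old version.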

@stephenfin
Member

/test openstack-cloud-csi-manila-e2e-test

I'll keep an eye on this but I suspect there isn't any point in rechecking this. While the first test run (link) failed due to a deployment issue, the next two (link, link) both failed with the same two failed tests.

Summarizing 2 Failures:
  [FAIL] [sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand [It] should resize volume when PVC is edited while pod is using it [sig-storage]
  /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:316
  [FAIL] [sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand [It] should resize volume when PVC is edited and the pod is re-created on the same node after controller resize is finished [sig-storage]
  /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:389
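
Whether a recheck is worthwhile comes down to whether successive runs fail in the same places. A minimal Go sketch of that comparison, assuming a Ginkgo-style summary in which each `[FAIL]` line is followed by a `file:line` location. The helper names `failureLocations` and `sameFailures` are hypothetical, and the sample strings are shortened stand-ins for the real output:

```go
package main

import (
	"fmt"
	"strings"
)

// failureLocations collects the location line that follows each "[FAIL]"
// line in a failure summary.
func failureLocations(summary string) map[string]bool {
	locs := map[string]bool{}
	lines := strings.Split(summary, "\n")
	for i, line := range lines {
		if strings.HasPrefix(strings.TrimSpace(line), "[FAIL]") && i+1 < len(lines) {
			locs[strings.TrimSpace(lines[i+1])] = true
		}
	}
	return locs
}

// sameFailures reports whether two runs failed at exactly the same
// locations, regardless of the order in which the failures were listed.
func sameFailures(a, b map[string]bool) bool {
	if len(a) != len(b) {
		return false
	}
	for loc := range a {
		if !b[loc] {
			return false
		}
	}
	return true
}

func main() {
	// Abbreviated summaries from two runs, listed in different orders.
	run1 := "[FAIL] volume-expand test A\nvolume_expand.go:316\n" +
		"[FAIL] volume-expand test B\nvolume_expand.go:389"
	run2 := "[FAIL] volume-expand test B\nvolume_expand.go:389\n" +
		"[FAIL] volume-expand test A\nvolume_expand.go:316"
	fmt.Println(sameFailures(failureLocations(run1), failureLocations(run2)))
}
```

Comparing by location rather than by position keeps the result stable when the same failures are summarized in a different order.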

@k8s-ci-robot
Contributor

@zetaab: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
openstack-cloud-csi-manila-e2e-test 2dab213 link true /test openstack-cloud-csi-manila-e2e-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@stephenfin
Member

stephenfin commented Oct 7, 2025

Yeah, exact same two failures again:

 Summarizing 2 Failures:
  [FAIL] [sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand [It] should resize volume when PVC is edited and the pod is re-created on the same node after controller resize is finished [sig-storage]
  /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:389
  [FAIL] [sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand [It] should resize volume when PVC is edited while pod is using it [sig-storage]
  /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:316

(link)

This is the full failure for both:

[sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand should resize volume when PVC is edited while pod is using it [sig-storage]
/root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:267
  STEP: Creating a kubernetes client @ 10/07/25 16:01:26.114
  I1007 16:01:26.114138 120458 util.go:454] >>> kubeConfig: /root/.kube/config
  STEP: Building a namespace api object, basename volume-expand @ 10/07/25 16:01:26.114
  STEP: Waiting for a default service account to be provisioned in namespace @ 10/07/25 16:01:26.198
  STEP: Waiting for kube-root-ca.crt to be provisioned in namespace @ 10/07/25 16:01:26.201
  I1007 16:01:26.204477 120458 volume_resource.go:98] Creating resource for dynamic PV
  I1007 16:01:26.204523 120458 volume_resource.go:104] Using claimSize:1Gi, test suite supported size:{ 1Gi}, driver(nfs.manila.csi.openstack.org) supported size:{ 1Gi} 
  STEP: creating a StorageClass volume-expand-8933hvbpl @ 10/07/25 16:01:26.204
  STEP: creating a claim @ 10/07/25 16:01:26.209
  I1007 16:01:26.209270 120458 pv.go:646] Warning: Making PVC: VolumeMode specified as invalid empty string, treating as nil
  I1007 16:01:26.219311 120458 pv.go:790] Waiting up to timeout=5m0s for PersistentVolumeClaims [nfs.manila.csi.openstack.orgfnn4f] to have phase Bound
  I1007 16:01:26.233414 120458 pv.go:806] PersistentVolumeClaim nfs.manila.csi.openstack.orgfnn4f found but phase is Pending instead of Bound.
  I1007 16:01:28.237494 120458 pv.go:806] PersistentVolumeClaim nfs.manila.csi.openstack.orgfnn4f found but phase is Pending instead of Bound.
  I1007 16:01:30.242550 120458 pv.go:801] PersistentVolumeClaim nfs.manila.csi.openstack.orgfnn4f found and phase=Bound (4.023193551s)
  STEP: Creating a pod with dynamically provisioned volume @ 10/07/25 16:01:30.249
  STEP: Expanding current pvc @ 10/07/25 16:01:32.272
  I1007 16:01:32.272834 120458 volume_expand.go:293] currentPvcSize {{1073741824 0} {<nil>} 1Gi BinarySI}, newSize {{2147483648 0} {<nil>}  BinarySI}
  STEP: Waiting for cloudprovider resize to finish @ 10/07/25 16:01:32.28
  STEP: Waiting for file system resize to finish @ 10/07/25 16:01:36.285
  I1007 16:01:36.288387 120458 volume_expand.go:316] Unexpected error: while verifying recovery related fields: 
      <*errors.errorString | 0xc0002be340>: 
      pvc "nfs.manila.csi.openstack.orgfnn4f" had 0 allocated size, expected 2Gi
      {
          s: "pvc \"nfs.manila.csi.openstack.orgfnn4f\" had 0 allocated size, expected 2Gi",
      }
  [FAILED] in [It] - /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:316 @ 10/07/25 16:01:36.288
  I1007 16:01:36.288689 120458 delete.go:78] Deleting pod "pod-b2748390-5b78-4b06-ae38-4fbbd24edf88" in namespace "volume-expand-8933"
  I1007 16:01:36.295264 120458 delete.go:86] Wait up to 5m0s for pod "pod-b2748390-5b78-4b06-ae38-4fbbd24edf88" to be fully deleted
  STEP: Deleting pod @ 10/07/25 16:01:38.303
  I1007 16:01:38.303120 120458 delete.go:78] Deleting pod "pod-b2748390-5b78-4b06-ae38-4fbbd24edf88" in namespace "volume-expand-8933"
  STEP: Deleting pvc @ 10/07/25 16:01:38.305
  I1007 16:01:38.305942 120458 pv.go:205] Deleting PersistentVolumeClaim "nfs.manila.csi.openstack.orgfnn4f"
  I1007 16:01:38.312399 120458 pv.go:863] Waiting up to 20m0s for PersistentVolume pvc-1f2463ff-522d-4590-af9a-7cda265f9a4c to get deleted
  I1007 16:01:38.322807 120458 pv.go:867] PersistentVolume pvc-1f2463ff-522d-4590-af9a-7cda265f9a4c found and phase=Bound (10.372399ms)
  I1007 16:01:43.330071 120458 pv.go:871] PersistentVolume pvc-1f2463ff-522d-4590-af9a-7cda265f9a4c was removed
  STEP: Deleting sc @ 10/07/25 16:01:43.33
  STEP: Destroying namespace "volume-expand-8933" for this suite. @ 10/07/25 16:01:43.334
• [FAILED] [17.226 seconds]
[sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand [It] should resize volume when PVC is edited while pod is using it [sig-storage]
/root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:267

  [FAILED] while verifying recovery related fields: pvc "nfs.manila.csi.openstack.orgfnn4f" had 0 allocated size, expected 2Gi
  In [It] at: /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:316 @ 10/07/25 16:01:36.288

  Full Stack Trace
    k8s.io/kubernetes/test/e2e/storage/testsuites.(*volumeExpandTestSuite).DefineTests.func5({0x2ea6c90, 0xc000de5ce0})
    	/root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:316 +0xad3
[sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand should resize volume when PVC is edited and the pod is re-created on the same node after controller resize is finished [sig-storage]
/root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:319
  STEP: Creating a kubernetes client @ 10/07/25 15:51:52.088
  I1007 15:51:52.088640 120458 util.go:454] >>> kubeConfig: /root/.kube/config
  STEP: Building a namespace api object, basename volume-expand @ 10/07/25 15:51:52.089
  STEP: Waiting for a default service account to be provisioned in namespace @ 10/07/25 15:51:52.103
  STEP: Waiting for kube-root-ca.crt to be provisioned in namespace @ 10/07/25 15:51:52.107
  I1007 15:51:52.110573 120458 volume_resource.go:98] Creating resource for dynamic PV
  I1007 15:51:52.110629 120458 volume_resource.go:104] Using claimSize:1Gi, test suite supported size:{ 1Gi}, driver(nfs.manila.csi.openstack.org) supported size:{ 1Gi} 
  STEP: creating a StorageClass volume-expand-3847dv8cc @ 10/07/25 15:51:52.11
  STEP: creating a claim @ 10/07/25 15:51:52.116
  I1007 15:51:52.116961 120458 pv.go:646] Warning: Making PVC: VolumeMode specified as invalid empty string, treating as nil
  I1007 15:51:52.123973 120458 pv.go:790] Waiting up to timeout=5m0s for PersistentVolumeClaims [nfs.manila.csi.openstack.orgb57s8] to have phase Bound
  I1007 15:51:52.130676 120458 pv.go:806] PersistentVolumeClaim nfs.manila.csi.openstack.orgb57s8 found but phase is Pending instead of Bound.
  I1007 15:51:54.135023 120458 pv.go:806] PersistentVolumeClaim nfs.manila.csi.openstack.orgb57s8 found but phase is Pending instead of Bound.
  I1007 15:51:56.139460 120458 pv.go:801] PersistentVolumeClaim nfs.manila.csi.openstack.orgb57s8 found and phase=Bound (4.015451554s)
  STEP: Creating a pod with dynamically provisioned volume @ 10/07/25 15:51:56.144
  STEP: Expanding current pvc @ 10/07/25 15:51:58.159
  I1007 15:51:58.159574 120458 volume_expand.go:345] currentPvcSize {{1073741824 0} {<nil>} 1Gi BinarySI}, newSize {{2147483648 0} {<nil>}  BinarySI}
  STEP: Waiting for cloudprovider resize to finish @ 10/07/25 15:51:58.165
  STEP: Deleting the pod @ 10/07/25 15:52:02.168
  I1007 15:52:02.168882 120458 delete.go:78] Deleting pod "pod-8581d343-af59-4a9f-9f7e-44b8e5547f28" in namespace "volume-expand-3847"
  I1007 15:52:02.174501 120458 delete.go:86] Wait up to 5m0s for pod "pod-8581d343-af59-4a9f-9f7e-44b8e5547f28" to be fully deleted
  STEP: Creating a new pod with same volume on the same node @ 10/07/25 15:52:04.184
  STEP: Waiting for file system resize to finish @ 10/07/25 15:52:06.198
  I1007 15:52:06.202590 120458 volume_expand.go:389] Unexpected error: while verifying recovery related fields: 
      <*errors.errorString | 0xc0007215c0>: 
      pvc "nfs.manila.csi.openstack.orgb57s8" had 0 allocated size, expected 2Gi
      {
          s: "pvc \"nfs.manila.csi.openstack.orgb57s8\" had 0 allocated size, expected 2Gi",
      }
  [FAILED] in [It] - /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:389 @ 10/07/25 15:52:06.202
  I1007 15:52:06.202869 120458 delete.go:78] Deleting pod "pod-5d23bb4e-fcaa-4858-85a7-eb370d8cebce" in namespace "volume-expand-3847"
  I1007 15:52:06.207676 120458 delete.go:86] Wait up to 5m0s for pod "pod-5d23bb4e-fcaa-4858-85a7-eb370d8cebce" to be fully deleted
  I1007 15:52:08.214762 120458 delete.go:78] Deleting pod "pod-8581d343-af59-4a9f-9f7e-44b8e5547f28" in namespace "volume-expand-3847"
  STEP: Deleting pod2 @ 10/07/25 15:52:08.217
  I1007 15:52:08.217659 120458 delete.go:78] Deleting pod "pod-5d23bb4e-fcaa-4858-85a7-eb370d8cebce" in namespace "volume-expand-3847"
  STEP: Deleting pvc @ 10/07/25 15:52:08.22
  I1007 15:52:08.220083 120458 pv.go:205] Deleting PersistentVolumeClaim "nfs.manila.csi.openstack.orgb57s8"
  I1007 15:52:08.224044 120458 pv.go:863] Waiting up to 20m0s for PersistentVolume pvc-9176f41b-bd8c-4c71-8ef8-27ea7bfe2cb5 to get deleted
  I1007 15:52:08.228205 120458 pv.go:867] PersistentVolume pvc-9176f41b-bd8c-4c71-8ef8-27ea7bfe2cb5 found and phase=Bound (4.121128ms)
  I1007 15:52:13.235639 120458 pv.go:871] PersistentVolume pvc-9176f41b-bd8c-4c71-8ef8-27ea7bfe2cb5 was removed
  STEP: Deleting sc @ 10/07/25 15:52:13.235
  STEP: Destroying namespace "volume-expand-3847" for this suite. @ 10/07/25 15:52:13.239
• [FAILED] [21.156 seconds]
[sig-storage] [manila-csi-e2e] CSI Volumes [[Driver: nfs.manila.csi.openstack.org]] [Testpattern: Dynamic PV (default fs)(allowExpansion)] volume-expand [It] should resize volume when PVC is edited and the pod is re-created on the same node after controller resize is finished [sig-storage]
/root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:319

  [FAILED] while verifying recovery related fields: pvc "nfs.manila.csi.openstack.orgb57s8" had 0 allocated size, expected 2Gi
  In [It] at: /root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:389 @ 10/07/25 15:52:06.202

  Full Stack Trace
    k8s.io/kubernetes/test/e2e/storage/testsuites.(*volumeExpandTestSuite).DefineTests.func6({0x2ea6c90, 0xc000b83200})
    	/root/pkg/mod/k8s.io/kubernetes@v1.34.1/test/e2e/storage/testsuites/volume_expand.go:389 +0xe33

@stephenfin
Member

stephenfin commented Oct 7, 2025

Aha, git-blame shows a single change to these tests in recent months: this one 👉 kubernetes/kubernetes@784c589

EDIT: I have checked with @gnufied (the author of the commit) and he indicated that this is not related to the CSI driver but is rather a sign that our version of external-snapshotter is old. Looking at the manifests, I see that this is the case (v8.1.0 was released in August 2024 and the latest version is v8.3.0), and tbh most of the sidecars could do with a bump. We should probably tackle that first. We probably also want to add this to our list of things to do with a k8s bump if we haven't done so already.
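
As a rough illustration of the "is this sidecar behind?" check, here is a small Go sketch that compares a pinned tag against a known latest tag. The two versions are the ones quoted above. The `csi-snapshotter` map key, the `parseVersion` and `older` helpers, and the idea of keeping a hand-maintained table of latest versions are all assumptions made for the example, and only plain `vMAJOR.MINOR.PATCH` tags are handled:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns a tag such as "v8.1.0" into its three numeric parts.
func parseVersion(tag string) ([3]int, error) {
	var v [3]int
	parts := strings.Split(strings.TrimPrefix(tag, "v"), ".")
	if len(parts) != 3 {
		return v, fmt.Errorf("unexpected version %q", tag)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return v, fmt.Errorf("unexpected version %q: %w", tag, err)
		}
		v[i] = n
	}
	return v, nil
}

// older reports whether tag a is a strictly older version than tag b.
func older(a, b string) (bool, error) {
	va, err := parseVersion(a)
	if err != nil {
		return false, err
	}
	vb, err := parseVersion(b)
	if err != nil {
		return false, err
	}
	for i := range va {
		if va[i] != vb[i] {
			return va[i] < vb[i], nil
		}
	}
	return false, nil
}

func main() {
	// Hypothetical tables: what the manifests pin vs. the latest release.
	pinned := map[string]string{"csi-snapshotter": "v8.1.0"}
	latest := map[string]string{"csi-snapshotter": "v8.3.0"}
	for name, tag := range pinned {
		stale, err := older(tag, latest[name])
		if err != nil {
			fmt.Println("skipping", name+":", err)
			continue
		}
		if stale {
			fmt.Printf("%s is pinned at %s, latest is %s\n", name, tag, latest[name])
		}
	}
}
```

Something along these lines could sit next to the k8s bump checklist so that sidecar versions get reviewed at the same time.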

@stephenfin
Member

#3008 should fix this.

I would also prefer if we could merge the following before this patch, since they make this PR smaller and allow me to use it as a case study for how to bump the kubernetes version.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 8, 2025
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@zetaab zetaab closed this Oct 8, 2025
@zetaab
Member Author

zetaab commented Oct 8, 2025

replaced with #3010

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

5 participants