@jalev
Contributor

@jalev jalev commented Oct 27, 2025

Please note that changes to the cloudzero-agent Helm chart should be made instead in the helm directory in the cloudzero-agent repository, and will automatically be mirrored to this repository as soon as they are merged.

Description

When the PVC feature is enabled, upgrading the chart fails with the following error:

Multi-Attach error for volume "pvc-401c5500-a013-4853-815d-0bfd7c43ad1e" Volume is already used by pod(s) cloudzero-agent-server-7f8cdfbb85-wjkdn

This happens because the PVC is still mounted by the pod from the previous revision while the new pod waits for the volume to be released.

A quick fix is to change the deployment's rollout strategy to Recreate. This terminates the previously running pod first, which frees the PVC so the new pod can mount it.
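
For illustration, here is a minimal sketch of what a Deployment using the Recreate strategy looks like. The resource name is an assumption inferred from the pod name in the error above, and the rest of the spec is omitted; this is not the chart's actual rendered output.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudzero-agent-server   # assumed name, inferred from the pod name in the error
spec:
  replicas: 1
  strategy:
    # Recreate terminates all existing pods before creating new ones,
    # so the volume is released before the new pod tries to mount it.
    # The rollingUpdate block must not be set when type is Recreate.
    type: Recreate
  # selector and template omitted for brevity
```

The trade-off is a short gap with no running pod during each rollout, since the old pod is removed before the new one is scheduled.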

Testing

  1. Enable the PVC feature
  2. Upgrade the Helm chart
  3. The new agent-server container starts successfully
  4. Upgrade the Helm chart again (e.g. with a label change)
  5. The new agent-server container is blocked until the previous container is deleted
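
As a sketch of step 1, the PVC feature could be enabled with a values override like the one below. The key path is an assumption based on the `server.persistentVolume.enabled` value named in the review summary; check the chart's values.yaml for the exact keys and any additional settings it needs.

```yaml
# values-override.yaml (hypothetical file name)
server:
  persistentVolume:
    # Assumed key path, taken from the review summary of this PR
    enabled: true
```

Passing this file to `helm upgrade` with `-f` applies it on each of the upgrades in steps 2 and 4.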

  • This change adds test coverage for new/changed/fixed functionality

Checklist

  • I have added documentation for new/changed functionality in this PR
  • All active GitHub checks for tests, formatting, and security are passing
  • The correct base branch is being used, if not main

@jalev jalev requested a review from a team as a code owner October 27, 2025 13:43
@jalev jalev closed this Oct 27, 2025
@jalev jalev deleted the patch-1 branch October 27, 2025 13:45

@greptile-apps greptile-apps bot left a comment

Greptile Overview

Greptile Summary

This PR fixes a Multi-Attach error that occurs during Helm upgrades when the PVC feature is enabled. The change adds a Recreate deployment strategy when server.persistentVolume.enabled is true.

Key changes:

  • Conditionally sets strategy.type: Recreate for the agent deployment when PVC is enabled
  • Resolves the Multi-Attach error caused by ReadWriteOnce access mode during rolling updates
  • Ensures the old pod terminates and releases the PVC before the new pod attempts to mount it
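
The conditional described in the key changes could look roughly like the following fragment of a Helm template. This is a sketch based only on the summary above, not the actual diff in agent-deploy.yaml; the surrounding fields and indentation are assumptions.

```yaml
# Fragment of a Deployment template (sketch, not the actual diff)
spec:
  replicas: 1
  {{- if .Values.server.persistentVolume.enabled }}
  # Only switch strategies when a PVC is in use; otherwise the
  # Kubernetes default (RollingUpdate) still applies.
  strategy:
    type: Recreate
  {{- end }}
```

Scoping the strategy to the PVC case keeps rollouts overlap-based for installs that do not use persistent storage.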

Impact:

  • The fix is scoped correctly - only applies when PVC feature is enabled
  • During upgrades with PVC enabled, there will be brief downtime as the old pod must terminate before the new one starts
  • This is the standard Kubernetes pattern for deployments using ReadWriteOnce PVCs

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The change is a standard Kubernetes pattern for handling ReadWriteOnce PVCs in deployments. The implementation is correct, properly scoped with a conditional, and directly addresses the documented Multi-Attach error. The deployment is hardcoded to 1 replica, so there are no HA concerns.
  • No files require special attention

Important Files Changed

File Analysis

Filename: charts/cloudzero-agent/templates/agent-deploy.yaml
Score: 5/5
Overview: Added Recreate deployment strategy when PVC is enabled to fix Multi-Attach error with ReadWriteOnce volumes

Sequence Diagram

sequenceDiagram
    participant Helm as Helm Upgrade
    participant K8s as Kubernetes API
    participant OldPod as Old Pod
    participant PVC as PersistentVolumeClaim
    participant NewPod as New Pod
    
    Note over Helm,NewPod: With Recreate Strategy (New Behavior)
    Helm->>K8s: Apply deployment with strategy: Recreate
    K8s->>OldPod: Terminate old pod
    OldPod->>PVC: Unmount volume
    OldPod-->>K8s: Pod terminated
    K8s->>NewPod: Create new pod
    NewPod->>PVC: Mount volume (ReadWriteOnce)
    PVC-->>NewPod: Volume mounted successfully
    NewPod-->>K8s: Pod ready
    
    Note over Helm,NewPod: Without Recreate Strategy (Old Behavior)
    Helm->>K8s: Apply deployment with RollingUpdate
    K8s->>NewPod: Create new pod (before terminating old)
    NewPod->>PVC: Attempt to mount volume
    PVC-->>NewPod: ERROR: Multi-Attach - volume already in use
    Note over NewPod,OldPod: Old pod still using ReadWriteOnce volume

1 file reviewed, no comments
