Skip to content

[controller] Fix LVMVolumeGroup spec.blockDeviceSelector update Issue#130

Draft
AleksZimin wants to merge 2 commits into
mainfrom
fix-lvg-discover
Draft

[controller] Fix LVMVolumeGroup spec.blockDeviceSelector update Issue#130
AleksZimin wants to merge 2 commits into
mainfrom
fix-lvg-discover

Conversation

@AleksZimin

@AleksZimin AleksZimin commented Feb 26, 2025

Copy link
Copy Markdown
Member

Description

This PR addresses an issue where the spec.blockDeviceSelector field of an existing LVMVolumeGroup resource was not updating during reconciliation, even when new physical volumes (PVs) were manually added on the node to the corresponding LVM Volume Group. With this fix, the spec.blockDeviceSelector now correctly reflects all associated BlockDevice resources, ensuring accurate representation of the volume group's state.

Why do we need it, and what problem does it solve?

Previously, when new BlockDevice resources were manually introduced to an existing LVM Volume Group on a node, the spec.blockDeviceSelector field in the corresponding LVMVolumeGroup resource remained unchanged. This discrepancy could lead to mismanagement of storage resources and potential inconsistencies in volume group configurations. By ensuring that spec.blockDeviceSelector is updated appropriately, we maintain consistency between the actual state of the LVM Volume Group and its representation within Kubernetes.

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

@AleksZimin AleksZimin self-assigned this Feb 26, 2025
@AleksZimin AleksZimin changed the title Fix lvg discover [controller] Fix LVMVolumeGroup spec.blockDeviceSelector update Issue Feb 26, 2025
@AleksZimin AleksZimin requested review from astef and duckhawk February 26, 2025 20:35
@AleksZimin AleksZimin marked this pull request as ready for review February 26, 2025 20:35
@AleksZimin AleksZimin added the bug Something isn't working label Feb 26, 2025
Comment thread images/agent/src/internal/controller/lvg/discoverer.go Outdated
Comment thread images/agent/src/internal/controller/lvg/discoverer.go Outdated
Comment thread images/agent/src/internal/controller/lvg/discoverer.go Outdated
Comment thread images/agent/src/internal/controller/lvg/discoverer.go
@astef astef assigned asergunov and unassigned astef Apr 4, 2025
@asergunov asergunov requested a review from astef April 8, 2025 10:05
Comment thread images/agent/internal/controller/lvg/discoverer.go Outdated
@asergunov asergunov marked this pull request as draft April 11, 2025 04:41
@asergunov

Copy link
Copy Markdown
Contributor

This is a quick fix. We are going to implement proper one.

Update spec.blockDeviceSelector during reconciliation when new
BlockDevices are added to the LVM Volume Group on the node, so the
field reflects the actual set of devices in the VG.

* Filter BlockDevices by node before computing candidates.
* Add notMatchedBlockDeviceNames helper that uses k8s.io/apimachinery
  labels selector to determine which device names are not matched.
* Add appendDeviceNamesToLabelSelector and updateBlockDeviceSelectorIfNeeded
  to extend an existing matchExpressions/matchLabels selector or create
  one from scratch when nil.
* Include selector mismatch in hasLVMVolumeGroupDiff so the
  discoverer triggers an update when the selector falls behind the
  actual device set.
* Add unit tests covering: no-op when up to date, creation from nil,
  appending to matchExpressions, migrating from matchLabels, and
  mixed selectors with unrelated keys.

Co-authored-by: Anton Sergunov <anton.sergunov@flant.com>
Co-authored-by: Aleksandr Stefurishin <aleksandr.stefurishin@flant.com>
Signed-off-by: Aleksandr Zimin <alexandr.zimin@flant.com>
@duckhawk duckhawk force-pushed the fix-lvg-discover branch from ffe2f5d to f7fb6fe Compare May 13, 2026 04:48
The previous attempt called updateBlockDeviceSelectorIfNeeded(...) and
discarded the returned selector, then called only Status().Update(), so
the change to Spec.BlockDeviceSelector was never sent to the API server:

* When the existing selector was nil, the freshly-built selector was
  dropped on the floor (function return value assigned to "_").
* When the existing selector was non-nil, the in-memory mutation was
  performed but Status().Update() does not write the spec subresource,
  so the API state was never updated.

As a side-effect, hasLVMVolumeGroupDiff kept returning true on every
discoverer cycle (the selector mismatch never resolved), causing
UpdateLVMVolumeGroupByCandidate to run forever.

Fix:

* Capture the returned selector and assign it to lvg.Spec.BlockDeviceSelector
  when non-nil.
* Call cl.Update(ctx, lvg) to persist the spec change. controller-runtime
  refreshes lvg.ResourceVersion in place, so the subsequent
  Status().Update() picks up the new revision without a refetch.
* Move the spec Update before populating the status fields. The API
  server (and the fake client used in tests) preserves status on
  Update(), so doing the spec write after we'd populated lvg.Status
  would wipe those fields back to whatever the server had stored.

Add two regression tests:

* UpdateLVMVolumeGroup_persists_spec_blockDeviceSelector covers the
  "selector existed but is missing new devices" path: candidate brings
  three devices; we read the LVG back from the fake client and assert
  notMatchedBlockDeviceNames(stored.Spec.BlockDeviceSelector, ...)
  is empty, plus that status was also persisted.
* UpdateLVMVolumeGroup_creates_spec_blockDeviceSelector_when_nil
  covers the legacy LVG path: an LVG with nil selector gets a new one
  built from candidate.BlockDevicesNames.

Both regressions fail without the fix above.

Signed-off-by: Aleksandr Zimin <alexandr.zimin@flant.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants