Skip to content

fix(lvg): align thin-pool and PV resize sizing with LVM#223

Draft
IvanOgurchenok wants to merge 16 commits into
mainfrom
fix-thin-pool-size-calculation
Draft

fix(lvg): align thin-pool and PV resize sizing with LVM#223
IvanOgurchenok wants to merge 16 commits into
mainfrom
fix-thin-pool-size-calculation

Conversation

@IvanOgurchenok

@IvanOgurchenok IvanOgurchenok commented Apr 8, 2026

Copy link
Copy Markdown

Description

This PR fixes LVMVolumeGroup sizing regressions introduced after efa5f61c53801579ea84f696f2a81292e3de47c6 removed the old resizeDelta tolerance and switched the controller to strict LVM-derived size comparisons.

The change aligns thin-pool percentage sizing, PV resize detection, and validation tolerance with the actual usable LVM sizes reported by vgs/pvs. It also extends the regression coverage around 100% thin-pool requests, extent rounding, and PV resize behavior. The E2E cluster configuration uses RedOS for the primary nodes so the scenario is exercised on the target OS image.

Why do we need it, and what problem does it solve?

LVM reports usable PV/VG size after metadata overhead and extent alignment, while the controller still used raw block device sizes in several paths. That mismatch could reject valid percentage-based thin-pool requests, repeatedly trigger pvresize on unchanged devices, or fail validation only because the aligned size was rounded by an extent.

What is the expected result?

After applying this PR:

  • percentage-based thin pools such as 99% and 100% are calculated from the real created VG size;
  • unchanged devices no longer cause repeated pvresize attempts caused by the normal LVM metadata gap;
  • validation tolerates expected extent rounding for requested thin pools;
  • E2E coverage checks that the current disk resize causes a new successful pvresize log entry and then stabilizes.

Checklist

  • The code is covered by unit tests.
  • E2E regression coverage is updated.
  • Full E2E run passed.
  • Documentation updated according to the changes. N/A for user-facing docs.

@IvanOgurchenok IvanOgurchenok self-assigned this Apr 8, 2026
@IvanOgurchenok IvanOgurchenok changed the title fix(lvg): calculate thin-pool requested size based on actual VG size fix: calculate thin-pool requested size based on actual VG size, resize size fix, 100% new size fix Apr 8, 2026
@IvanOgurchenok IvanOgurchenok force-pushed the fix-thin-pool-size-calculation branch from d848281 to 5336a94 Compare April 13, 2026 23:55
@IvanOgurchenok IvanOgurchenok marked this pull request as ready for review April 14, 2026 00:22
@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ❌

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-24373992421
Status: Tests failed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ❌

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-24551988804
Status: Tests failed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ❌

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-24555453703
Status: Tests failed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

@IvanOgurchenok IvanOgurchenok force-pushed the fix-thin-pool-size-calculation branch from 2a7beba to 89f107c Compare April 17, 2026 08:50
@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ❌

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-24551988804
Status: Tests failed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ✅

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-24556940685
Status: Tests passed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

@szhem szhem left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(review withdrawn by author)

@szhem szhem left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Reviewing the fixes for Bugs 1–3 from BUG_REPORT_THINPOOL_SIZE.md (percentage thin-pool off the real VGSize, exact pe_start in resizePVIfNeeded, per-pool extent tolerance in validateLVGFor*Func). The logic of all three fixes is correct and well explained in the bug report and the commit messages. The comments below group the findings that should ideally be addressed before merge.

Severity overview

Severity Count
Critical 1
High 5
Medium 6
Low 2
Nit (PR-wide, below) 3

PR-wide nits (not attached inline)

  • PR checklist empty. The "The code is covered by unit tests / e2e tests passed / Documentation updated / Changes were tested manually" boxes are still unchecked. Tick those that apply and mark docs N/A if nothing changed — it's the easiest signal reviewers have that the author walked through the list.
  • Explicit link to efa5f61. The PR description references resizeDelta and commit efa5f61. GitHub resolves the hash, but a full URL (https://github.com/deckhouse/sds-node-configurator/commit/efa5f61) is friendlier in changelogs and IDEs.

Positive observations

  • Each of the three fixes lives in its own focused commit (9449b2f, 1cfb2d3, e9261c1) with a clear message.
  • Using pv.PEStart from pvs instead of the heuristic DevSize - PVSize > extent is the right side of the LVM contract — pe_start is exactly the offset where physical extents begin, and DevSize - (PVSize + peStart) >= extent correctly signals unallocated space beyond the current PV.
  • vgSize in reconcileLVGCreateFunc is hoisted out of the per-pool loop; the vgAfterCreate != nil branch and the fallback Warning keep the code path visible in logs (see the inline comment on that fallback for a follow-up).
  • PVData.PEStart resource.Quantity matches the rest of the type and round-trips through pvs --units B --nosuffix correctly.
  • Table-driven sub-tests with_100_percent_thin_pools_returns_true keep the existing test style (see the inline comments on how to make them actually load-bearing).

The rest is style / trade-off and can be discussed inline.

Comment thread e2e/tests/sds_node_configurator_test.go Outdated
Comment thread e2e/tests/sds_node_configurator_test.go Outdated
Comment thread images/agent/internal/controller/lvg/reconciler_test.go
Comment thread images/agent/internal/controller/lvg/reconciler_test.go
Comment thread images/agent/internal/controller/lvg/reconciler.go
Comment thread images/agent/internal/controller/lvg/reconciler.go Outdated
Comment thread images/agent/internal/controller/lvg/reconciler.go Outdated
Comment thread images/agent/internal/controller/lvg/reconciler.go Outdated
Comment thread images/agent/internal/utils/commands.go
Comment thread images/agent/internal/controller/lvg/reconciler.go Outdated
@IvanOgurchenok IvanOgurchenok force-pushed the fix-thin-pool-size-calculation branch from 532fac1 to 4991879 Compare April 21, 2026 23:39
duckhawk
duckhawk previously approved these changes Apr 22, 2026
@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ❌

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-24818686607
Status: Tests failed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ❌

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-24879352586
Status: Tests failed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

@IvanOgurchenok IvanOgurchenok changed the title fix: calculate thin-pool requested size based on actual VG size, resize size fix, 100% new size fix fix(lvg): align thin-pool and PV resize sizing with LVM May 27, 2026
@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ✅

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-26496357612
Status: Tests passed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

Since commit efa5f61 removed the resizeDelta tolerance in favor of exact
extent-aligned size comparisons, the controller started failing validation
checks when applying percentage-based thin-pool sizes. It originally
calculated the requested bytes from raw BlockDevice capacities (and redundantly
re-calculated it on every loop iteration) instead of the actual VG space.
The VG space is smaller due to metadata reservations and extent alignments.

This updates reconcileLVGCreateFunc to query the actual VG size from LVM
after creation, and evaluates requested pool bytes based on this real size
before creating pools. This eliminates discrepancies between the creation
and validation stages. It also moves the fallback calculation outside of
the thin-pools loop.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
This addresses additional regressions caused by the removal of resizeDelta:
1. Infinite pvresize loop: The controller falsely interpreted physical
loss to LVM metadata as raw disk expansion. It now fetches exact pe_start
values from LVM to accurately determine the PV metadata header size and
verify if physical BlockDevices genuinely expanded.
2. False-positive validation on 100% pools: Added an extent-sized tolerance
multiplier to pre-creation validation, allowing mathematically rounded
full-disk requests to pass without failing "Required space" checks.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Add unit coverage for 100% thin-pool validation in create and update flows, and extend the E2E scenario to verify VirtualDisk expansion updates BlockDevice and VG size.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Improve LVG validation and create/update flow safety around extent tolerance,
cached VG extent usage, and transient GetVG failures. Split pvresize E2E into
a dedicated spec and add regression coverage for tolerance and pe_start parsing.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Rebase the LVG pvresize scenario onto the current main test structure while keeping the review-driven checks for PV growth, absence of PVResizeFailed, and stable pvresize convergence.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Align PVData fields with gofmt/gci expectations so the agent linter no longer fails on the PEStart-related type update.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Add regression tests for reconcileLVGCreateFunc so percent thin-pool sizing is driven by the actual VG size after vgcreate and transient GetVG failures trigger a safe retry instead of continuing on raw device size.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Added missing resources used by Common Scheduler + Sds Node Configurator
E2E tests in nested cluster:
- localstorageclasses.storage.deckhouse.io
- core: namespaces, events, configmaps, secrets, leases
- deckhouse.io: modules, moduleconfigs, modulepulloverrides
This eliminates potential RBAC issues during setup/cleanup.
Nested cluster is ephemeral, so broad permissions are safe.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Use the selected BlockDevice metadata label as the stable selector and include LVG conditions in the readiness failure message so CI failures expose the pending reason.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Move pvresize helper functions into the split E2E helper file and remove the duplicate pvresize scenario left by rebasing the old monolithic test layout.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Remove duplicated e2e validation scenario, keep e2e RBAC aligned with main for verification, and normalize test comments before rerunning the full e2e suite.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Run pvs with sudo in the pvresize e2e check and include command output in failures so CI logs show the real LVM or permission error.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ✅

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-27333288320
Status: Tests passed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

Allow existing thin pools to be up to one extent above the aligned requested size so LVM rounding does not leave LVMVolumeGroups unapplied. Extend e2e coverage for VGConfigurationApplied and percent LVMLogicalVolume growth after vgextend.

Signed-off-by: Ivan Ogurchenok <ivan.ogurchenok@flant.com>
@github-actions

Copy link
Copy Markdown
Contributor

E2E Smoke Tests Results ❌

Module: sds-node-configurator
Framework: storage-e2e
Namespace: e2e-sds-node-configurator-pr223-27693519191
Status: Tests failed

View Details

Tests were executed via storage-e2e (TestSdsNodeConfigurator: Common Scheduler Extender, then Sds Node Configurator).

Artifacts: Test logs are available in workflow artifacts.

Comment on lines +800 to +806
extentTolerance := int64(len(lvg.Spec.ThinPools)) * extentSizeVal
allowedSpace := totalVGSize.Value() + extentTolerance

if totalThinPoolSize > allowedSpace {
r.log.Trace(fmt.Sprintf("[validateLVGForCreateFunc] total thin pool size: %s, allowed space: %d", resource.NewQuantity(totalThinPoolSize, resource.BinarySI).String(), allowedSpace))
r.log.Warning(fmt.Sprintf("[validateLVGForCreateFunc] requested thin pool size is more than VG total size for the LVMVolumeGroup %s", lvg.Name))
reason.WriteString(fmt.Sprintf("Required space for thin-pools %d is more than VG size %d.", totalThinPoolSize, totalVGSize.Value()))
reason.WriteString(fmt.Sprintf("Required space for thin-pools %d is more than allowed space %d (VG size %d + extent tolerance %d). ", totalThinPoolSize, allowedSpace, totalVGSize.Value(), extentTolerance))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architectural concern — extent tolerance reintroduced into validation.

These create/update validators gate thin-pool sizing with an extent-sized tolerance layered on top of the deterministic AlignSizeToExtent:

  • create: allowedSpace = totalVGSize + len(lvg.Spec.ThinPools)*extent (here)
  • update, adding pools: allowedFreeSpace = totalFreeSpace + addingPoolsCount*extent (L964–L967)
  • update, existing pool: allowedActualSize = alignedTpSize + extent (L940)

The extent-alignment refactoring (#201) deliberately replaced the fixed resizeDelta tolerance with deterministic up-alignment, precisely to stop relying on a slack value. The design rationale explicitly rejected the "just increase the tolerance" alternative, because a tolerance masks real discrepancies and scales with the number of objects.

In the create path allowedSpace grows with pool count, so N pools can each overshoot by up to one extent and still pass validation — only to fail later at lvcreate. That is the same scaling-slack behavior the refactoring set out to remove.

Question: was a deterministic alternative considered instead of a tolerance — e.g. comparing the sum of aligned pool sizes against the (aligned) VG size, or capping each pool at the remaining VG (min(aligned, remaining)), mirroring the create path's alignedTpSize >= vgSize → CreateThinPoolFullVGSpace? That keeps the request→validate→create round-trip deterministic and avoids reintroducing the slack #201 eliminated. If the tolerance is intentional, a short comment explaining why one extent per pool is the correct, bounded allowance would help future readers.

@szhem

szhem commented Jun 17, 2026

Copy link
Copy Markdown

Scope question: are relative sizes of LVMLogicalVolumes in scope here, or only thin-pool (LVG) sizing + PV resize?

This PR changes the LVG reconciler (thin-pool create/validate), adds PVData.PEStart, and reworks PV resize. The LVMLogicalVolume paths — images/agent/internal/controller/llv/reconciler.go (reconcileLLVCreateFunc / reconcileLLVUpdateFunc) and llv_extender/reconciler.go (ReconcileLVMLogicalVolumeExtension) — are not touched. Those were already aligned to extents in #201, so this is not a regression.

I'm asking to make the scope explicit: if "relative LV sizes" is also meant to cover percent-sized LLVs (e.g. lvm-thick / lvm-thin PVCs with Spec.Size in %), that path is outside this PR's diff and would warrant its own coverage — in particular on a non-default physical extent size (e.g. 16 MiB), where percentage rounding behaves differently.

If the intent is strictly thin-pool sizing + PV resize, this is fine as-is — just flagging so reviewers don't assume the LLV side is covered by this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants