fix: add clusterCapacity support for drive-sharing clusters (OP-329) #2586
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
Claude finished @kristina-solovyova's task in 6m 50s

Claude PR Review: clusterCapacity for drive-sharing (OP-329)

Reviewed the full diff (37 files, ~4070 LOC). This is a well-structured, heavily tested addition: the planner is split cleanly into driver/ephemeral/explicit/assignment modules, error propagation is generally correct, and edge cases (zero nodes, zero QLC FDs, pure-QLC clusters) are mostly guarded. Findings below, ordered by severity. The submodule

🔴 HIGH

1. Error swallowing in

info, err := nodeInfoGetter(ctx, nodeName)
if err != nil {
// Log and skip — a missing annotation is not fatal for planning.
continue
}

The comment says "Log and skip," but nothing is logged, and

2. Pool-assignment regression for drive-sharing clusters with no cluster-level

allSameType := false
if cluster.Spec.Dynamic != nil && cluster.Spec.Dynamic.DriveTypesRatio != nil &&
(cluster.Spec.Dynamic.DriveTypesRatio.Qlc == 0 || cluster.Spec.Dynamic.DriveTypesRatio.Tlc == 0) {
allSameType = true
}

The

🟡 MEDIUM

3. Explicit-mode anchor recognition breaks when a pinned node falls out of inventory

4. Wrong QLC denominator in covered fast-path

PerFDGiB: util.CeilDiv(tlcRaw, numTLC) + util.CeilDiv(qlcRaw, numTLC),

QLC is divided by

5. Three new unguarded

🟢 LOW / confirm
✅ Verified correct (no action)
Bottom line: Findings #1 (error swallow) and #2 (pool default) are the two I'd want resolved/confirmed before merge; #3 is a real idempotency edge case worth fixing; the rest are confirm/cleanup. Nice test coverage overall.
21b73b5 to ca5ac92
Claude finished @kristina-solovyova's task in 1m 50s

Claude PR Review (re-review): clusterCapacity for drive-sharing (OP-329)

The branch was updated after my first review (commit

✅ Resolved since last review

#1 (HIGH) Error swallowing in

#2 (HIGH) Pool-assignment default for drive-sharing: fixed.

#4 (MEDIUM)

#5 (MEDIUM) Unguarded

🟡 Still open

#3 (MEDIUM) Explicit-mode anchor recognition still keyed only on live

for _, e := range existing {
fd := nodeFD[e.PinnedNode]
if fd == "" {
continue
}
...
}

The ephemeral planner was updated to key idempotency on the persisted

The fields needed for the robust fix already exist on the

Minor (still present): the dead guard at

Notes
Bottom line: strong iteration. The two HIGH findings and two of the MEDIUMs are cleanly resolved. #3 (explicit-mode anchor recognition) is the one remaining correctness edge case I'd want closed before merge; the dead-guard cleanup pairs with it.
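One possible shape for the #3 fix, sketched under stated assumptions: `existingEntry` and its `PersistedFD` field are hypothetical names (only `PinnedNode` and the `nodeFD` lookup appear in the quoted loop), and preferring the persisted value over the live map is an assumption modeled on the description of the ephemeral planner, not a confirmed detail of the PR.

```go
package main

import "fmt"

// existingEntry is a hypothetical shape for a previously planned entry.
// PinnedNode appears in the quoted loop; PersistedFD is an assumed field
// holding the failure domain recorded when the entry was first planned.
type existingEntry struct {
	PinnedNode  string
	PersistedFD string
}

// anchorFD resolves the failure domain for an existing entry. It prefers
// the persisted value, so the answer is stable when the pinned node has
// dropped out of inventory, and falls back to the live node-to-FD map.
func anchorFD(e existingEntry, nodeFD map[string]string) (string, bool) {
	if e.PersistedFD != "" {
		return e.PersistedFD, true
	}
	if fd := nodeFD[e.PinnedNode]; fd != "" {
		return fd, true
	}
	return "", false
}

// anchoredPerFD counts existing entries per failure domain, skipping only
// entries that cannot be resolved by either source.
func anchoredPerFD(existing []existingEntry, nodeFD map[string]string) map[string]int {
	counts := make(map[string]int)
	for _, e := range existing {
		fd, ok := anchorFD(e, nodeFD)
		if !ok {
			continue
		}
		counts[fd]++
	}
	return counts
}

func main() {
	// node-2 is no longer in the live inventory, but its entry still
	// carries the failure domain it was planned against.
	nodeFD := map[string]string{"node-1": "fd-a"}
	existing := []existingEntry{
		{PinnedNode: "node-1"},
		{PinnedNode: "node-2", PersistedFD: "fd-b"},
		{PinnedNode: "node-3"}, // unresolvable by either source
	}
	fmt.Println(anchoredPerFD(existing, nodeFD))
}
```

With a lookup keyed only on the live map, the `node-2` entry would be dropped and the planner could re-place it on the next pass; the fallback keeps it counted as an anchor.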

No description provided.