Skip to content

Add early returns on resource changes in ServerReconciler#539

Open
nagadeesh-nagaraja wants to merge 6 commits intomainfrom
consmeticImp
Open

Add early returns on resource changes in ServerReconciler#539
nagadeesh-nagaraja wants to merge 6 commits intomainfrom
consmeticImp

Conversation

@nagadeesh-nagaraja
Copy link
Copy Markdown
Contributor

@nagadeesh-nagaraja nagadeesh-nagaraja commented Nov 25, 2025

Proposed Changes

Server Reconciler avoid double reconciliation logic. This helps avoid overloading BMC, multiple reconcile during sequencial l update

Summary by CodeRabbit

  • New Features

    • System information now includes BIOS version tracking.
  • Improvements

    • Enhanced server status updates and boot configuration management.
    • Improved detection and handling of server configuration changes.
    • Better server state transition management with more efficient update processing.

✏️ Tip: You can customize this high-level summary in your review settings.

@Nuckal777
Copy link
Copy Markdown
Contributor

Thanks, could you please remove the changes of #540 from this PR?

@nagadeesh-nagaraja
Copy link
Copy Markdown
Contributor Author

Thanks, could you please remove the changes of #540 from this PR?

I will rebase once that PR merges. this PR will fail unit tests without the test fixes in the other PR :D

@github-actions github-actions Bot added size/L and removed size/XL labels Nov 27, 2025
@nagadeesh-nagaraja nagadeesh-nagaraja changed the title Fix unwanted reconcile in Server and enqueue logic in maintenance Fix unwanted reconcile in Server Dec 4, 2025
@nagadeesh-nagaraja nagadeesh-nagaraja changed the title Fix unwanted reconcile in Server Fix unwanted reconcile in Server Controller Dec 4, 2025
Comment on lines +241 to +243
if modified {
return ctrl.Result{}, nil
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not doing this right after the ensureServerStateTransition call?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest doing something like that here:

if modified, err := r.ensureServerStateTransition(ctx, log, bmcClient, server); err != nil || modified {
   return ctrl.Result{}, err
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because we like to update the status field before ending the reconcile... as use the status field to make decision in beginning of reconcile loop, i wanted to update the status before next reconcile.
I know updating status might modify the CRD by itself.. but I think that next improvement to tackle, what do you think?

@afritzler afritzler changed the title Fix unwanted reconcile in Server Controller Add early returns on resource changes in ServerReconciler Dec 9, 2025
@afritzler afritzler added enhancement New feature or request and removed bug Something isn't working labels Dec 9, 2025
@afritzler
Copy link
Copy Markdown
Member

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 15, 2026

Walkthrough

Extends SystemInfo with a BIOSVersion field in the BMC layer and refactors the server controller to use CreateOrPatch pattern with modification signaling, enabling callers to detect and conditionally requeue based on whether reconciliation steps mutated server state.

Changes

Cohort / File(s) Summary
BMC Layer
bmc/bmc.go, bmc/redfish.go
Added BIOSVersion field to SystemInfo struct and populated it in GetSystemInfo() from upstream system data.
Server Controller
internal/controller/server_controller.go
Refactored reconciliation methods to return (bool, error) instead of error only, replacing direct Patch() calls with CreateOrPatch() to detect mutations. Affected functions: ensureServerBootConfigRef(), extractServerDetailsFromRegistry(), applyBootConfigurationAndIgnitionForDiscovery(), updateServerStatusFromSystemInfo(), ensureInitialBootConfigurationIsDeleted(), patchServerState(), ensureServerPowerState(), and status update functions. Controller now immediately requeues on detected modifications during state transitions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

api-change

Suggested reviewers

  • Nuckal777
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Description check ❓ Inconclusive The description is incomplete and contains grammatical errors. It lacks detailed explanation of changes, fixes section, and structured information compared to the template. Expand description with specific changes made, clarify the problem being solved, include a 'Fixes #' section, and correct grammatical issues ('consmeticImp' branch name, 'sequencial l update').
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately reflects the main change: adding early returns on resource changes in ServerReconciler to avoid unnecessary reconciliation cycles.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In `@internal/controller/server_controller.go`:
- Around line 1044-1067: The updatePowerOffCondition function is using the
PoweringOnCondition constant while setting reason "ServerPowerOff" and message
"Server is powering off", which is semantically inconsistent; change the
condition type used in updatePowerOffCondition (the call to acc.UpdateSlice)
from PoweringOnCondition to a clearly named constant such as
PoweringOffCondition (or a generic PowerStateCondition) and ensure that the
chosen constant is declared and used consistently (and updated in any related
readers/guards) so the condition type matches the reason/message set by
updatePowerOffCondition.
- Around line 239-243: Fix the typo in the comment inside the Server state
transition block: change "there there as an event to process the new state" to
"there is an event to process the new state" near the modified check (the block
that returns ctrl.Result{}, nil when modified is true) so the comment reads
clearly.
- Around line 407-409: The current return ignores the modification flag from
ensureInitialBootConfigurationIsDeleted; change the return to propagate the
modified boolean (i.e., return modified, err) so callers receive the true/false
modification signal when err is nil, matching the file's existing pattern and
enabling the expected early-return behavior in the surrounding function.
- Around line 1019-1042: The updatePowerOnCondition function is incorrectly
using controllerutil.CreateOrPatch (which doesn't update the status
subresource); change it to use the Status writer: deep-copy the original server
(e.g., serverBase := server.DeepCopy()), mutate only server.Status.Conditions
with conditionutils.NewAccessor and acc.UpdateSlice (as you already do), then
call r.Status().Patch(ctx, server, client.MergeFrom(serverBase)) and handle the
error; return true when the patch succeeds (or after comparing pre/post
condition slices to detect a real change) and return false on no-op or error.
Apply the same pattern to updatePowerOffCondition.
- Around line 367-384: Replace status-only CreateOrPatch calls with
Status().Patch using the same merge-from pattern used elsewhere: take a copy of
the existing server object (e.g., serverBase), modify
server.Status.NetworkInterfaces and server.Status.State, then call
r.Status().Patch(ctx, server, client.MergeFrom(serverBase)) instead of
controllerutil.CreateOrPatch; apply this change for the occurrence in the block
updating server status (currently using controllerutil.CreateOrPatch) and the
other listed occurrences (around lines noted in the review), keeping the
surrounding error handling and the subsequent call to
r.invalidateRegistryEntryForServer unchanged.
♻️ Duplicate comments (1)
internal/controller/server_controller.go (1)

557-614: Same issue: CreateOrPatch used for status-only updates.

This function updates exclusively status fields (BIOSVersion, PowerState, SerialNumber, Processors, Storages, etc.) but uses CreateOrPatch which operates on the main resource, not the status subresource. This is the same issue flagged earlier and should use Status().Patch() instead.

🧹 Nitpick comments (1)
internal/controller/server_controller.go (1)

907-924: Redundant status assignment before CreateOrPatch.

Line 911 sets server.Status.State = state but this assignment is immediately overwritten by the CreateOrPatch mutate function at line 914. The assignment at line 911 should be removed.

Additionally, this function updates a status field via CreateOrPatch which has the same concern noted earlier.

Suggested fix
 func (r *ServerReconciler) patchServerState(ctx context.Context, server *metalv1alpha1.Server, state metalv1alpha1.ServerState) (bool, error) {
 	if server.Status.State == state {
 		return false, nil
 	}
-	server.Status.State = state

 	opResult, err := controllerutil.CreateOrPatch(ctx, r.Client, server, func() error {
 		server.Status.State = state
 		return nil
 	})
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 493d587 and 9f791d3.

📒 Files selected for processing (3)
  • bmc/bmc.go
  • bmc/redfish.go
  • internal/controller/server_controller.go
🧰 Additional context used
📓 Path-based instructions (1)
**/*.go

📄 CodeRabbit inference engine (AGENTS.md)

**/*.go: Code must follow standard Go formatting and idioms; use clear, explicit error handling with minimal global state
Use small, focused functions; introduce interfaces only when justified

Files:

  • bmc/bmc.go
  • bmc/redfish.go
  • internal/controller/server_controller.go
🧠 Learnings (8)
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*_controller.go : Flag non-idempotent reconcile logic, Spec mutations during reconciliation, and custom patterns duplicating standard Kubernetes behavior

Applied to files:

  • internal/controller/server_controller.go
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*_controller.go : Prevent hot loops caused by status-only changes triggering reconciliation

Applied to files:

  • internal/controller/server_controller.go
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*_controller.go : Status updates must be performed via `Status().Update()` or `Status().Patch()` and separated from spec mutations

Applied to files:

  • internal/controller/server_controller.go
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*_controller.go : Reconciliation must be based on current cluster state, not assumptions; handle NotFound errors correctly and gracefully manage partial failures

Applied to files:

  • internal/controller/server_controller.go
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*_controller.go : Use `client.Client`, `controllerutil.CreateOrUpdate`, and OwnerReferences from controller-runtime rather than direct API calls

Applied to files:

  • internal/controller/server_controller.go
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*_controller.go : Status must reflect observed state, not desired state; conditions must be updated consistently and deterministically

Applied to files:

  • internal/controller/server_controller.go
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*_controller.go : Reconcile logic must be idempotent and safe to re-run at any time without side effects

Applied to files:

  • internal/controller/server_controller.go
📚 Learning: 2026-01-09T15:31:43.862Z
Learnt from: CR
Repo: ironcore-dev/metal-operator PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-09T15:31:43.862Z
Learning: Applies to **/controllers/**/*.go : Include relevant identifiers in logs; emit events for meaningful user-facing state changes

Applied to files:

  • internal/controller/server_controller.go
🧬 Code graph analysis (3)
bmc/bmc.go (1)
api/v1alpha1/biosversion_types.go (1)
  • BIOSVersion (130-136)
bmc/redfish.go (1)
api/v1alpha1/biosversion_types.go (1)
  • BIOSVersion (130-136)
internal/controller/server_controller.go (5)
bmc/bmc.go (7)
  • BMC (36-123)
  • Server (182-189)
  • PowerState (233-233)
  • Manufacturer (19-19)
  • Processor (252-271)
  • Storage (222-230)
  • NetworkInterface (176-180)
api/v1alpha1/server_types.go (8)
  • Server (428-434)
  • Power (13-13)
  • PowerOn (17-17)
  • PowerOff (20-20)
  • ServerPowerState (30-30)
  • Processor (249-285)
  • Storage (395-411)
  • NetworkInterface (288-315)
internal/api/registry/server.go (2)
  • Server (17-26)
  • NetworkInterface (8-14)
internal/registry/server.go (1)
  • Server (25-31)
api/v1alpha1/serverbootconfiguration_types.go (1)
  • ServerBootConfiguration (62-68)
🔇 Additional comments (8)
bmc/bmc.go (1)

286-287: LGTM!

The BIOSVersion field addition to SystemInfo struct follows the existing pattern for system metadata fields and enables BIOS version tracking from the BMC.

bmc/redfish.go (1)

297-297: LGTM!

The BIOSVersion field is correctly populated from system.BIOSVersion, consistent with the existing GetBiosVersion method and the struct field addition.

internal/controller/server_controller.go (6)

833-905: LGTM!

The refactored function signature returning ([]metalv1alpha1.NetworkInterface, error) separates data fetching from object mutation, improving code clarity and testability. The caller correctly handles the nil case when the server is not found in the registry.


502-521: LGTM!

Correct usage of CreateOrPatch for updating server.Spec.BootConfigurationRef with proper modification signaling.


624-653: LGTM!

Correct usage of CreateOrPatch for ServerBootConfiguration (a separate resource) with proper early return on modification. The modification signaling pattern is correctly implemented.


954-1017: LGTM!

The modification signaling correctly distinguishes between BMC power operations (which don't modify K8s resources) and condition updates (which do). The early return pattern is properly implemented.


449-460: LGTM!

Correct usage of CreateOrPatch for removing ServerClaimRef with proper modification signaling.


389-399: LGTM!

Correct usage of CreateOrPatch for server.Spec.Power (Spec field) with proper early return on modification.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

Comment on lines +239 to +243
// Server has beem modified during state transition, requeue to process the new state
// no need to requeAfter duration as there there as an event to process the new state
if modified {
return ctrl.Result{}, nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor typo in comment.

Line 240 has a typo: "there there" should be "there is".

Suggested fix
 	// Server has beem modified during state transition, requeue to process the new state
-	// no need to requeAfter duration as there there as an event to process the new state
+	// no need to requeue after duration as there is an event to process the new state
 	if modified {
 		return ctrl.Result{}, nil
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Server has beem modified during state transition, requeue to process the new state
// no need to requeAfter duration as there there as an event to process the new state
if modified {
return ctrl.Result{}, nil
}
// Server has beem modified during state transition, requeue to process the new state
// no need to requeue after duration as there is an event to process the new state
if modified {
return ctrl.Result{}, nil
}
🤖 Prompt for AI Agents
In `@internal/controller/server_controller.go` around lines 239 - 243, Fix the
typo in the comment inside the Server state transition block: change "there
there as an event to process the new state" to "there is an event to process the
new state" near the modified check (the block that returns ctrl.Result{}, nil
when modified is true) so the comment reads clearly.

Comment on lines +367 to 384
opResult, err = controllerutil.CreateOrPatch(ctx, r.Client, server, func() error {
server.Status.NetworkInterfaces = nics
server.Status.State = metalv1alpha1.ServerStateAvailable
return nil
})
if err != nil {
return false, fmt.Errorf("failed to patch server status: %w", err)
}
if err := r.invalidateRegistryEntryForServer(log, server); err != nil {
return false, fmt.Errorf("failed to invalidate registry entry for server: %w", err)
}
log.V(1).Info("Removed Server from Registry")

log.V(1).Info("Setting Server state set to available")
if modified, err := r.patchServerState(ctx, server, metalv1alpha1.ServerStateAvailable); err != nil || modified {
return false, err
// make sure to update the Status before reconciling again
if opResult != controllerutil.OperationResultNone {
log.V(1).Info("Patched Server status with network interfaces")
return true, nil
}
return false, nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check how status updates are handled elsewhere in the codebase
echo "=== Checking Status().Patch() and Status().Update() patterns ==="
rg -n 'Status\(\)\.(Update|Patch)' internal/controller/ --type go

echo ""
echo "=== Checking CreateOrPatch usage patterns ==="
rg -n 'CreateOrPatch' internal/controller/ --type go -A 5 -B 2

echo ""
echo "=== Finding server_controller.go and examining lines 367-384 ==="
fd 'server_controller.go' internal/controller/ --exec wc -l {}

echo ""
echo "=== Examining the exact code at lines 360-390 ==="
fd 'server_controller.go' internal/controller/ --exec sed -n '360,390p' {}

Repository: ironcore-dev/metal-operator

Length of output: 29122


Use Status().Patch() instead of CreateOrPatch for status-only updates.

Lines 367-370 use CreateOrPatch to update server.Status.NetworkInterfaces and server.Status.State. In Kubernetes, status is a subresource that requires Status().Patch() or Status().Update() to persist correctly. The same file demonstrates the correct pattern at line 536: r.Status().Patch(ctx, server, client.MergeFrom(serverBase)). Replace this usage and similar instances throughout the file (lines 557, 913, 1020, 1045) with the Status().Patch() approach to ensure status changes are handled properly and consistently with Kubernetes conventions.

🤖 Prompt for AI Agents
In `@internal/controller/server_controller.go` around lines 367 - 384, Replace
status-only CreateOrPatch calls with Status().Patch using the same merge-from
pattern used elsewhere: take a copy of the existing server object (e.g.,
serverBase), modify server.Status.NetworkInterfaces and server.Status.State,
then call r.Status().Patch(ctx, server, client.MergeFrom(serverBase)) instead of
controllerutil.CreateOrPatch; apply this change for the occurrence in the block
updating server status (currently using controllerutil.CreateOrPatch) and the
other listed occurrences (around lines noted in the review), keeping the
surrounding error handling and the subsequent call to
r.invalidateRegistryEntryForServer unchanged.

Comment on lines +407 to 409
if modified, err := r.ensureInitialBootConfigurationIsDeleted(ctx, server); err != nil || modified {
return false, err
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Inconsistent return pattern: modification flag not propagated.

The pattern return false, err at line 408 ignores the modified flag when there's no error. If ensureInitialBootConfigurationIsDeleted returns (true, nil), this code returns (false, nil), losing the modification signal and preventing the expected early return behavior.

Should be return modified, err for consistency with other patterns in this file.

Suggested fix
-	if modified, err := r.ensureInitialBootConfigurationIsDeleted(ctx, server); err != nil || modified {
-		return false, err
-	}
+	if modified, err := r.ensureInitialBootConfigurationIsDeleted(ctx, server); err != nil || modified {
+		return modified, err
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if modified, err := r.ensureInitialBootConfigurationIsDeleted(ctx, server); err != nil || modified {
return false, err
}
if modified, err := r.ensureInitialBootConfigurationIsDeleted(ctx, server); err != nil || modified {
return modified, err
}
🤖 Prompt for AI Agents
In `@internal/controller/server_controller.go` around lines 407 - 409, The current
return ignores the modification flag from
ensureInitialBootConfigurationIsDeleted; change the return to propagate the
modified boolean (i.e., return modified, err) so callers receive the true/false
modification signal when err is nil, matching the file's existing pattern and
enabling the expected early-return behavior in the surrounding function.

Comment on lines +1019 to 1042
func (r *ServerReconciler) updatePowerOnCondition(ctx context.Context, server *metalv1alpha1.Server) (bool, error) {
opResult, err := controllerutil.CreateOrPatch(ctx, r.Client, server, func() error {
acc := conditionutils.NewAccessor(conditionutils.AccessorOptions{})
err := acc.UpdateSlice(
&server.Status.Conditions,
PoweringOnCondition,
conditionutils.UpdateStatus(metav1.ConditionTrue),
conditionutils.UpdateReason("ServerPowerOn"),
conditionutils.UpdateMessage("Server is powering on"),
conditionutils.UpdateObserved(server),
)
if err != nil {
return fmt.Errorf("failed to update powering on condition: %w", err)
}
return nil
})
if err != nil {
return false, fmt.Errorf("failed to create or patch server powering on condition: %w", err)
}
if opResult == controllerutil.OperationResultUpdatedStatus || opResult == controllerutil.OperationResultUpdatedStatusOnly {
return true, nil
}
return false, nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

controller-runtime CreateOrPatch status subresource updates Kubernetes

💡 Result:

Short answer: CreateOrPatch (controllerutil.CreateOrPatch) does not update arbitrary subresources. It ignores non-status subresources, and it will only apply status changes if the object already exists — it will not create an object and set its status in the same call. To update status use the client's Status() writer (Status().Update or Status().Patch) after creation (or when the object exists). Requeue the reconcile if you need to set status immediately after create. [1][2]

References:

  • controllerutil.CreateOrPatch behaviour and status-note. [1]
  • controller-runtime client Status() (StatusWriter) usage (Status().Update / Status().Patch). [2]

🏁 Script executed:

# Find the server_controller.go file and examine the context around lines 1019-1042
fd -t f "server_controller.go" --exec wc -l {}

Repository: ironcore-dev/metal-operator

Length of output: 118


🏁 Script executed:

# Look for imports in server_controller.go to understand what utilities are being used
fd -t f "server_controller.go" --exec head -50 {}

Repository: ironcore-dev/metal-operator

Length of output: 1740


🏁 Script executed:

# Get the full updatePowerOnCondition and updatePowerOffCondition functions
sed -n '1019,1067p' ./internal/controller/server_controller.go

Repository: ironcore-dev/metal-operator

Length of output: 1980


🏁 Script executed:

# Search for other status update patterns in the file to understand the practice
rg -n "Status\(\)\.Update|Status\(\)\.Patch" ./internal/controller/server_controller.go

Repository: ironcore-dev/metal-operator

Length of output: 158


🏁 Script executed:

# Look for other uses of CreateOrPatch with status to see if this is a pattern
rg -B5 -A10 "CreateOrPatch.*Status" ./internal/controller/server_controller.go

Repository: ironcore-dev/metal-operator

Length of output: 53


Use Status().Patch() instead of CreateOrPatch() for status-only updates.

controllerutil.CreateOrPatch does not update status subresources—it patches the main resource object only. Per controller-runtime behavior, status updates must use the Status() writer (Status().Patch() or Status().Update()). The current check for OperationResultUpdatedStatus will not work as intended.

Both updatePowerOnCondition and updatePowerOffCondition should use Status().Patch() to update conditions, separating status mutations from spec changes. The file already demonstrates the correct pattern at line 536 with r.Status().Patch(ctx, server, client.MergeFrom(serverBase)).

🤖 Prompt for AI Agents
In `@internal/controller/server_controller.go` around lines 1019 - 1042, The
updatePowerOnCondition function is incorrectly using
controllerutil.CreateOrPatch (which doesn't update the status subresource);
change it to use the Status writer: deep-copy the original server (e.g.,
serverBase := server.DeepCopy()), mutate only server.Status.Conditions with
conditionutils.NewAccessor and acc.UpdateSlice (as you already do), then call
r.Status().Patch(ctx, server, client.MergeFrom(serverBase)) and handle the
error; return true when the patch succeeds (or after comparing pre/post
condition slices to detect a real change) and return false on no-op or error.
Apply the same pattern to updatePowerOffCondition.

Comment on lines +1044 to 1067
func (r *ServerReconciler) updatePowerOffCondition(ctx context.Context, server *metalv1alpha1.Server) (bool, error) {
opResult, err := controllerutil.CreateOrPatch(ctx, r.Client, server, func() error {
acc := conditionutils.NewAccessor(conditionutils.AccessorOptions{})
err := acc.UpdateSlice(
&server.Status.Conditions,
PoweringOnCondition,
conditionutils.UpdateStatus(metav1.ConditionTrue),
conditionutils.UpdateReason("ServerPowerOff"),
conditionutils.UpdateMessage("Server is powering off"),
conditionutils.UpdateObserved(server),
)
if err != nil {
return fmt.Errorf("failed to update powering off condition: %w", err)
}
return nil
})
if err != nil {
return fmt.Errorf("failed to update powering on condition: %w", err)
return false, fmt.Errorf("failed to create or patch server powering off condition: %w", err)
}
if opResult == controllerutil.OperationResultUpdatedStatus || opResult == controllerutil.OperationResultUpdatedStatusOnly {
return true, nil
}
return r.Status().Patch(ctx, server, client.MergeFrom(original))
return false, nil
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Condition type PoweringOnCondition used for power off state is confusing.

The updatePowerOffCondition function uses PoweringOnCondition (line 1049) with reason "ServerPowerOff" and message "Server is powering off". This is semantically inconsistent and potentially misleading. Using the same condition type for opposite power states with different reasons could cause confusion when interpreting server status.

Consider using a distinct condition type like PoweringOffCondition or a more generic PowerStateCondition to clearly represent the power state.

🤖 Prompt for AI Agents
In `@internal/controller/server_controller.go` around lines 1044 - 1067, The
updatePowerOffCondition function is using the PoweringOnCondition constant while
setting reason "ServerPowerOff" and message "Server is powering off", which is
semantically inconsistent; change the condition type used in
updatePowerOffCondition (the call to acc.UpdateSlice) from PoweringOnCondition
to a clearly named constant such as PoweringOffCondition (or a generic
PowerStateCondition) and ensure that the chosen constant is declared and used
consistently (and updated in any related readers/guards) so the condition type
matches the reason/message set by updatePowerOffCondition.

@afritzler
Copy link
Copy Markdown
Member

@coderabbitai do you think we should switch from Patch() to CreateOrPatch() for resource updates? I was under the assumption that CreateOrPatch should be used when a resource is initially created?

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jan 15, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🌐 Web query:

controller-runtime CreateOrPatch best practices when to use

💡 Result:

Short answer

  • Use controllerutil.CreateOrPatch when you want a simple, idempotent "ensure this object exists and has these fields" helper in your reconciler and you are okay with a client-side mutate→patch flow (simple controllers that "own" the child object). [1]

When NOT to use CreateOrPatch (common pitfalls)

  • You need optimistic locking / strategic-merge semantics or fine-grained Patch control — CreateOrPatch hides patch mechanics and does not give you explicit optimistic-update control. [1][3]
  • Your MutateFn vs API defaults will cause spurious patches (defaulted fields on GET vs zero-values in your object make equality checks fail). [1]
  • You want strict separation of status updates vs spec updates (CreateOrPatch may implicitly compare/affect status handling and can be confusing). [1][3]
  • You rely on read-your-own-writes from the cache right after a write — cached clients and helper helpers can make reasoning about immediate cache state harder. [2][3]

When to prefer alternatives

  • Use client.Patch / client.Update when you need explicit patch type, optimistic concurrency (resourceVersion), strategic merge, or precise control over status vs spec updates. [1][3]
  • Use Server-Side Apply (SSA) when you want declarative ownership of fields and fewer merge surprises between actors. [4]
  • Use CreateOrUpdate / explicit Update flows when you need update semantics you control (but be mindful of the same caveats about cached clients). [1][3]

Practical rules-of-thumb

  • For simple owned children where you just want "create if missing, patch if changed" and you don't need SSA/strategic-merge/explicit resourceVersion handling → CreateOrPatch is fine. [1]
  • For cross-controller shared resources, CRs with webhooks/defaulting, or where you must avoid unnecessary API calls or enforce concurrency → prefer explicit Patch/Update or SSA. [1][3][4]
  • Always write MutateFn so it only changes intended fields and be mindful of defaulted fields to avoid always-patching. Requeue to set status on create if needed instead of relying on complex implicit behavior. [1]

Sources

  • controller-runtime controllerutil.CreateOrPatch docs. [1]
  • controller-runtime package notes / controller writing tips. [2]
  • Practical guide / caveats on helpers (CreateOrUpdate/CreateOrPatch) and client/cache behavior. [3]
  • Discussion about Server-Side Apply recommendation for controllers. [4]

@afritzler Good question! Actually, CreateOrPatch() is not limited to initial resource creation—it's commonly used for idempotent updates in reconcilers. The pattern is recommended for simple "ensure this object exists and has these fields" operations, particularly for owned child resources.

Why CreateOrPatch makes sense here

The key benefit in this PR is that CreateOrPatch() returns an OperationResult that indicates whether a modification occurred. This enables the early-return pattern to prevent double reconciliation, which aligns with the PR's goal of reducing multiple reconciliations during sequential updates.

Important considerations for this specific usage

However, there are some concerns with the current implementation:

  1. Status subresource updates: CreateOrPatch may not be ideal for status updates where you want strict separation between spec and status handling. For status updates, using Status().Patch() explicitly is typically more appropriate and avoids potential confusion.

  2. Using on the reconciled resource itself: CreateOrPatch is most straightforward for simple owned children where you "own" the child object. In this PR, you're using it on the Server resource itself (the resource being reconciled), which is less common.

  3. Defaulting concerns: Be mindful of defaulted fields—if API defaults don't match your mutate function's zero-values, you may trigger spurious patches.

Alternative approach

You could achieve the same modification-detection benefit by:

  • Using explicit Patch() or Status().Patch() calls
  • Comparing resource generation, observedGeneration, or specific field values before/after
  • Maintaining explicit control over patch semantics

Would you like me to look at specific examples in the code where the distinction between spec vs status updates matters?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

4 participants