I noticed that the controller logic has a small inconsistency in how it handles deleted nodes in rule status.
When a node is removed from the cluster, status.nodeEvaluations gets cleaned up correctly, but status.failedNodes can still retain entries for that deleted node. This leads to situations where the rule status continues to report failures for nodes that no longer exist.
From what I can see:
- cleanupDeletedNodes filters nodeEvaluations,
- but failedNodes isn’t updated accordingly,
- and updateRuleStatus persists those stale entries.
So over time, failedNodes can accumulate stale entries.
Ideally, both nodeEvaluations and failedNodes should stay in sync, so once a node is gone, it shouldn’t appear in either.
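
A minimal sketch of what keeping the two in sync could look like, assuming the cleanup step has access to the set of nodes that currently exist. The RuleStatus and NodeEvaluation types, the pruneDeletedNodes helper, and the assumption that failedNodes is a list of node names are all hypothetical stand-ins for illustration, not the controller's actual definitions:

```go
package main

import "fmt"

// NodeEvaluation is a hypothetical, simplified per-node evaluation entry.
type NodeEvaluation struct {
	NodeName string
	Passed   bool
}

// RuleStatus is a hypothetical, simplified stand-in for the rule's status.
// FailedNodes is assumed here to be a plain list of node names.
type RuleStatus struct {
	NodeEvaluations []NodeEvaluation
	FailedNodes     []string
}

// pruneDeletedNodes removes entries for nodes that are not in liveNodes
// from both NodeEvaluations and FailedNodes, so the two lists cannot drift.
func pruneDeletedNodes(status *RuleStatus, liveNodes map[string]struct{}) {
	evals := make([]NodeEvaluation, 0, len(status.NodeEvaluations))
	for _, e := range status.NodeEvaluations {
		if _, ok := liveNodes[e.NodeName]; ok {
			evals = append(evals, e)
		}
	}
	status.NodeEvaluations = evals

	failed := make([]string, 0, len(status.FailedNodes))
	for _, name := range status.FailedNodes {
		if _, ok := liveNodes[name]; ok {
			failed = append(failed, name)
		}
	}
	status.FailedNodes = failed
}

func main() {
	status := &RuleStatus{
		NodeEvaluations: []NodeEvaluation{
			{NodeName: "node-a", Passed: true},
			{NodeName: "node-b", Passed: false},
		},
		FailedNodes: []string{"node-b"},
	}

	// node-b has been deleted from the cluster; only node-a remains.
	live := map[string]struct{}{"node-a": {}}
	pruneDeletedNodes(status, live)

	fmt.Println("evaluations:", len(status.NodeEvaluations))
	fmt.Println("failed nodes:", len(status.FailedNodes))
}
```

Filtering both lists in the same helper means that whatever updateRuleStatus persists afterwards should only reference nodes that still exist.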