
[CORE-12452] feat: auto-recover host-networked pods when node IP changes #4784

Open
coutinhop wants to merge 2 commits into tigera:master from coutinhop:pedro-CI-1951-1

Conversation

@coutinhop
Member

Description

Detect Calico's host-networked pods (calico-typha, calico-node, calico-node-windows) whose status.podIPs no longer includes the node's current InternalIP, and delete them so the owning Deployment / DaemonSet controller recreates them with the correct IP.

This works around an upstream Kubernetes limitation [1] where status.podIPs is immutable for hostNetwork pods once set: when a node's IP changes (e.g. KubeVirt VM reboot pulls a new DHCP lease), existing hostNetwork pods keep their old IP. The kube EndpointSlice controller reads from status.podIPs, so the calico-typha EndpointSlice ends up advertising stale IPs and Felix times out connecting to Typha. Restarting the container does not help — only deleting and recreating the pod itself causes the kubelet to repopulate status.podIPs from the current node IP.
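
A minimal sketch of the staleness check, using upstream client-go types; the helper name and the exact matching rule here are illustrative, not the PR's actual code:

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// isStale reports whether a host-networked pod's status.podIPs no longer
// reflects any of its node's current InternalIP addresses.
func isStale(pod *corev1.Pod, node *corev1.Node) bool {
	if !pod.Spec.HostNetwork || len(pod.Status.PodIPs) == 0 {
		return false // not host-networked, or the kubelet hasn't reported IPs yet
	}
	for _, addr := range node.Status.Addresses {
		if addr.Type != corev1.NodeInternalIP {
			continue
		}
		for _, podIP := range pod.Status.PodIPs {
			if podIP.IP == addr.Address {
				return false // the pod still carries a current node IP
			}
		}
	}
	return true // no pod IP matches any current InternalIP: stale
}
```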

Implementation lives in the existing Typha autoscaler tick (every 10s, already has a Node informer cache):

  • Compare each pod's status.podIPs to its node's status.InternalIP (which the kubelet does update promptly via heartbeat).
  • Delete stale pods, paced at most one batch per workload per tick (see the sketch after this list). Batch size is read from each workload's existing rolling-update setting: the Typha PDB's maxUnavailable, or the DaemonSet's updateStrategy.rollingUpdate.maxUnavailable, falling back to 1 if not set or if the resolved value is < 1 (minimum-progress guarantee).
  • Order: Typha first; if any Typha was deleted this cycle, skip the calico-node deletions until the next tick to give the new Typha pod a clean window to come up. Linux and Windows DaemonSets are paced independently of each other.
  • Skipped entirely on the non-cluster-host autoscaler instance.
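
A compact sketch of the pacing and ordering rules above; workload is a simplified stand-in for the three Calico workloads and deletePod abstracts the real client call, so treat names and shapes as assumptions:

```go
package sketch

import "k8s.io/apimachinery/pkg/util/intstr"

// workload is a simplified stand-in for calico-typha, calico-node, or
// calico-node-windows.
type workload struct {
	stalePods      []string            // pods whose status.podIPs no longer match the node IP
	maxUnavailable *intstr.IntOrString // from the Typha PDB or the DaemonSet rolling update
	replicas       int
	deletePod      func(name string) error // stands in for the real client call
}

// batchSize resolves maxUnavailable, falling back to 1 when it is unset or
// resolves below 1 (the minimum-progress guarantee).
func (w *workload) batchSize() int {
	if w.maxUnavailable == nil {
		return 1
	}
	n, err := intstr.GetScaledValueFromIntOrPercent(w.maxUnavailable, w.replicas, false)
	if err != nil || n < 1 {
		return 1
	}
	return n
}

// deleteOneBatch deletes up to one batch of stale pods and returns the count.
func (w *workload) deleteOneBatch() int {
	deleted, batch := 0, w.batchSize()
	for _, name := range w.stalePods {
		if deleted >= batch {
			break
		}
		if w.deletePod(name) == nil {
			deleted++
		}
	}
	return deleted
}

// tick runs one recovery pass: Typha first; if any Typha pod was deleted
// this cycle, calico-node deletions wait for the next tick so the new Typha
// pod gets a clean window. The two DaemonSets pace independently.
func tick(typha, linuxNode, windowsNode *workload) {
	if typha.deleteOneBatch() > 0 {
		return
	}
	linuxNode.deleteOneBatch()
	windowsNode.deleteOneBatch()
}
```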

Tested by ODCN on KubeVirt: on a 3-node cluster with all node IPs changed, all calico-node and Typha pods recovered automatically without manual intervention.

[1] kubernetes/kubernetes#93897

Jira: CI-1951, CORE-12452

Release Note

Automatically recover Calico pods stranded with stale pod IPs after a node IP change (e.g. KubeVirt node reboot).

For PR author

  • Tests for change.
  • If changing pkg/apis/, run make gen-files
  • If changing versions, run make gen-versions

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • kind/bug if this is a bugfix.
    • kind/enhancement if this is a new feature.
    • enterprise if this PR applies to Calico Enterprise only.

The second commit adds a new Installation.Spec.StalePodIPRecovery field
(Enabled / Disabled, default Enabled) that gates the host-networked stale
pod IP detection and deletion logic in the Typha autoscaler. When set to
Disabled, the entire detection path is skipped each tick.

The default-on choice is consistent with other operator-managed
automation (e.g. the typha autoscaler is itself always-on with no
toggle), avoids opt-in friction for users who don't know the bug
exists, and provides an escape hatch for environments where the
detection might interact badly with custom node-IP management.

Implementation notes:
  - api/v1: new StalePodIPRecoveryType enum and IsStalePodIPRecoveryEnabled
    helper, modeled on the existing FIPSMode pattern (see the sketch after
    these notes). nil is treated as Enabled so the default-on behavior is
    encoded in one place.
  - typha_autoscaler.go: new optional func() bool field on the autoscaler
    consulted at the top of each tick. Wired via the existing option
    pattern (typhaAutoscalerOptionStalePodIPRecoveryEnabled) so tests can
    inject true / false / nil. A nil getter is treated as enabled, which
    keeps existing tests and the non-cluster-host autoscaler path
    unchanged.
  - core_controller.go: the closure reads the Installation named "default"
    from the manager's cached client at call time so toggles take effect
    on the next tick (~10s). Failures fall through to enabled — recovery
    is the safer default for the kubelet bug we're working around.
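
A sketch of the shapes listed in these notes: the api/v1 enum and helper modeled on FIPSMode, plus the autoscaler's optional getter. Names follow the description above, but the exact signatures are assumptions:

```go
package sketch

// StalePodIPRecoveryType mirrors the Enabled / Disabled gate described above.
type StalePodIPRecoveryType string

const (
	StalePodIPRecoveryEnabled  StalePodIPRecoveryType = "Enabled"
	StalePodIPRecoveryDisabled StalePodIPRecoveryType = "Disabled"
)

// IsStalePodIPRecoveryEnabled treats nil as Enabled, so the default-on
// behavior is encoded in one place.
func IsStalePodIPRecoveryEnabled(mode *StalePodIPRecoveryType) bool {
	return mode == nil || *mode == StalePodIPRecoveryEnabled
}

// typhaAutoscaler carries the optional getter, injected via the option
// pattern and consulted at the top of each tick.
type typhaAutoscaler struct {
	stalePodIPRecoveryEnabled func() bool // nil means enabled
}

func (a *typhaAutoscaler) recoveryEnabled() bool {
	if a.stalePodIPRecoveryEnabled == nil {
		// A nil getter counts as enabled, keeping existing tests and the
		// non-cluster-host autoscaler path unchanged.
		return true
	}
	return a.stalePodIPRecoveryEnabled()
}
```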

Tests:
  - 3 new gate tests covering nil getter, true, and false.
  - Defensive Maybe() expectations on SetDegraded in the existing stale
    pod IP detection and maxUnavailable resolution contexts to fix a
    pre-existing race-condition flakiness exposed by this work.
