
Zebra NHT marks RFC5549 IPv6 link-local next-hops "inaccessible" after netlink churn (Cilium restart) and withdraws default route from kernel #21033

@cmdrrobin

Description


After upgrading FRR to 10.5.2 (this worked on 10.1.4), interface-based eBGP unnumbered with RFC 5549 next-hops (IPv4 routes with IPv6 link-local next-hops) becomes unstable during netlink churn. When cilium-agent is restarted on a Kubernetes node, zebra withdraws the kernel default route (proto bgp). In the BGP table, 0.0.0.0/0 remains present, but both IPv6 link-local next-hops become (inaccessible), so the route becomes invalid with no best path. The link-local next-hops are still reachable from the kernel's point of view (ping works and the neighbor entries stay REACHABLE). A link flap (shutdown/no shutdown) on ONE uplink interface immediately restores the default route without restarting FRR.

This looks like a zebra nexthop-tracking / netlink sync regression: an unrelated interface deletion (the Cilium health veth) causes link-local next-hop resolution to get stuck.

  • BGP: eBGP unnumbered over 2 uplinks:
    • enp161s0f0np0
    • enp161s0f1np1
  • Default route learned from ToR via RFC5549:
    • 0.0.0.0/0 with next-hops:
      • fe80::b2cf:eff:fe0c:fb70 dev enp161s0f0np0
      • fe80::b2cf:eff:fe0c:f970 dev enp161s0f1np1
  • FRR runs on the host (not in a pod)
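For context, the setup above corresponds to an FRR unnumbered BGP configuration along these lines (a sketch only; the AS number is a placeholder, not taken from the report):

```
router bgp 65000
 neighbor enp161s0f0np0 interface remote-as external
 neighbor enp161s0f1np1 interface remote-as external
 !
 address-family ipv4 unicast
  neighbor enp161s0f0np0 activate
  neighbor enp161s0f1np1 activate
 exit-address-family
```

With `neighbor IFNAME interface`, FRR negotiates the extended-nexthop capability, which is what allows the IPv4 default route to carry the IPv6 link-local next-hops (RFC 5549).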

Version

- FRR: 10.5.2 (regression vs 10.1.4)
- OS: Ubuntu 24.04.1 - 6.8.0-62-generic
- Platform: Kubernetes node (bare metal/VM) running Cilium
- Cilium: 1.18.2 (cilium-agent restart triggers netlink churn)

How to reproduce

  1. Ensure FRR is running on node with eBGP unnumbered over interfaces and receiving 0.0.0.0/0 via RFC5549 IPv6 link-local next-hops.
  2. Verify default route installed in kernel:
    • ip route show default shows proto bgp with two nexthops via fe80::... on the two uplinks.
  3. On the same node, restart Cilium agent pod:
    • kubectl -n kube-system delete pod -l k8s-app=cilium -o name --field-selector spec.nodeName=<NODE>
  4. Observe: default route is removed from kernel and FRR marks BGP next-hops inaccessible.
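The steps above can be automated for repeated runs; a minimal sketch, run as root on the affected node (`has_bgp_default` and `check_node` are hypothetical helper names; the pod selector matches step 3, and the 10-second settle time is an assumption):

```shell
#!/bin/sh
# Sketch: verify the kernel still holds a BGP-installed default route
# before and after restarting the local cilium-agent pod.

# Returns 0 when the given `ip route show default` output contains a
# route installed by BGP (FRR installs these with "proto bgp").
has_bgp_default() {
  printf '%s\n' "$1" | grep -q 'proto bgp'
}

check_node() {
  before=$(ip route show default)
  if ! has_bgp_default "$before"; then
    echo "no BGP default route before restart" >&2
    return 1
  fi
  # Step 3 from above: restart the cilium-agent pod on this node.
  kubectl -n kube-system delete pod -l k8s-app=cilium \
    --field-selector "spec.nodeName=$(hostname)"
  sleep 10   # assumed settle time before re-checking
  after=$(ip route show default)
  has_bgp_default "$after" || echo "BUG: default route withdrawn" >&2
}
```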

Expected behavior

  • Netlink churn due to unrelated veth/interface changes (e.g. Cilium health interface) should not break zebra nexthop tracking for existing IPv6 LL next-hops on uplink interfaces.
  • 0.0.0.0/0 should remain valid and installed in kernel if LL neighbors remain reachable.

Actual behavior

  • During Cilium restart, zebra withdraws the default route from kernel:
    • Deleted default ... proto bgp ... nexthop via inet6 fe80::... dev enp...
  • In FRR:
    • show bgp ipv4 unicast 0.0.0.0/0 shows next-hops as (inaccessible) and route becomes invalid, no best-path.
  • In kernel at the same time, the neighbors remain reachable:
    • ip -6 neigh show dev enp... shows REACHABLE
    • ping -I enp161s0f0np0 fe80::b2cf:eff:fe0c:fb70 continues to respond
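While the route is withdrawn, zebra's and bgpd's view of the tracked next-hops can be captured with read-only show commands on the affected node (a suggested checklist, not from the original report):

```shell
# Zebra nexthop-tracking state for the registered next-hops:
vtysh -c 'show ip nht'
vtysh -c 'show ipv6 nht'
# bgpd's view of next-hop validity:
vtysh -c 'show bgp nexthop'
# The route itself, including why it is invalid:
vtysh -c 'show bgp ipv4 unicast 0.0.0.0/0'
# Interface state as zebra sees it:
vtysh -c 'show interface enp161s0f0np0'
vtysh -c 'show interface enp161s0f1np1'
```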

Additional context

Kernel netlink monitor (captured during cilium-agent restart)

(typos possible due to OCR)
ip -ts monitor link route neigh
[2026-03-03T14:48:56.295542] 158: lxc_health@if157: <BROADCAST,MULTICAST> mtu 8900 qdisc noqueue state DOWN group default
    link/ether d2:51:0c:b9:77:7c brd ff:ff:ff:ff:ff:ff link-netnsid 5
[2026-03-03T14:48:56.295755] Deleted 10.0.45.57 dev lxc_health lladdr 36:8e:fd:e0:8e:91 STALE
[2026-03-03T14:48:56.295797] Deleted fe80::/64 dev lxc_health proto kernel metric 256 pref medium
[2026-03-03T14:48:56.295824] Deleted local fe80::d051:cff:feb9:777c dev lxc_health table local proto kernel metric 0 pref medium
[2026-03-03T14:48:56.295925] Deleted multicast ff00::/8 dev lxc_health table local proto kernel metric 256 pref medium
[2026-03-03T14:48:56.295946] Deleted ff02::16 dev lxc_health lladdr 33:33:00:00:00:16 NOARP
[2026-03-03T14:48:56.295969] Deleted ff02::1:ffb9:777c dev lxc_health lladdr 33:33:ff:b9:77:7c NOARP
[2026-03-03T14:48:56.295989] Deleted ff02::2 dev lxc_health lladdr 33:33:00:00:00:02 NOARP
[2026-03-03T14:48:56.362898] Deleted 158: lxc_health@NONE: <BROADCAST,MULTICAST> mtu 8900 qdisc noop state DOWN group default
    link/ether d2:51:0c:b9:77:7c brd ff:ff:ff:ff:ff:ff
[2026-03-03T14:48:56.363672] Deleted default nhid 43 proto bgp src 10.138.208.63 metric 20
    nexthop via inet6 fe80::b2cf:eff:fe0c:f970 dev enp161s0f1np1 weight 1
    nexthop via inet6 fe80::b2cf:eff:fe0c:fb70 dev enp161s0f0np0 weight 1

  • Cilium deletes its lxc_health interface and routes. Immediately after, the BGP default route is deleted.
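That ordering can be confirmed mechanically from a capture like the one above; a small sketch (`correlate_events` is a hypothetical helper name; the patterns match the link and route lines in this capture):

```shell
# Reads an `ip -ts monitor link route` capture on stdin and prints, in
# order of appearance, the two events of interest: the lxc_health link
# deletion and the BGP default-route deletion.
correlate_events() {
  awk '
    /Deleted .*lxc_health@/       { print "link-del" }
    /Deleted default .*proto bgp/ { print "route-del" }
  '
}

# Usage: ip -ts monitor link route | correlate_events
# The bug is indicated when "link-del" appears immediately before "route-del".
```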

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.

Labels: triage (Needs further investigation)