bgpd: limit GR-stale NHT-unreach delete to GR helper context#21942
bgpd: limit GR-stale NHT-unreach delete to GR helper context#21942karthikeyav wants to merge 1 commit into
Conversation
PR FRRouting#21742 added deletion of stale paths in evaluate_paths() when the nexthop becomes unreachable, gated on BGP_PATH_STALE. However, BGP_PATH_STALE is also set during Enhanced Route Refresh (RFC 7313) in bgp_clear_route_node(), clearing_clear_one_pi(), and bgp_set_stale_route(), so the existing gate could in principle act on stales set via the enhanced-refresh path. Mirror the GR-specific half of the setter condition (PEER_STATUS_NSF_WAIT && peer->nsf[afi][safi]) on the deletion side so the new code only fires when acting as GR helper for this AFI/SAFI, not on stales set by enhanced route refresh. Signed-off-by: Karthikeya Venkat Muppalla <kmuppalla@nvidia.com>
|
@ton31337 This PR is follow up of your comment on #21742 (comment) Can you please take a look? |
Greptile SummaryThis PR narrows the stale-path deletion gate in
Confidence Score: 5/5Safe to merge. The change is a one-file, three-line guard addition that tightens an existing deletion gate without touching any control paths outside the nexthop-unreachable branch. The two new predicates (PEER_STATUS_NSF_WAIT and peer->nsf[afi][safi]) are the exact GR-specific half of the compound condition used by both setter sites in bgp_route.c. The LLGR and GR timer expiry paths both clear stale routes before unsetting PEER_STATUS_NSF_WAIT, so there is no window where a genuinely GR-stale route with an unreachable nexthop would be silently skipped by the new check. Enhanced Route Refresh stales are correctly excluded because PEER_STATUS_NSF_WAIT is not set during that flow. No files require special attention. bgpd/bgp_nht.c is the only changed file and the modification is self-contained. Important Files Changed
Sequence DiagramsequenceDiagram
participant Peer as BGP Peer
participant FSM as BGP FSM
participant Route as bgp_route.c
participant NHT as bgp_nht.c (evaluate_paths)
participant Zebra as Zebra/NHT
note over Peer,Zebra: GR Helper Path (PR target scenario)
Peer->>FSM: Session down (GR notification)
FSM->>Peer: SET PEER_STATUS_NSF_WAIT
FSM->>Peer: SET nsf[afi][safi]
Route->>Route: bgp_clear_route_node() marks paths BGP_PATH_STALE
Zebra->>NHT: Nexthop becomes unreachable
NHT->>NHT: CHECK BGP_PATH_STALE ✓
NHT->>NHT: CHECK PEER_STATUS_NSF_WAIT ✓
NHT->>NHT: CHECK nsf[afi][safi] ✓
NHT->>Route: bgp_rib_remove() — stale path deleted immediately
note over Peer,Zebra: Enhanced Route Refresh Path (should NOT delete)
Peer->>Route: Route Refresh request
Route->>Route: SET PEER_STATUS_ENHANCED_REFRESH
Route->>Route: marks paths BGP_PATH_STALE
Zebra->>NHT: Nexthop becomes unreachable
NHT->>NHT: CHECK BGP_PATH_STALE ✓
NHT->>NHT: CHECK PEER_STATUS_NSF_WAIT ✗ (not set)
NHT->>NHT: condition fails — path NOT deleted ✓
Reviews (1): Last reviewed commit: "bgpd: limit GR-stale NHT-unreach delete ..." | Re-trigger Greptile |
PR #21742 added deletion of stale paths in evaluate_paths() when the nexthop becomes unreachable, gated on BGP_PATH_STALE. However, BGP_PATH_STALE is also set during Enhanced Route Refresh (RFC 7313) in bgp_clear_route_node(), clearing_clear_one_pi(), and bgp_set_stale_route(), so the existing gate could in principle act on stales set via the enhanced-refresh path.
Mirror the GR-specific half of the setter condition (PEER_STATUS_NSF_WAIT && peer->nsf[afi][safi]) on the deletion side so the new code only fires when acting as GR helper for this AFI/SAFI, not on stales set by enhanced route refresh.