Skip to content

bgpd: skip stalepath-timer clear for LLGR-negotiated AFI/SAFIs#21932

Open
hnattamaisub wants to merge 1 commit into
FRRouting:masterfrom
hnattamaisub:llgr
Open

bgpd: skip stalepath-timer clear for LLGR-negotiated AFI/SAFIs#21932
hnattamaisub wants to merge 1 commit into
FRRouting:masterfrom
hnattamaisub:llgr

Conversation

@hnattamaisub
Copy link
Copy Markdown
Contributor

Rootcause and Impact :

A helper router silently drops every route that LLGR was supposed to retain, breaking RFC 9494 §4.3 ("Procedures for the Receiving Speaker" during the Long-Lived Stale Time window). Downstream routers stop seeing the LLGR_STALE community, fall back to whatever (worse) path they had, and forwarding loses the longer-lived retention guarantee that LLGR is supposed to provide.

When a GR-capable peer's session goes down, bgp_stop() arms two timers in parallel:

t_gr_restart = peer-advertised restart-time (LLGR-aware path)
t_gr_stale = local stalepath-time (was NOT LLGR-aware earlier without our fix)

Whichever fires first wins:

| stalepath ≥ restart (typical defaults) | t_gr_restart fires first → bgp_graceful_restart_timer_off() cancels t_gr_stale → LLGR works fine | | stalepath < restart (this bug) | t_gr_stale fires first → unconditional delete → LLGR has nothing left to retain

Default FRR ships stalepath-time=360s and restart-time=120s, which hides the bug for most users. Operators tuning a longer restart window on the restarter, or shortening stalepath on the helper, expose it.

Fix:
Teach bgp_graceful_stale_timer_expire() to defer to the LLGR path when LLGR is negotiated for that AFI/SAFI and t_gr_restart is still running. bgp_graceful_restart_timer_expire() already implements the RFC-correct LLGR transition (sets PEER_STATUS_LLGR_WAIT, tags paths with LLGR_STALE via bgp_set_llgr_stale(), arms t_llgr_stale); we just have to stop the stalepath-timer from racing it.

For non-LLGR AFI/SAFIs the original delete behavior is preserved.

RFC 9494 §4.3 — Procedures for the Receiving Speaker.

When the "Restart Time" period ends, the LLGR period is said to have
begun and the following procedures MUST be performed:

  • The helper router MUST start a timer for the "Long-lived Stale
    Time"...
  • The helper router MUST attach the LLGR_STALE community for the
    stale routes being retained...

For these MUST clauses to have meaning, the routes have to exist at the moment t_gr_restart expires. The pre-fix stalepath-timer path deletes them earlier, which the RFC does not anticipate.

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 13, 2026

Greptile Summary

This PR fixes an RFC 9494 compliance bug where the stalepath-timer (t_gr_stale) could fire before the GR restart timer (t_gr_restart) and unconditionally delete stale routes that the LLGR period is supposed to retain. The fix teaches bgp_graceful_stale_timer_expire() to skip clearing stale routes for LLGR-negotiated AFI/SAFIs when the restart timer is still running, deferring to the existing RFC-correct LLGR transition in bgp_graceful_restart_timer_expire().

  • The guard condition peer->llgr[afi][safi].stale_time && connection->t_gr_restart is consistent with the pattern already used in bgp_graceful_restart_timer_expire(), and semantically correct: a stale_time of 0 means RFC 9494 requires deletion at GR period end, so deferral is only appropriate when a non-zero LLGR window was negotiated.
  • Non-LLGR AFI/SAFIs retain their original delete-now behaviour; the bug only manifests when stalepath-time is configured shorter than restart-time (non-default, but valid operator tuning).

Confidence Score: 4/5

Safe to merge; the change is narrowly scoped to the stalepath-timer callback and only skips stale-route deletion when both LLGR is negotiated (non-zero stale_time) and the GR restart timer is still pending.

The guard condition is logically correct and consistent with the symmetrical check already present in bgp_graceful_restart_timer_expire(). Existing code in bgp_clear_stale_route() also respects PEER_STATUS_LLGR_WAIT, so the normal (stalepath >= restart) path that falls through to bgp_clear_stale_route() after the restart timer fires is safe. The only scenario that would benefit from additional hardening is a topology test covering stalepath-time < restart-time with LLGR negotiated, which is a test gap rather than a code defect.

bgpd/bgp_fsm.c — the single changed file; worth a second read around the interplay between bgp_graceful_restart_timer_off() and the still-running t_gr_stale in the LLGR case

Important Files Changed

Filename Overview
bgpd/bgp_fsm.c Adds an LLGR-aware guard in bgp_graceful_stale_timer_expire() to defer stale-route deletion when LLGR is negotiated and the GR restart timer is still running, preventing premature deletion of routes that the RFC 9494 LLGR period is supposed to retain

Sequence Diagram

sequenceDiagram
    participant FSM as BGP FSM
    participant ST as t_gr_stale (stalepath-timer)
    participant RT as t_gr_restart (restart-timer)
    participant LLGR as t_llgr_stale

    FSM->>ST: arm(stalepath-time)
    FSM->>RT: arm(restart-time)

    Note over ST,RT: Bug scenario: stalepath-time < restart-time

    ST->>ST: fires first
    alt LLGR negotiated AND t_gr_restart running
        ST-->>FSM: skip bgp_clear_stale_route() [NEW]
    else No LLGR or t_gr_restart already done
        ST->>FSM: bgp_clear_stale_route(afi, safi)
    end

    RT->>RT: fires later
    RT->>FSM: bgp_graceful_restart_timer_expire()
    FSM->>FSM: SET PEER_STATUS_LLGR_WAIT
    FSM->>FSM: bgp_set_llgr_stale() — tag routes LLGR_STALE
    FSM->>FSM: bgp_clear_stale_route() — remove NO_LLGR routes
    FSM->>LLGR: arm(long-lived-stale-time)
    FSM->>FSM: bgp_announce_route() — re-advertise with LLGR_STALE

    LLGR->>LLGR: fires on expiry
    LLGR->>FSM: bgp_clear_stale_route() — delete remaining LLGR stale routes
Loading

Reviews (1): Last reviewed commit: "bgpd: skip stalepath-timer clear for LLG..." | Re-trigger Greptile

Rootcause and Impact :

A helper router silently drops every route that LLGR was supposed to
retain, breaking RFC 9494 §4.3 ("Procedures for the Receiving Speaker"
during the Long-Lived Stale Time window). Downstream routers stop
seeing the `LLGR_STALE` community, fall back to whatever (worse) path
they had, and forwarding loses the longer-lived retention guarantee
that LLGR is supposed to provide.

When a GR-capable peer's session goes down, `bgp_stop()` arms **two**
timers in parallel:

`t_gr_restart` = peer-advertised restart-time (LLGR-aware path)
 `t_gr_stale` = local `stalepath-time` (was NOT LLGR-aware earlier without our fix)

Whichever fires first wins:

| `stalepath ≥ restart` (typical defaults) | `t_gr_restart` fires first → `bgp_graceful_restart_timer_off()` cancels `t_gr_stale` → LLGR works fine |
| `stalepath < restart` (this bug) | `t_gr_stale` fires first → unconditional delete → LLGR has nothing left to retain

Default FRR ships `stalepath-time=360s` and `restart-time=120s`, which
hides the bug for most users. Operators tuning a longer restart window
on the restarter, or shortening stalepath on the helper, expose it.

Fix:
Teach `bgp_graceful_stale_timer_expire()` to defer to the LLGR path
when LLGR is negotiated for that AFI/SAFI and `t_gr_restart` is still
running. `bgp_graceful_restart_timer_expire()` already implements the
RFC-correct LLGR transition (sets `PEER_STATUS_LLGR_WAIT`, tags paths
with `LLGR_STALE` via `bgp_set_llgr_stale()`, arms `t_llgr_stale`); we
just have to stop the stalepath-timer from racing it.

For non-LLGR AFI/SAFIs the original delete behavior is preserved.

RFC 9494 §4.3 — Procedures for the Receiving Speaker.

> When the "Restart Time" period ends, the LLGR period is said to have
> begun and the following procedures MUST be performed:
>
> *  The helper router MUST start a timer for the "Long-lived Stale
>    Time"...
> *  The helper router MUST attach the LLGR_STALE community for the
>    stale routes being retained...

For these MUST clauses to have meaning, the routes have to *exist* at
the moment `t_gr_restart` expires. The pre-fix stalepath-timer path
deletes them earlier, which the RFC does not anticipate.

Signed-off-by: harini <hnattamaisub@nvidia.com>
Comment thread bgpd/bgp_fsm.c
if (!peer->nsf[afi][safi])
continue;

if (peer->llgr[afi][safi].stale_time && connection->t_gr_restart) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use event_scheduled()?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants