No kernel nhg original #21893

Open
donaldsharp wants to merge 3 commits into FRRouting:master from donaldsharp:no_kernel_nhg_original
Conversation

@donaldsharp
Member

PR #21125 attempts to slightly modify how NHG data is sent to the kernel for connected/local/kernel NHGs. The approach taken there was too complicated: to figure out what was going on I had to reverse engineer it, and having done that, I think my approach is much simpler and frankly easier to understand. Here's my alternate approach to the problem.

@frrbot frrbot Bot added tests Topotests, make check, etc zebra labels May 8, 2026
@greptile-apps

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR simplifies how NHGs for connected/local/kernel (system) routes are handled in the dataplane by letting them flow through the full dplane queue with a skip_kernel flag instead of short-circuiting early and discarding the context. The previous approach discarded the dplane context before FPM and other providers could see it; the new approach allows providers (particularly FPM) to observe and act on these NHG installs while still preventing kernel programming.

  • zebra_dplane.c: Replaces the early-return that marked NHGs as installed and freed the context with a call to dplane_ctx_set_skip_kernel(), then lets the context flow normally through the provider chain.
  • zebra_nhg.c: Removes the now-unnecessary special-case handling of ZEBRA_DPLANE_REQUEST_SUCCESS in zebra_nhg_install_kernel and adds UNSET_FLAG(nhe->flags, NEXTHOP_GROUP_QUEUED) when promoting an NHG from the delayed-install path to a real kernel install, which is critical for correctness.
  • test_fpm_topo1.py: Adds a new test verifying that system-route NHGs are forwarded to FPM but not installed in the Linux kernel.
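The flow summarized above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyCtx`, `fpm_provider`, `kernel_provider`, and `run_chain` are hypothetical names invented for illustration and are not FRR's dplane API. It shows a context marked skip-kernel still being seen by an FPM-like provider while the kernel-like provider reports success without programming anything.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCtx:
    """Hypothetical stand-in for a dplane context carrying one NHG install."""
    nhg_id: int
    skip_kernel: bool = False
    status: str = "QUEUED"
    seen_by: list = field(default_factory=list)


def fpm_provider(ctx, fpm_log):
    # The FPM-like provider sees every context, including skip_kernel ones.
    ctx.seen_by.append("fpm")
    fpm_log.append(ctx.nhg_id)


def kernel_provider(ctx, kernel_table):
    # The kernel-like provider reports success without programming anything
    # when the context is marked skip_kernel.
    ctx.seen_by.append("kernel")
    if not ctx.skip_kernel:
        kernel_table.add(ctx.nhg_id)
    ctx.status = "SUCCESS"


def run_chain(ctx, fpm_log, kernel_table):
    # Providers run in order; nothing short-circuits before FPM.
    fpm_provider(ctx, fpm_log)
    kernel_provider(ctx, kernel_table)
    return ctx


fpm_log, kernel_table = [], set()
system_nhg = run_chain(ToyCtx(nhg_id=10, skip_kernel=True), fpm_log, kernel_table)
normal_nhg = run_chain(ToyCtx(nhg_id=11), fpm_log, kernel_table)
```

In this model both NHGs reach the FPM log, but only the one without the skip flag lands in the toy kernel table; both end with a success status, which is what lets the caller mark them installed.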

Confidence Score: 3/5

The C changes are correct but the new test may always time out, leaving the PR's primary behavioral claim unverified.

The core C implementation is clean and the flag-management changes in zebra_nhg.c are correct. The concern is with the test: NEXTHOP_GROUP_FPM is only written during the FPM batch resync walk (fpm_nhg_send_cb), not when individual NHG contexts flow through fpm_nl_enqueue. A connected/local/kernel NHG added after the last FPM sync will never have fpm=True in its show output, so check_nhg_sent_to_fpm will time out on every CI run.

tests/topotests/fpm_testing_topo1/test_fpm_topo1.py — the FPM presence assertion needs a different verification strategy.

Important Files Changed

| Filename | Overview |
| --- | --- |
| zebra/zebra_dplane.c | Replaces the early-exit/ctx-discard for NEXTHOP_GROUP_INITIAL_DELAY_INSTALL with dplane_ctx_set_skip_kernel(); ctx now flows through the full provider chain. |
| zebra/zebra_nhg.c | Removes special SUCCESS handling from the dplane result switch (now always an error) and adds critical NEXTHOP_GROUP_QUEUED clearing when promoting a system-route NHG to a real kernel install. |
| tests/topotests/fpm_testing_topo1/test_fpm_topo1.py | New test for system-route NHG FPM delivery and kernel-skip; the FPM presence check relies on NEXTHOP_GROUP_FPM, which is only set during batch FPM resync, not via the individual dplane ctx path the PR introduces. |

Sequence Diagram

sequenceDiagram
    participant RIB as zebra_rib (main thread)
    participant NHG as zebra_nhg
    participant DPLANE as dplane thread
    participant FPM as FPM provider
    participant KERNEL as kernel provider

    RIB->>NHG: zebra_nhg_install_kernel(nhe, CONNECT/LOCAL/KERNEL)
    Note over NHG: INITIAL_DELAY_INSTALL set skip flag-clearing block
    NHG->>DPLANE: dplane_nexthop_add(nhe)
    DPLANE->>DPLANE: dplane_ctx_set_skip_kernel(ctx)
    DPLANE->>DPLANE: dplane_update_enqueue(ctx)
    DPLANE-->>NHG: ZEBRA_DPLANE_REQUEST_QUEUED
    NHG->>NHG: SET_FLAG NEXTHOP_GROUP_QUEUED

    DPLANE->>FPM: fpm_nl_process ctx with NH_INSTALL op
    FPM->>FPM: fpm_nl_enqueue encode and send NHG to FPM socket
    FPM-->>DPLANE: enqueue to output

    DPLANE->>KERNEL: kernel provider dequeues ctx
    KERNEL->>KERNEL: dplane_ctx_is_skip_kernel true
    KERNEL->>KERNEL: set status SUCCESS skip netlink
    KERNEL-->>DPLANE: enqueue to output

    DPLANE-->>NHG: zebra_nhg_dplane_result SUCCESS
    NHG->>NHG: UNSET QUEUED REINSTALL SET INSTALLED
    NHG->>NHG: zebra_nhg_handle_install(nhe)

Reviews (1): Last reviewed commit: "tests: Add topotest cases for sending RI..." | Re-trigger Greptile

Comment on lines +256 to +264
    """

    def check_nhg_sent_to_fpm():
        nhg_id, nhg_data = _get_nhg_for_prefix(router, prefix)
        if nhg_id is None or nhg_data is None:
            return False
        return nhg_data.get("fpm", False) is True

    success, result = topotest.run_and_expect(

P1 FPM flag not set via dplane ctx path

NEXTHOP_GROUP_FPM is only set inside fpm_nhg_send_cb — the batch hash-walk that runs during an FPM reconnect/resync. When an NHG install context flows through the FPM dplane provider via fpm_nl_enqueue (the new path this PR introduces), the flag is never written to the nhg_hash_entry. For an interface address added after the last FPM sync, nhg_data.get("fpm", False) will remain False indefinitely, causing the assertion to time out on every run. A more reliable check would either force an FPM reconnect before the assertion (triggering a full resync) or inspect the mock FPM server's received message log directly.
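A toy model may make the concern in this comment easier to see. It is a sketch only: `ToyNhg` and `ToyFpm` are hypothetical classes, with `fpm_flag` standing in for `NEXTHOP_GROUP_FPM`, `resync` for the batch walk, and `enqueue` for the per-context path. It assumes, as the comment states, that only the batch walk sets the flag.

```python
class ToyNhg:
    def __init__(self, nhg_id):
        self.nhg_id = nhg_id
        self.fpm_flag = False  # models NEXTHOP_GROUP_FPM


class ToyFpm:
    def __init__(self):
        self.sent = []  # what the FPM peer actually received

    def enqueue(self, nhg):
        # Per-context path: the message goes out, the NHG flag is untouched.
        self.sent.append(nhg.nhg_id)

    def resync(self, nhgs):
        # Batch walk: the message goes out and the flag is set.
        for nhg in nhgs:
            self.sent.append(nhg.nhg_id)
            nhg.fpm_flag = True


fpm = ToyFpm()
early, late = ToyNhg(1), ToyNhg(2)
fpm.resync([early])  # NHG that existed at the last sync
fpm.enqueue(late)    # NHG added afterwards
```

Under these assumptions the later NHG was delivered to the peer yet still reports its flag as unset, so a test that polls the flag would wait forever, while a test that inspects what the peer received would pass.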


Comment on lines +258 to +265
    def check_nhg_sent_to_fpm():
        nhg_id, nhg_data = _get_nhg_for_prefix(router, prefix)
        if nhg_id is None or nhg_data is None:
            return False
        return nhg_data.get("fpm", False) is True

    success, result = topotest.run_and_expect(
        check_nhg_sent_to_fpm, True, count=60, wait=1

P2 Misleading error message: result is a boolean, not NHG data

topotest.run_and_expect returns (success, last_function_return_value). Here check_nhg_sent_to_fpm returns True or False, so when the assertion fails the message reads NHG data: False instead of meaningful diagnostic information. Consider capturing the raw NHG JSON inside the check closure and including it in the assertion message.
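One possible shape for that suggestion is sketched below. To keep the example self-contained, `run_and_expect` here is a local stand-in that mimics the `(success, last_return_value)` behavior described above, and `fake_get_nhg_for_prefix` is a hypothetical replacement for `_get_nhg_for_prefix(router, prefix)` returning made-up data. The point is only the pattern: the closure stashes the raw NHG data so the failure message can show it.

```python
import json
import time


def run_and_expect(func, what, count=20, wait=3):
    """Local stand-in mimicking the polling helper's return shape."""
    result = None
    for _ in range(count):
        result = func()
        if result == what:
            return True, result
        time.sleep(wait)
    return False, result


def fake_get_nhg_for_prefix(prefix):
    # Hypothetical stand-in for _get_nhg_for_prefix(router, prefix).
    return 42, {"fpm": False, "installed": True}


last_seen = {}  # filled in by the closure on every poll


def check_nhg_sent_to_fpm():
    nhg_id, nhg_data = fake_get_nhg_for_prefix("192.0.2.0/24")
    last_seen["nhg_id"] = nhg_id
    last_seen["nhg_data"] = nhg_data
    if nhg_id is None or nhg_data is None:
        return False
    return nhg_data.get("fpm", False) is True


success, result = run_and_expect(check_nhg_sent_to_fpm, True, count=2, wait=0)
message = "NHG not marked as sent to FPM; last NHG seen: {}".format(
    json.dumps(last_seen, sort_keys=True)
)
```

In a real test the message would be passed to the assertion (for example `assert success, message`), so a failure reports the NHG JSON rather than a bare boolean.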


donaldsharp and others added 3 commits May 8, 2026 13:01
Currently zebra sends the NHGs for connected/local/kernel routes to the
kernel only upon initial usage by a route outside of these 3 types. Modify
the code to send the result down but to skip the installation
into the kernel. There are dplanes out there that are using
the NHGs created for the connected routes being installed,
which are also ignored by the Linux kernel.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
…t skipping kernel

Signed-off-by: Yuqing Zhao <galadriel.zyq@alibaba-inc.com>
Tighten up test code for these two things:

a) check_nhg_sent_to_fpm needed to be able to surface error details
from a run_and_expect block so that what has gone wrong
can be communicated better.

b) Knowledge of whether data was sent to the FPM can be gated by
what zebra thinks it knows. The topotest was conflating the two;
modify the topotest to get the data it is looking for by
asking the fpm_listener what it knows.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
@donaldsharp donaldsharp force-pushed the no_kernel_nhg_original branch from afff5d3 to 5a6ed59 Compare May 8, 2026 17:02
@GaladrielZhao
Contributor

Current SONiC fpmsyncd handles NHG via RTM_NEWNEXTHOP / RTM_DELNEXTHOP in onNextHopMsg.

Below is how each route type's NHG is processed:

  • Connected routes: treated as normal unicast routes. NHG referenced by nh_id is resolved from the in-memory map and expanded into ROUTE_TABLE fields (or NEXTHOP_GROUP_TABLE for multipath). Works correctly.
  • Kernel routes (proto=kernel, type=RTN_UNICAST): same path as connected routes. No special handling.
  • Local routes (type=RTN_LOCAL): the route itself is intentionally skipped in onRouteMsg ("BUM routes aren't supported yet"). However, its NHG is still received via RTM_NEWNEXTHOP:
    • single NH: stored only in memory (m_nh_groups), never written to DB.
    • group NH: written to NEXTHOP_GROUP_TABLE on creation. While this creates a brief orphan entry, multipath local routes are uncommon, so the impact is nearly negligible. Even if it happens, the entry is cleaned up on RTM_DELNEXTHOP.

So from the SONiC side, having FRR send all three types of NHGs causes no functional issue and actually fixes the missing connected-route NHG bug we observed.
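The NHG handling described in the bullets above can be sketched as a toy model. This is not fpmsyncd code: `ToyFpmsyncd` and its method names are hypothetical, `nh_groups` stands in for `m_nh_groups`, and `nexthop_group_table` stands in for the DB's `NEXTHOP_GROUP_TABLE`. It assumes only what the comment states: single nexthops stay in memory, group nexthops are written to the table on creation, and deletion cleans both up.

```python
class ToyFpmsyncd:
    def __init__(self):
        self.nh_groups = {}            # in-memory map (m_nh_groups in the comment)
        self.nexthop_group_table = {}  # models the DB's NEXTHOP_GROUP_TABLE

    def on_new_nexthop(self, nh_id, members):
        # Every nexthop object is remembered in memory.
        self.nh_groups[nh_id] = list(members)
        if len(members) > 1:
            # Group NH: written to the table on creation.
            self.nexthop_group_table[nh_id] = list(members)

    def on_del_nexthop(self, nh_id):
        # Deletion removes the entry from both places, if present.
        self.nh_groups.pop(nh_id, None)
        self.nexthop_group_table.pop(nh_id, None)


syncd = ToyFpmsyncd()
syncd.on_new_nexthop(1, ["eth0"])          # single NH: memory only
syncd.on_new_nexthop(2, ["eth0", "eth1"])  # group NH: also in the table
table_before_delete = dict(syncd.nexthop_group_table)
syncd.on_del_nexthop(2)                    # orphan entry is cleaned up
```

In this model an unused group entry exists in the table only between the create and delete events, which matches the "brief orphan entry" described for multipath local routes.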

Contributor

@GaladrielZhao GaladrielZhao left a comment


Thanks, looks good to me.

Labels

master size/L tests Topotests, make check, etc zebra

2 participants