Skip to content

zebra: route EVPN FDB/neighbor reads through dplane#21206

Open
rjarry wants to merge 2 commits intoFRRouting:masterfrom
rjarry:evpn-dplane-reads
Open

zebra: route EVPN FDB/neighbor reads through dplane#21206
rjarry wants to merge 2 commits intoFRRouting:masterfrom
rjarry:evpn-dplane-reads

Conversation

@rjarry
Copy link
Copy Markdown

@rjarry rjarry commented Mar 17, 2026

When advertise-all-vni is enabled, zebra reads FDB and neighbor tables directly from the kernel via netlink, bypassing all dplane providers. A dplane provider has no way to supply its own FDB/neighbor state, making EVPN unusable with non-kernel dplane plugins (e.g. grout/DPDK).

This series adds two new dplane operations (DPLANE_OP_FDB_READ and DPLANE_OP_NEIGH_READ) and converts all FDB/neighbor read call sites to use the dplane enqueue path instead of calling netlink directly. The read results already flow through the dplane notification path, so only the read requests need to be rerouted.

The netlink-specific handling lives in rt_netlink.c (macfdb_read_ctx() / neigh_read_ctx()), with no-op stubs in rtread_sysctl.c for BSD, following the existing pattern for platform-specific read functions.

Fixes #21190

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR routes EVPN FDB (MAC forwarding database) and neighbor table reads through the zebra dataplane (dplane) infrastructure instead of invoking netlink directly from the main thread. Two new dplane operations — DPLANE_OP_FDB_READ and DPLANE_OP_NEIGH_READ — are introduced, along with corresponding context structures, accessor functions, enqueue helpers, and kernel-provider dispatch handlers. All seven call sites across zebra_vxlan.c, zebra_evpn.c, zebra_evpn_neigh.c, and zebra_ns.c are updated to use the new dplane_fdb_* / dplane_neigh_* APIs. The netlink-specific implementation is refactored into two new static-scoped dispatchers (kernel_read_macfdb / kernel_read_neigh in rt_netlink.c), and BSD stubs in rtread_sysctl.c are consolidated from seven no-op functions down to two.

  • New dplane_macfdb_read_info and dplane_neigh_read_info structs added to the dplane context union, with full accessor coverage.
  • Dispatch logic in kernel_read_macfdb uses a sentinel-based if-else chain (is_zero_mac, vni, ifindex) to multiplex four read variants; kernel_read_neigh uses a parallel ipa_type/ifindex pattern.
  • All switched enums (kernel_socket.c, dplane_fpm_nl.c, zebra_script.c, zebra_rib.c) handle the two new ops with no-op / break clauses, consistent with the existing pattern for DPLANE_OP_VLAN_INSTALL.
  • The filter_vlan global used in netlink MAC-table parsing is now set and cleared entirely within the dplane pthread; since the kernel dplane provider is single-threaded, this is safe.

Confidence Score: 5/5

Safe to merge; all functional and structural changes are correct and consistent with existing dplane patterns.

All 16 changed files are handled correctly: the dispatch logic in kernel_read_macfdb and kernel_read_neigh faithfully reproduces the four original read variants using sentinel-based multiplexing; the filter_vlan global is accessed safely within the single-threaded dplane provider; all switch statements covering dplane_op_e are updated; BSD stubs are properly consolidated; and all call sites are migrated. The two remaining P2 comments are observability/clarity issues with no effect on runtime correctness.

No files require special attention.

Vulnerabilities

No security concerns identified. The change is purely architectural — it reroutes read-only netlink queries through the dplane queue rather than calling them directly. No new data is exposed externally, no user input flows into the new path, and the netlink socket handling follows the same patterns as existing dplane operations.

Important Files Changed

Filename Overview
zebra/zebra_dplane.c Core of the change: adds context structs, accessors, enqueue helpers, and kernel-provider dispatch for DPLANE_OP_FDB_READ / DPLANE_OP_NEIGH_READ; logic is correct and consistent with existing patterns.
zebra/zebra_dplane.h Adds two new enum values, accessor declarations, and public enqueue API; clean API surface mirroring existing EVPN dplane patterns.
zebra/rt_netlink.c Refactors macfdb/neigh read functions to static, accepting nlsock+dp_info instead of zebra_ns; adds kernel_read_macfdb and kernel_read_neigh entry points called from the dplane pthread. Dispatch logic is correct.
zebra/rt_netlink.h Removes now-static netlink_macfdb/neigh declarations; exposes kernel_read_macfdb and kernel_read_neigh as the two new entry points. Clean.
zebra/rt.h Replaces nine old macfdb_/neigh_ extern declarations with two new ones (kernel_read_macfdb, kernel_read_neigh); clean reduction in public API surface.
zebra/rtread_netlink.c Removes seven now-obsolete macfdb_/neigh_ wrapper stubs; only route_read and kernel_read_pbr_rules remain. Correct.
zebra/rtread_sysctl.c BSD stubs consolidated from seven functions to two no-ops (kernel_read_macfdb / kernel_read_neigh); follows existing no-op pattern for platform-unsupported operations.

Sequence Diagram

sequenceDiagram
    participant ZMain as Zebra Main Thread
    participant DQ as Dplane Queue
    participant DProv as Kernel Dplane Provider
    participant NL as rt_netlink.c
    participant Kernel as Linux Kernel (netlink)
    participant CB as FDB/Neigh Callbacks

    note over ZMain: advertise-all-vni enabled / EVPN VNI event
    ZMain->>DQ: dplane_fdb_read / dplane_neigh_read / dplane_fdb_read_for_bridge etc.
    DQ-->>ZMain: ZEBRA_DPLANE_REQUEST_QUEUED

    DProv->>DQ: dequeue ctx
    alt DPLANE_OP_FDB_READ
        DProv->>NL: kernel_read_macfdb(ctx)
        NL->>NL: dispatch: specific_mac / mcast_vni / for_bridge / full
        NL->>Kernel: netlink_request_macs() RTM_GETNEIGH
        Kernel-->>NL: netlink response
        NL->>CB: netlink_parse_info netlink_macfdb_table
        NL-->>DProv: return
    else DPLANE_OP_NEIGH_READ
        DProv->>NL: kernel_read_neigh(ctx)
        NL->>NL: dispatch: specific_ip / for_vlan / full
        NL->>Kernel: netlink_request_neigh() RTM_GETNEIGH
        Kernel-->>NL: netlink response
        NL->>CB: netlink_parse_info netlink_neigh_table
        NL-->>DProv: return
    end
    DProv->>DQ: dplane_provider_enqueue_out_ctx(ctx)
    DQ->>ZMain: rib_process_dplane_results break no-op for read ops
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: zebra/rt_netlink.c
Line: 4375-4395

Comment:
**Return values from netlink read functions silently discarded**

In `kernel_read_macfdb`, the return values of all four netlink helper calls (`netlink_macfdb_read_specific_mac`, `netlink_macfdb_read_mcast_for_vni`, `netlink_macfdb_read_for_bridge`, `netlink_macfdb_read`) are discarded, and `ZEBRA_DPLANE_REQUEST_SUCCESS` is set unconditionally at the end regardless of whether any of them actually succeeded. The same pattern appears in `kernel_read_neigh` (which does check `ret >= 0` before parsing, but still always ends with SUCCESS).

Since none of the consumers — `kernel_dplane_handle_result` and `rib_process_dplane_results` — check the result status for `DPLANE_OP_FDB_READ` / `DPLANE_OP_NEIGH_READ` (they both just `break`), this is not a runtime bug today. However, silently swallowing errors means any netlink failure during a read (e.g., socket error, message truncation) will go completely unnoticed. Consider at least logging a warning when a negative return is received:

```c
if (!is_zero_mac(mac))
    ret = netlink_macfdb_read_specific_mac(nl, dp_info, br_ifindex, mac, vid, vlan_aware);
else if (vni)
    ret = netlink_macfdb_read_mcast_for_vni(nl, dp_info, ifindex, vni, is_vxlan);
else if (ifindex)
    ret = netlink_macfdb_read_for_bridge(nl, dp_info, br_ifindex, vlan_aware, vid);
else
    ret = netlink_macfdb_read(nl, dp_info);

if (ret < 0)
    zlog_warn("%s: macfdb read failed: %d", __func__, ret);
dplane_ctx_set_status(ctx, ZEBRA_DPLANE_REQUEST_SUCCESS);
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: zebra/zebra_dplane.c
Line: 6479-6497

Comment:
**`dplane_fdb_read_mcast_for_vni` returns `SUCCESS` instead of `QUEUED` on early-exit path**

When `IS_ZEBRA_VXLAN_IF_VNI(zif)` is true the function returns `ZEBRA_DPLANE_REQUEST_SUCCESS` rather than enqueuing anything. All callers currently discard the return value so there is no observable bug, but semantically this path has done no work. If callers are ever added that inspect this value, the distinction between "queued" and "skipped" will matter. A brief comment noting the intentional early-exit would clarify intent:

```c
if (IS_ZEBRA_VXLAN_IF_VNI(zif))
    /* Per-VNI VxLAN interfaces don't need mcast FDB reads */
    return ZEBRA_DPLANE_REQUEST_SUCCESS;
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (3): Last reviewed commit: "zebra: route neighbor reads through the ..." | Re-trigger Greptile

@mjstapp mjstapp self-requested a review March 17, 2026 15:32
Copy link
Copy Markdown
Contributor

@mjstapp mjstapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll need to adjust this in order to avoid having the dplane pthread access unsafe data structs

Comment thread zebra/rt_netlink.c Outdated
void macfdb_read_ctx(struct zebra_dplane_ctx *ctx)
{
ifindex_t ifindex, br_ifindex;
struct interface *ifp, *br_if;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no - if this is run in the dplane pthread, no zebra objects can be accessed at all.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the code accordingly. Let me know if that is better.

@rjarry rjarry force-pushed the evpn-dplane-reads branch from 8c0c0f1 to 91dde11 Compare March 18, 2026 21:46
@github-actions github-actions bot added size/XL rebase PR needs rebase and removed size/L labels Mar 18, 2026
@rjarry rjarry force-pushed the evpn-dplane-reads branch from 91dde11 to b1ae5d1 Compare March 18, 2026 22:30
@rjarry rjarry requested a review from mjstapp March 20, 2026 08:55
@rjarry
Copy link
Copy Markdown
Author

rjarry commented Mar 24, 2026

The CI is failing but I fail to understand what the errors actually are. They don't see related to my changes.

rjarry added a commit to rjarry/grout that referenced this pull request Mar 25, 2026
Zebra silently discards VNI and FDB notifications received before
advertise-all-vni has taken effect. On slow CI workers, the VXLAN
interface creation and subsequent MAC learning can happen before zebra
enables EVPN, causing type-2 routes to never be advertised.

Wait for "show evpn" to produce output (which only happens once EVPN is
enabled) before creating the bridge and VXLAN interfaces.

Link: FRRouting/frr#21190
Link: FRRouting/frr#21206
Signed-off-by: Robin Jarry <rjarry@redhat.com>
rjarry added a commit to rjarry/grout that referenced this pull request Mar 25, 2026
Zebra silently discards VNI and FDB notifications received before
advertise-all-vni has taken effect. On slow CI workers, the VXLAN
interface creation and subsequent MAC learning can happen before zebra
enables EVPN, causing type-2 routes to never be advertised.

Wait for "show evpn" to produce output (which only happens once EVPN is
enabled) before creating the bridge and VXLAN interfaces.

Link: FRRouting/frr#21190
Link: FRRouting/frr#21206
Signed-off-by: Robin Jarry <rjarry@redhat.com>
grout-bot pushed a commit to rjarry/grout that referenced this pull request Mar 26, 2026
Zebra silently discards VNI and FDB notifications received before
advertise-all-vni has taken effect. On slow CI workers, the VXLAN
interface creation and subsequent MAC learning can happen before zebra
enables EVPN, causing type-2 routes to never be advertised.

Wait for "show evpn" to produce output (which only happens once EVPN is
enabled) before creating the bridge and VXLAN interfaces.

Link: FRRouting/frr#21190
Link: FRRouting/frr#21206
Signed-off-by: Robin Jarry <rjarry@redhat.com>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
rjarry added a commit to DPDK/grout that referenced this pull request Mar 26, 2026
Zebra silently discards VNI and FDB notifications received before
advertise-all-vni has taken effect. On slow CI workers, the VXLAN
interface creation and subsequent MAC learning can happen before zebra
enables EVPN, causing type-2 routes to never be advertised.

Wait for "show evpn" to produce output (which only happens once EVPN is
enabled) before creating the bridge and VXLAN interfaces.

Link: FRRouting/frr#21190
Link: FRRouting/frr#21206
Signed-off-by: Robin Jarry <rjarry@redhat.com>
Reviewed-by: Christophe Fontaine <cfontain@redhat.com>
rjarry added a commit to rjarry/grout that referenced this pull request Mar 26, 2026
Waiting for advertise-all-vni to take effect is not sufficient. Zebra
also needs to know about VNI 100 before it can process FDB notifications
for that bridge. On slow CI workers, create_interface p1 causes MAC
learning before zebra has seen the VXLAN interface, so the type-2 route
is still lost.

Create the VXLAN interface before the bridge port, and wait for
"show evpn vni 100" to confirm zebra knows about VNI 100 before
proceeding with port creation.

Fixes: 425bf7b ("smoke: work around FRR EVPN race in evpn_vxlan_frr test")
Link: FRRouting/frr#21190
Link: FRRouting/frr#21206
Signed-off-by: Robin Jarry <rjarry@redhat.com>
rjarry added a commit to DPDK/grout that referenced this pull request Mar 26, 2026
Waiting for advertise-all-vni to take effect is not sufficient. Zebra
also needs to know about VNI 100 before it can process FDB notifications
for that bridge. On slow CI workers, create_interface p1 causes MAC
learning before zebra has seen the VXLAN interface, so the type-2 route
is still lost.

Create the VXLAN interface before the bridge port, and wait for
"show evpn vni 100" to confirm zebra knows about VNI 100 before
proceeding with port creation.

Fixes: 425bf7b ("smoke: work around FRR EVPN race in evpn_vxlan_frr test")
Link: FRRouting/frr#21190
Link: FRRouting/frr#21206
Signed-off-by: Robin Jarry <rjarry@redhat.com>
@rjarry rjarry force-pushed the evpn-dplane-reads branch from b1ae5d1 to 140a08b Compare April 4, 2026 18:10
@rjarry
Copy link
Copy Markdown
Author

rjarry commented Apr 7, 2026

Hi there, @mjstapp is there anything blocking this pull request? Let me know if I need to change anything else. Thanks!

@mjstapp
Copy link
Copy Markdown
Contributor

mjstapp commented Apr 8, 2026

@greptileai review

Copy link
Copy Markdown
Contributor

@mjstapp mjstapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks good to me now - I'm getting one more pass with the greptile, just to see if there's anything it spots.

@mjstapp mjstapp self-requested a review April 8, 2026 12:16
Copy link
Copy Markdown
Contributor

@mjstapp mjstapp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trying to clear the red flag

@mjstapp
Copy link
Copy Markdown
Contributor

mjstapp commented Apr 8, 2026

if the greptile comment about dead functions is correct, would you please clean those up?

@mjstapp
Copy link
Copy Markdown
Contributor

mjstapp commented Apr 8, 2026

the comment is in the third paragraph after "score", before the "vulnerability" section

if the greptile comment about dead functions is correct, would you please clean those up?

Add DPLANE_OP_FDB_READ and convert all FDB read call sites to use the
new dplane enqueue functions instead of calling netlink helpers from
the main thread.

All needed interface properties (bridge VLAN-awareness, VxLAN type)
are captured at enqueue time so that the dplane handler never accesses
zebra main-thread objects like zebra_ns or struct interface. Rework the
netlink_macfdb_read* and netlink_request_specific_mac functions to take
a struct nlsock pointer and plain values instead of struct zebra_ns and
struct interface pointers. The dplane handler kernel_read_macfdb()
calls them directly. Read results flow back through the existing
notification path via netlink_macfdb_change().

Remove the old macfdb_read wrappers from rt.h, rtread_netlink.c and
rtread_sysctl.c since they are no longer used.

Signed-off-by: Robin Jarry <rjarry@redhat.com>
@rjarry
Copy link
Copy Markdown
Author

rjarry commented Apr 8, 2026

Will do, thanks for the review!

Add DPLANE_OP_NEIGH_READ and convert all IP neighbor read call sites
to use the new dplane enqueue functions instead of calling netlink
helpers from the main thread.

The handler uses the dplane netlink socket and only accesses ctx data,
avoiding any zebra main-thread objects. Change
netlink_request_specific_neigh_in_vlan() to take a struct nlsock
pointer instead of struct zebra_ns so it can be called from the dplane
thread.

Remove the old neigh_read wrappers from rt.h, rtread_netlink.c and
rtread_sysctl.c since they are no longer used.

Remove now unused functions from rt_netlink.[hc]

Signed-off-by: Robin Jarry <rjarry@redhat.com>
@rjarry rjarry force-pushed the evpn-dplane-reads branch from 140a08b to 5779d10 Compare April 8, 2026 13:39
@rjarry
Copy link
Copy Markdown
Author

rjarry commented Apr 8, 2026

It should be OK now. @greptileai was correct. I rebased on the latest master branch and removed the unused functions from rt_netlink.*

@rjarry
Copy link
Copy Markdown
Author

rjarry commented Apr 16, 2026

@mjstapp could you have a look?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

zebra: EVPN FDB/neighbor state sync bypasses dplane providers

2 participants