Tc dplane conversion#21883
Conversation
Greptile SummaryThis PR migrates TC qdisc kernel message handling from the main netlink thread to the dataplane thread, following the same pattern already used for FDB and neighbor reads. The startup cleanup of leftover zebra-owned qdiscs is decoupled into a master-thread policy handler (
Confidence Score: 4/5The architectural shift is clean and consistent with existing dplane patterns; the two nits are both in the error path of the startup TC read and do not affect the startup sequence completing correctly. The core logic — moving TC processing to the dplane thread, decoupling policy into the master thread, and the startup cleanup — is sound. The only issues are that kernel_read_tc_qdisc unconditionally stamps the ctx SUCCESS even when netlink_request_qdiscs returns an error, and a copy-paste comment mislabels DPLANE_OP_TC_QDISC_NOTIFY as a read. Neither affects runtime behavior since the op-code result is not inspected by callers, but the misleading status could confuse future debugging. zebra/tc_netlink.c (status handling in kernel_read_tc_qdisc) and zebra/zebra_dplane.h (wrong comment on DPLANE_OP_TC_QDISC_NOTIFY) Important Files Changed
Prompt To Fix All With AIFix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
zebra/zebra_dplane.h:228-229
The comment for `DPLANE_OP_TC_QDISC_NOTIFY` is copy-pasted from the one above and says "read" instead of "notification". It describes the wrong operation.
```suggestion
/* Traffic control qdisc kernel notification */
DPLANE_OP_TC_QDISC_NOTIFY,
```
### Issue 2 of 2
zebra/tc_netlink.c:823-827
When `netlink_request_qdiscs` returns an error the context status is still forced to `SUCCESS`. The failure path that already handles the nlsock-lookup error (above) correctly sets `FAILURE`; the same should happen here so the ctx status accurately reflects whether the kernel dump was requested successfully.
```suggestion
ret = netlink_request_qdiscs(nl, AF_UNSPEC, RTM_GETQDISC);
if (ret >= 0)
netlink_parse_info(netlink_qdisc_change, nl, dp_info, 0, true);
dplane_ctx_set_status(ctx, (ret >= 0) ? ZEBRA_DPLANE_REQUEST_SUCCESS
: ZEBRA_DPLANE_REQUEST_FAILURE);
```
Reviews (1): Last reviewed commit: "tests: Test startup of tc with tc_basic" | Re-trigger Greptile |
| /* Traffic control qdisc read */ | ||
| DPLANE_OP_TC_QDISC_NOTIFY, |
There was a problem hiding this comment.
The comment for
DPLANE_OP_TC_QDISC_NOTIFY is copy-pasted from the one above and says "read" instead of "notification". It describes the wrong operation.
| /* Traffic control qdisc read */ | |
| DPLANE_OP_TC_QDISC_NOTIFY, | |
| /* Traffic control qdisc kernel notification */ | |
| DPLANE_OP_TC_QDISC_NOTIFY, |
Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/zebra_dplane.h
Line: 228-229
Comment:
The comment for `DPLANE_OP_TC_QDISC_NOTIFY` is copy-pasted from the one above and says "read" instead of "notification". It describes the wrong operation.
```suggestion
/* Traffic control qdisc kernel notification */
DPLANE_OP_TC_QDISC_NOTIFY,
```
How can I resolve this? If you propose a fix, please make it concise.| ret = netlink_request_qdiscs(nl, AF_UNSPEC, RTM_GETQDISC); | ||
| if (ret >= 0) | ||
| netlink_parse_info(netlink_qdisc_change, nl, dp_info, 0, true); | ||
|
|
||
| ret = netlink_parse_info(netlink_qdisc_change, &zns->netlink_cmd, | ||
| &dp_info, 0, true); | ||
| if (ret < 0) | ||
| return ret; | ||
| dplane_ctx_set_status(ctx, ZEBRA_DPLANE_REQUEST_SUCCESS); |
There was a problem hiding this comment.
When
netlink_request_qdiscs returns an error the context status is still forced to SUCCESS. The failure path that already handles the nlsock-lookup error (above) correctly sets FAILURE; the same should happen here so the ctx status accurately reflects whether the kernel dump was requested successfully.
| ret = netlink_request_qdiscs(nl, AF_UNSPEC, RTM_GETQDISC); | |
| if (ret >= 0) | |
| netlink_parse_info(netlink_qdisc_change, nl, dp_info, 0, true); | |
| ret = netlink_parse_info(netlink_qdisc_change, &zns->netlink_cmd, | |
| &dp_info, 0, true); | |
| if (ret < 0) | |
| return ret; | |
| dplane_ctx_set_status(ctx, ZEBRA_DPLANE_REQUEST_SUCCESS); | |
| ret = netlink_request_qdiscs(nl, AF_UNSPEC, RTM_GETQDISC); | |
| if (ret >= 0) | |
| netlink_parse_info(netlink_qdisc_change, nl, dp_info, 0, true); | |
| dplane_ctx_set_status(ctx, (ret >= 0) ? ZEBRA_DPLANE_REQUEST_SUCCESS | |
| : ZEBRA_DPLANE_REQUEST_FAILURE); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: zebra/tc_netlink.c
Line: 823-827
Comment:
When `netlink_request_qdiscs` returns an error the context status is still forced to `SUCCESS`. The failure path that already handles the nlsock-lookup error (above) correctly sets `FAILURE`; the same should happen here so the ctx status accurately reflects whether the kernel dump was requested successfully.
```suggestion
ret = netlink_request_qdiscs(nl, AF_UNSPEC, RTM_GETQDISC);
if (ret >= 0)
netlink_parse_info(netlink_qdisc_change, nl, dp_info, 0, true);
dplane_ctx_set_status(ctx, (ret >= 0) ? ZEBRA_DPLANE_REQUEST_SUCCESS
: ZEBRA_DPLANE_REQUEST_FAILURE);
```
How can I resolve this? If you propose a fix, please make it concise.
mjstapp
left a comment
There was a problem hiding this comment.
just had one small nit about the new enum value.
a little broader question though: it seems a bit off that the small TC module would be the place that makes the global "READ FINISHED" decision. I'd have thought that the dplane might be the place where that decision got made - each category would do its read, their individual FINISHED events would pass back through the dplane, and it would then decide when all the required categories had been done.
| /* Traffic control qdisc read */ | ||
| DPLANE_OP_TC_QDISC_READ, | ||
|
|
||
| /* Traffic control qdisc read */ |
|
If you put the |
The zebra_dplane_startup_stage helper only ever needed the namespace id, not the full struct zebra_ns. Switch its parameter to ns_id_t directly so callers running in the dataplane pthread can invoke it without dereferencing a struct zebra_ns owned by the zebra main thread. Update all existing callers in if_netlink.c, if_sysctl.c, if_ioctl.c and kernel_socket.c to pass zns->ns_id. Signed-off-by: Donald Sharp <sharpd@nvidia.com> Co-authored-by: Cursor <cursoragent@cursor.com>
Modify the code to use the dplane for all RTNLGRP_TC netlink messages. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
This test script was not testing the read in of zebra owned tc qdisc values are read in and removed. Add this. Additionally convert the test to use run_and_expect instead of sleep. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
75b0004 to
2dc0bc7
Compare
No description provided.