Summary
The sockmap_basic/sockmap udp multi channels selftest has a race condition
that causes intermittent failures. The test was denylisted in both
kernel-patches/vmtest and libbpf/ci four days after its introduction.
The root cause is a timing gap between native-stack and BPF-redirected data
delivery for UDP sockets in a sockmap, combined with the test using a single
non-blocking recv call that expects all data to be available at once.
Failure Details
- Test / Component: sockmap_basic/sockmap udp multi channels
  (tools/testing/selftests/bpf/prog_tests/sockmap_basic.c)
- Frequency: Flaky (denylisted since 2026-01-28, 4 days after introduction)
- Failure mode: Intermittent; recv_timeout(p1) returns 2 instead of the expected 10
- Affected architectures: All (architecture-independent race condition)
- CI runs observed: None; the test was denylisted before CI runs could be collected. Denylist commits:
  - vmtest: 3e276e7c01 ("Denylist a flaky sockmap test", 2026-01-28)
  - libbpf/ci: synced from vmtest
Root Cause Analysis
The test (test_sockmap_multi_channels) creates two socket pairs and a sockmap
with a BPF verdict program that redirects ingress traffic from socket p0 to
socket p1. The test flow is:
- Send 2 bytes directly from c1 to p1 (native path, redirected to p1's own
  psock ingress by the verdict program)
- Send 8 bytes from c0 to p0 (BPF verdict redirects to p1's psock ingress)
- Wait for FIONREAD to report the expected bytes
- Call recv expecting all 10 bytes
For TCP, FIONREAD via tcp_bpf_ioctl returns msg_tot_len (total bytes
across all sk_msgs in the psock ingress queue). The test waits for
FIONREAD >= 10, ensuring all data has arrived before calling recv.
For UDP, FIONREAD via udp_bpf_ioctl calls sk_msg_first_len, which
returns only the first sk_msg's size (preserving datagram semantics). The
test sets expected = 2 for UDP, so wait_for_fionread returns as soon as
the first 2-byte message arrives — the 8 BPF-redirected bytes may not have
been enqueued yet.
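The difference between the two FIONREAD behaviours can be illustrated with a toy model. This is a simplified sketch and not kernel code: `struct toy_ingress_queue`, `toy_fionread_total`, and `toy_fionread_first` are hypothetical names that only mimic the "total across all messages" versus "first message only" semantics described above.

```c
#include <stddef.h>

/* Toy stand-in for a psock ingress queue: only the length of each queued
 * message is tracked. Hypothetical type, for illustration only. */
struct toy_ingress_queue {
	size_t msg_len[8];
	size_t count;
};

/* TCP-style FIONREAD: total bytes across all queued messages. */
static size_t toy_fionread_total(const struct toy_ingress_queue *q)
{
	size_t total = 0;

	for (size_t i = 0; i < q->count; i++)
		total += q->msg_len[i];
	return total;
}

/* UDP-style FIONREAD: length of the first queued message only. */
static size_t toy_fionread_first(const struct toy_ingress_queue *q)
{
	return q->count ? q->msg_len[0] : 0;
}
```

In this model, once the 2-byte message is queued the UDP-style value is already 2, and it stays 2 after the 8-byte message arrives. A wait on that value therefore says nothing about whether the second message has been enqueued.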
Then recv calls udp_bpf_recvmsg → sk_msg_recvmsg, which reads across
sk_msgs in the psock ingress queue but returns immediately with whatever is
available. If the second sk_msg (8 bytes) hasn't arrived:
- sk_msg_recvmsg reads 2 bytes from the first msg, finds no next msg, returns 2
- ASSERT_EQ(recvd, sizeof(buf)) fails: 2 != 10
Key code references:
- sk_msg_first_len() (include/linux/skmsg.h:571): UDP FIONREAD returns first msg only
- sk_psock_msg_inq() (include/linux/skmsg.h:557): TCP FIONREAD returns total
- __sk_msg_recvmsg() (net/core/skmsg.c:412): reads across msgs but doesn't block
- udp_bpf_recvmsg() (net/ipv4/udp_bpf.c:63): returns immediately if any data read
Proposed Fix
Replace the single recv call with a recv loop that accumulates data across
multiple recv_timeout calls. Each call uses IO_TIMEOUT_SEC (30s) to wait
for data to become available, so the BPF-redirected data has time to arrive.
This matches the pattern already used in test_sockmap_copied_seq (lines
1184-1185) for partial reads.
See: 0001-selftests-bpf-Fix-flaky-sockmap-udp-multi-channels-t.patch
After the fix is applied, sockmap_basic/sockmap udp multi channels should
be removed from the DENYLIST in both kernel-patches/vmtest and libbpf/ci.
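As a rough illustration of the accumulate-until-complete pattern described in this section, here is a minimal user-space sketch. It is not the content of the patch and does not use the selftest helpers: `recv_accumulate` is a hypothetical function built on plain poll() and recv(), and the per-chunk timeout is passed in milliseconds as an assumption of this example.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not from the selftests or the patch): keep receiving
 * until `len` bytes have accumulated, waiting up to timeout_ms for each chunk.
 * Returns the number of bytes accumulated, or -1 with errno set on error or
 * when a chunk does not arrive within the timeout.
 */
static ssize_t recv_accumulate(int fd, void *buf, size_t len, int timeout_ms)
{
	char *p = buf;
	size_t total = 0;

	while (total < len) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		int ready = poll(&pfd, 1, timeout_ms);
		ssize_t n;

		if (ready < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (ready == 0) {
			errno = ETIMEDOUT;
			return -1;
		}

		/* Take whatever is available now; a short read is expected. */
		n = recv(fd, p + total, len - total, MSG_DONTWAIT);
		if (n < 0) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return -1;
		}
		if (n == 0)
			break; /* peer closed or empty datagram: stop early */
		total += (size_t)n;
	}
	return (ssize_t)total;
}
```

The key design point is that a short read is treated as progress rather than failure, so a late-arriving second message is picked up by the next iteration instead of tripping the length assertion.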
Impact
Without the fix, this test remains denylisted and provides no CI coverage for
the UDP multi-channel sockmap data path (native + BPF-redirect). This code
path was introduced alongside the FIONREAD fix (929e30f93125) and is not
exercised by any other test.
References
- FIONREAD fix: 929e30f93125 ("bpf, sockmap: Fix FIONREAD for sockmap")
- Test introduction: 17e2ce02bf56 ("selftests/bpf: Add tests for FIONREAD and copied_seq")
- Denylist: 3e276e7c01 ("Denylist a flaky sockmap test", 2026-01-28)
- sk_msg_first_len vs sk_psock_msg_inq: include/linux/skmsg.h:557-587