bpf, sockmap: keep sk_msg copy state in sync by kernel-patches-daemon-bpf[bot] · Pull Request #12137 · kernel-patches/bpf

kernel-patches-daemon-bpf · 2026-05-20T10:33:38Z

Pull request for series with
subject: bpf, sockmap: keep sk_msg copy state in sync
version: 3
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=1097929

kernel-patches-daemon-bpf · 2026-05-20T10:33:40Z

Upstream branch: 523d2f4
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1097929
version: 3

kernel-patches-review-bot · 2026-05-20T11:00:18Z

commit 0a58a0c24cc61549577f9d37ffa32a979c3425c2
Author: Zhang Cen <rollkingzzc@gmail.com>

bpf, sockmap: keep sk_msg copy state in sync

SK_MSG uses msg->sg.copy as per-scatterlist-entry provenance. Entries
with this bit set are copied before data/data_end are exposed to SK_MSG
BPF programs for direct packet access.

bpf_msg_pull_data(), bpf_msg_push_data(), and bpf_msg_pop_data()
rewrite the sk_msg scatterlist ring by collapsing, splitting, and
shifting entries. These operations move msg->sg.data[] entries, but the
parallel copy bitmap can be left behind on the old slot. A copied entry
can then return to msg->sg.start with its copy bit clear and be exposed
as directly writable packet data.

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 9590877b0714..223d78708486 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c

[ ... ]

> @@ -2794,6 +2835,8 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>  {
>  	struct scatterlist sge, nsge, nnsge, rsge = {0}, *psge;
>  	u32 new, i = 0, l = 0, space, copy = 0, offset = 0;
> +	bool sge_copy = false, nsge_copy = false, nnsge_copy = false;
> +	bool rsge_copy = false;
>  	u8 *raw, *to, *from;
>  	struct page *page;
>
> @@ -2866,6 +2909,7 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>  			sk_msg_iter_var_prev(i);
>  		psge = sk_msg_elem(msg, i);
>  		rsge = sk_msg_elem_cpy(msg, i);
> +		rsge_copy = sk_msg_elem_is_copy(msg, i);
>
>  		psge->length = start - offset;
>  		rsge.length -= psge->length;

Could copy + len overflow and cause an out-of-bounds write? In
bpf_msg_push_data(), the len parameter is an unbounded u32 value passed
from a bpf program. When calculating the allocation size, copy + len can
overflow a 32-bit integer:

    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
                       get_order(copy + len));

If copy + len wraps around, alloc_pages() will allocate an undersized
buffer. Later, the destination pointer for shifting tail data is
calculated as:

    to = raw + front + len

Due to len being a large value, front + len wraps around and results in a
large positive 32-bit offset (zero-extended to 64-bit). The subsequent
memcpy(to, from, back) then performs a controlled out-of-bounds write well
past the allocated page.

Does bpf_msg_push_data() leak uninitialized kernel memory? The function
allocates a new memory page:

    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
                       get_order(copy + len));

Since the allocation does not include the __GFP_ZERO flag, the new len
bytes inserted into the message are never explicitly zeroed or initialized.
If the bpf program does not manually overwrite these bytes and the message
is transmitted over a socket, this could leak uninitialized kernel memory
to the network or userspace.

Could the ring buffer state become corrupted when data is appended exactly
at the end of the message and the ring buffer is full?

If data is appended exactly at the end of the message (start == offset +
l), the loop finishes with i pointing to the unused slot msg->sg.end. If
the ring buffer is full (!space), the coalescing fallback logic reads:

    copy = msg->sg.data[i].length

Because i points to an unused slot, its length is 0, making copy == 0, and
the fallback coalescing block is bypassed. The function then
unconditionally calls:

    sk_msg_iter_next(msg, end)

which increments msg->sg.end. Since the ring buffer was already full,
advancing msg->sg.end causes it to wrap around and equal msg->sg.start.
This could corrupt the ring buffer state and overwrite the first element
without calling put_page(), leading to page leaks.

[ ... ]

> @@ -2891,23 +2935,34 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>  	/* Shift one or two slots as needed */
>  	sge = sk_msg_elem_cpy(msg, new);
>  	sg_unmark_end(&sge);
> +	sge_copy = sk_msg_elem_is_copy(msg, new);
>
>  	nsge = sk_msg_elem_cpy(msg, i);
> +	nsge_copy = sk_msg_elem_is_copy(msg, i);
>  	if (rsge.length) {
>  		sk_msg_iter_var_next(i);
>  		nnsge = sk_msg_elem_cpy(msg, i);
> +		nnsge_copy = sk_msg_elem_is_copy(msg, i);
>  		sk_msg_iter_next(msg, end);
>  	}

Should rsge.offset be adjusted after splitting? When rsge is created
from a split entry, the code does not include:

    rsge.offset += start - offset;

to correctly adjust the offset. The rsge points to the tail portion of
the split entry but retains the original offset, which appears incorrect.
This could cause the residual entry to point to the wrong data within
the page.

[ ... ]

AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

In-Reply-To-Subject: bpf, sockmap: keep sk_msg copy state in sync
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26156967283

kernel-patches-daemon-bpf · 2026-05-20T11:09:36Z

Forwarding comment 4497664011 via email
In-Reply-To: 20260520102715.3033936-1-rollkingzzc@gmail.com
Patch: https://patchwork.kernel.org/project/netdevbpf/patch/20260520102715.3033936-1-rollkingzzc@gmail.com/

kernel-patches-daemon-bpf · 2026-05-20T16:13:20Z

Upstream branch: 523d2f4
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1097929
version: 3

kernel-patches-daemon-bpf · 2026-05-20T16:40:50Z

Upstream branch: 3db0419
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1097929
version: 3

kernel-patches-daemon-bpf · 2026-05-21T10:02:13Z

Upstream branch: b1fcdf9
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=1097929
version: 3

SK_MSG uses msg->sg.copy as per-scatterlist-entry provenance. Entries with this bit set are copied before data/data_end are exposed to SK_MSG BPF programs for direct packet access. bpf_msg_pull_data(), bpf_msg_push_data(), and bpf_msg_pop_data() rewrite the sk_msg scatterlist ring by collapsing, splitting, and shifting entries. These operations move msg->sg.data[] entries, but the parallel copy bitmap can be left behind on the old slot. A copied entry can then return to msg->sg.start with its copy bit clear and be exposed as directly writable packet data. This corruption path requires an attached SK_MSG BPF program that calls the mutating helpers; ordinary sockmap/TLS traffic that never runs push/pop/pull helper sequences is not affected. Keep msg->sg.copy synchronized with scatterlist entry moves, preserve the copy bit when an entry is split, clear it when a helper replaces an entry with a private page, and clear slots vacated by pull-data compaction. Fixes: 015632b ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Fixes: 6fff607 ("bpf: sk_msg program helper bpf_msg_push_data") Fixes: 7246d8e ("bpf: helper to pop data from messages") Cc: stable@vger.kernel.org Co-developed-by: Han Guidong <2045gemini@gmail.com> Signed-off-by: Han Guidong <2045gemini@gmail.com> Signed-off-by: Zhang Cen <rollkingzzc@gmail.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com>

kernel-patches-daemon-bpf Bot added new bpf-next V3 labels May 20, 2026

kernel-patches-review-bot Bot added the ai-review label May 20, 2026

kernel-patches-daemon-bpf Bot added V3-ci-fail and removed ai-review labels May 20, 2026

kernel-patches-daemon-bpf Bot force-pushed the series/1097929=>bpf-next branch from 0a58a0c to 8fdf7da Compare May 20, 2026 16:13

kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch from ceeb3aa to ec31e3e Compare May 20, 2026 16:39

kernel-patches-daemon-bpf Bot force-pushed the series/1097929=>bpf-next branch from 8fdf7da to 970310f Compare May 20, 2026 16:40

kernel-patches-daemon-bpf Bot added V3-ci-pass and removed V3-ci-fail labels May 20, 2026

kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch from ec31e3e to b3beebb Compare May 21, 2026 10:00

kernel-patches-daemon-bpf Bot force-pushed the series/1097929=>bpf-next branch from 970310f to f0c74f5 Compare May 21, 2026 10:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bpf, sockmap: keep sk_msg copy state in sync#12137

bpf, sockmap: keep sk_msg copy state in sync#12137
kernel-patches-daemon-bpf[bot] wants to merge 1 commit into
bpf-next_basefrom
series/1097929=>bpf-next

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-review-bot Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants

Conversation

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-review-bot Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 20, 2026

Uh oh!

kernel-patches-daemon-bpf Bot commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

0 participants