Skip to content

Obj relationship v5#12125

Closed
ameryhung wants to merge 14 commits into
kernel-patches:bpf-next_basefrom
ameryhung:obj_relationship_v5
Closed

Obj relationship v5#12125
ameryhung wants to merge 14 commits into
kernel-patches:bpf-next_basefrom
ameryhung:obj_relationship_v5

Conversation

@ameryhung
Copy link
Copy Markdown
Contributor

No description provided.

ameryhung added 8 commits May 18, 2026 16:05
Rename mark_stack_slot_obj_read() as mark_stack_slots_scratched() and
directly call it from functions processing iter, dynptr and irq_flag.
Commit 6762e3a ("bpf: simplify liveness to use (callsite, depth)
keyed func_instances") has removed the dynamic liveness component in
mark_stack_slot_obj_read(). The function effectively only marks stack
slots as scratched and always succeed. Therefore, return void, drop the
unused bpf_reg_state argument and rename it to
mark_stack_slots_scratched() to reflect what it does now.

In addition, to prepare for unifying dynptr handling, dynptr_get_spi()
will be moved out of mark_dynptr_read(). As mark_dynptr_read() would join
mark_iter_read() as a thin wrapper of mark_stack_slots_scratched(), just
open code these helpers.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Simplify dynptr checking for helper and kfunc by unifying it. Remember
the initialized dynptr (i.e.,g !(arg_type |= MEM_UNINIT)) pass to a
dynptr kfunc during process_dynptr_func() so that we can easily
retrieve the information for verification later. By saving it in
meta->dynptr, there is no need to call dynptr helpers such as
dynptr_id(), dynptr_ref_obj_id() and dynptr_type() in check_func_arg().

Remove and open code the helpers in process_dynptr_func() when
saving id, ref_obj_id, and type.

Besides, since dynptr ref_obj_id information is now pass around in
meta->bpf_dynptr_desc, drop the check in helper_multiple_ref_obj_use.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Acked-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Assign reg->id when getting referenced kptr from read program context
to be consistent with R0 of KF_ACQUIRE kfunc. skb dynptr will track the
referenced skb in qdisc programs using a new field reg->parent_id in
a later patch.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Preserve reg->id of pointer objects after null-checking the register so
that children objects derived from it can still refer to it in the new
object relationship tracking mechanism introduced in a later patch. This
change incurs a slight increase in the number of states in one selftest
bpf object, rbtree_search.bpf.o. For Meta bpf objects, the increase of
states is also negligible.

Selftest BPF objects with insns_diff > 0

Program                   Insns (A)  Insns (B)  Insns   (DIFF)  States (A)  States (B)  States (DIFF)
------------------------  ---------  ---------  --------------  ----------  ----------  -------------
rbtree_search                  6820       7326   +506 (+7.42%)         379         398   +19 (+5.01%)

Meta BPF objects with insns_diff > 0

Program                   Insns (A)  Insns (B)  Insns   (DIFF)  States (A)  States (B)  States (DIFF)
------------------------  ---------  ---------  --------------  ----------  ----------  -------------
ned_imex_be_tclass               52         57     +5 (+9.62%)           5           6   +1 (+20.00%)
ned_imex_be_tclass               52         57     +5 (+9.62%)           5           6   +1 (+20.00%)
ned_skop_auto_flowlabel         523        526     +3 (+0.57%)          39          40    +1 (+2.56%)
ned_skop_mss                    289        292     +3 (+1.04%)          20          20    +0 (+0.00%)
ned_skopt_bet_classifier         78         82     +4 (+5.13%)           8           8    +0 (+0.00%)
dctcp_update_alpha              252        320   +68 (+26.98%)          21          27   +6 (+28.57%)
dctcp_update_alpha              252        320   +68 (+26.98%)          21          27   +6 (+28.57%)
ned_ts_func                     119        126     +7 (+5.88%)           6           7   +1 (+16.67%)
tw_egress                      1119       1128     +9 (+0.80%)          95          96    +1 (+1.05%)
tw_ingress                     1128       1137     +9 (+0.80%)          95          96    +1 (+1.05%)
tw_tproxy_router               4380       4465    +85 (+1.94%)         114         118    +4 (+3.51%)
tw_tproxy_router4              3093       3170    +77 (+2.49%)          83          88    +5 (+6.02%)
ttls_tc_ingress               34656      35717  +1061 (+3.06%)         936         970   +34 (+3.63%)
tw_twfw_egress               222327     222338    +11 (+0.00%)       10563       10564    +1 (+0.01%)
tw_twfw_ingress               78295      78299     +4 (+0.01%)        3825        3826    +1 (+0.03%)
tw_twfw_tc_eg                222839     222859    +20 (+0.01%)       10584       10585    +1 (+0.01%)
tw_twfw_tc_in                 78295      78299     +4 (+0.01%)        3825        3826    +1 (+0.03%)
tw_twfw_egress                 8080       8085     +5 (+0.06%)         456         456    +0 (+0.00%)
tw_twfw_ingress                8053       8056     +3 (+0.04%)         454         454    +0 (+0.00%)
tw_twfw_tc_eg                  8154       8174    +20 (+0.25%)         456         457    +1 (+0.22%)
tw_twfw_tc_in                  8060       8063     +3 (+0.04%)         455         455    +0 (+0.00%)
tw_twfw_egress               222327     222338    +11 (+0.00%)       10563       10564    +1 (+0.01%)
tw_twfw_ingress               78295      78299     +4 (+0.01%)        3825        3826    +1 (+0.03%)
tw_twfw_tc_eg                222839     222859    +20 (+0.01%)       10584       10585    +1 (+0.01%)
tw_twfw_tc_in                 78295      78299     +4 (+0.01%)        3825        3826    +1 (+0.03%)
tw_twfw_egress                 8080       8085     +5 (+0.06%)         456         456    +0 (+0.00%)
tw_twfw_ingress                8053       8056     +3 (+0.04%)         454         454    +0 (+0.00%)
tw_twfw_tc_eg                  8154       8174    +20 (+0.25%)         456         457    +1 (+0.22%)
tw_twfw_tc_in                  8060       8063     +3 (+0.04%)         455         455    +0 (+0.00%)

Looking into rbtree_search, the reason for such increase is that the
verifier has to explore the main loop shown below for one more iteration
until state pruning decides the current state is safe.

long rbtree_search(void *ctx)
{
	...
	bpf_spin_lock(&glock0);
	rb_n = bpf_rbtree_root(&groot0);
	while (can_loop) {
		if (!rb_n) {
			bpf_spin_unlock(&glock0);
			return __LINE__;
		}

		n = rb_entry(rb_n, struct node_data, r0);
		if (lookup_key == n->key0)
			break;
		if (nr_gc < NR_NODES)
			gc_ns[nr_gc++] = rb_n;
		if (lookup_key < n->key0)
			rb_n = bpf_rbtree_left(&groot0, rb_n);
		else
			rb_n = bpf_rbtree_right(&groot0, rb_n);
	}
	...
}

Below is what the verifier sees at the start of each iteration
(65: may_goto) after preserving id of rb_n. Without id of rb_n, the
verifier stops exploring the loop at iter 16.

           rb_n  gc_ns[15]
iter 15    257   257

iter 16    290   257    rb_n: idmap add 257->290
                        gc_ns[15]: check 257 != 290 --> state not equal

iter 17    325   257    rb_n: idmap add 290->325
                        gc_ns[15]: idmap add 257->257 --> state safe

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Refactor object relationship tracking in the verifier and fix a dynptr
use-after-free bug where file/skb dynptrs are not invalidated when the
parent referenced object is freed.

Add parent_id to bpf_reg_state to precisely track child-parent
relationships. A child object's parent_id points to the parent object's
id. This replaces the PTR_TO_MEM-specific dynptr_id and does not
increase the size of bpf_reg_state on 64-bit machines as there is
existing padding.

When calling dynptr constructors (i.e., process_dynptr_func() with
MEM_UNINIT argument), track the parent's id if the parent is a
referenced object. This only applies to file dynptr and skb dynptr,
so only pass parent reg->id to kfunc constructors.

For release_reference(), invalidating an object now also invalidates
all descendants by traversing the object tree. This is done using
stack-based DFS to avoid recursive call chains of release_reference() ->
unmark_stack_slots_dynptr() -> release_reference(). Referenced objects
encountered during tree traversal cannot be indirectly released. They
require an explicit helper/kfunc call to release the acquired resources.

While the new design changes how object relationships are tracked in
the verifier, it does not change the verifier's behavior. Here is the
implication for dynptr, pointer casting, and owning/non-owning
references:

Dynptr:

When initializing a dynptr, referenced dynptrs acquire a reference for
ref_obj_id. If the dynptr has a referenced parent, parent_id tracks the
parent's id. When cloning, ref_obj_id and parent_id are copied from the
original. Releasing a referenced dynptr via release_reference(ref_obj_id)
invalidates all clones and derived slices. For non-referenced dynptrs,
only the specific dynptr and its children are invalidated.

Pointer casting:

Referenced socket pointers and their casted counterparts share the same
lifetime but have different nullness — they have different id but the
same ref_obj_id.

Owning to non-owning reference conversion:

After converting owning to non-owning by clearing ref_obj_id (e.g.,
object(id=1, ref_obj_id=1) -> object(id=1, ref_obj_id=0)), the
verifier only needs to release the reference state, so it calls
release_reference_nomark() instead of release_reference().

Note that the error message "reference has not been acquired before" in
the helper and kfunc release paths is removed. This message was already
unreachable. The verifier only calls release_reference() after
confirming meta.ref_obj_id is valid, so the condition could never
trigger in practice (no selftest exercises it either). With the
refactor, release_reference() can now be called with non-acquired ids
and have different error conditions. Report directly in
release_reference() instead.

Fixes: 870c285 ("bpf: net_sched: Add basic bpf qdisc kfuncs")
Signed-off-by: Amery Hung <ameryhung@gmail.com>
unmark_stack_slots_dynptr() already makes sure that CONST_PTR_TO_DYNPTR
cannot be released. process_dynptr_func() also prevents passing
uninitialized dynptr to helpers expecting initialized dynptr. Now that
unmark_stack_slots_dynptr() also reports error returned from
release_reference(), there should be no reason to keep these redundant
checks.

Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
Helpers and kfuncs independently tracked referenced object metadata (id,
ref_obj_id) using separate fields in their respective arg_meta structs.
This led to duplicated logic and inconsistent error handling between the
two paths.

Introduce struct ref_obj_desc to consolidate these fields along with a
count of how many arguments carry a reference. Add update_ref_obj() to
populate it from a bpf_reg_state, replacing open-coded assignments in
check_func_arg(), check_kfunc_args(), and process_iter_arg(). Add
validate_ref_obj() to check for ambiguous ref_obj before using it.

For ref_obj releasing helpers and kfuncs, keep checking it before
calling update_ref_obj() for now. A later patch will make these
functions not depending on ref_obj. For other users of ref_obj, move the
checks to the use locations. For helper, this means moving the checks
inside helper_multiple_ref_obj_use() to use locations.
is_acquire_function() is dropped as ref_obj is never used.

Pass ref_obj_desc into process_dynptr_func()/mark_stack_slots_dynptr()
instead of a bare parent_id to make it less confusing.

Drop the selftest introduced in 7ec899a (“selftests/bpf: Negative
test case for ref_obj_id in args”) since the verifier no longer
complains about ambiguous ref_obj if it is not used.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Introduce release_reg() to consolidate the release logic shared by both
helpers and kfuncs: dynptr release, kptr_xchg percpu-to-RCU conversion,
regular reference release, and NULL pass-through. NULL pass-through is
only allowed if the prototype indicates the argument may be null.

Determine release_regno from the function prototype/metadata before
argument checking, rather than discovering it dynamically during
argument processing. For helpers, scan the arg_type array in
check_func_proto() via check_proto_release_reg(). For kfuncs, set
release_regno to BPF_REG_1 in bpf_fetch_kfunc_arg_meta() when
KF_RELEASE is set. In the future when we start adding decl_tag to
kfunc arguments, we can just look at the function prototype instead
of a release_regno.

Extract ref_convert_alloc_rcu_protected() and
invalidate_rcu_protected_refs() to make it more clear what the code is
doing. For ref_convert_alloc_rcu_protected(), it pre-converts
MEM_ALLOC | MEM_PERCPU registers to MEM_RCU (clearing ref_obj_id so they
survive), then calls release_reference() to invalidate the remaining
registers and release the reference state.

Add KF_RELEASE to bpf_dynptr_file_discard() so its release_regno is set
via fetch_kfunc_meta rather than being assigned manually in the dynptr
argument processing. Set arg_type to ARG_PTR_TO_DYNPTR for
KF_ARG_PTR_TO_DYNPTR so that check_func_arg_reg_off() correctly allows
non-zero stack offsets for dynptr release arguments same as helper.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Remove ref_obj_id from bpf_reg_state by folding its role into the
existing id field. Previously, id tracked pointer identity for null
checking while ref_obj_id tracked the owning reference for lifetime
management. These are now unified: acquire helpers and kfuncs set id
to the acquired reference id, and release paths use id directly.

To handle cases where objects share the same lifetime but need distinct
identities, pointer casting (bpf_sk_fullsock, bpf_tcp_sock) and
referenced dynptr clones, introduce virtual references. A virtual
reference is a bpf_reference_state entry with is_virtual=true that
serves as a lifetime anchor. It has no backing register or stack slot
and exists only in acquired_refs.

For pointer casting, the first cast from a referenced pointer creates a
virtual reference, converts the source to a non-referenced pointer with
parent_id pointing to the virtual ref, and gives the cast result its own
unique id with the same parent_id. Chained casts reuse the same virtual
ref. Each pointer retains a unique id for independent null checking via
mark_ptr_or_null_regs, while releasing any of them releases the shared
virtual ref and invalidates all siblings.

For referenced dynptrs, the constructor creates a virtual reference
instead of a regular one. All clones share the same parent_id (the
virtual ref) but get unique ids for independent slice tracking.
Releasing a referenced dynptr releases the virtual reference, which in
turn invalidates all clones and their derived slices.

The release_reference() DFS is extended to detect virtual refs whose
parent is being released and report them as leaked references, matching
the existing behavior for regular refs encountered during cascading.

Also add reg_is_referenced() which checks if a register is referenced
either directly (id in the reference table) or indirectly (parent_id
points to a virtual ref). This replaces all former ref_obj_id checks.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
@ameryhung ameryhung force-pushed the obj_relationship_v5 branch from d8fe505 to 7b59c04 Compare May 19, 2026 16:30
ameryhung added 5 commits May 19, 2026 10:12
When checking whether a referenced dynptr can be overwritten,
destroy_if_dynptr_stack_slot only counted sibling dynptrs in the
current call frame. If a clone sharing the same virtual ref parent
existed in a different frame (e.g., passed to a subprog), it would
not be counted, causing the verifier to incorrectly reject the
overwrite with "cannot overwrite referenced dynptr".

Fix by extracting the counting into dynptr_ref_cnt() which uses
bpf_for_each_reg_in_vstate_mask() to scan dynptr stack slots across
all call frames.

Fixes: 017f5c4 ("bpf: Allow overwriting referenced dynptr when refcnt > 1")
Reported-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Amery Hung <ameryhung@gmail.com>
The verifier currently does not allow creating dynptr from dynptr data
or slice. Add a selftest to test this explicitly.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
Make sure the verifier invalidates the dynptr and dynptr slice derived
from an skb after the skb is freed.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
The parent object of a cloned dynptr is skb not the original dynptr.
Invalidate the original dynptr should not prevent the program from
using the slice derived from the cloned dynptr.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
…dropped

File dynptr and slice should be invalidated when the parent file's
reference is dropped in the program. Without the verifier tracking
dyntpr's parent referenced object, the dynptr would continute to be
incorrectly used even if the underlying file is being tear down or gone.

Signed-off-by: Amery Hung <ameryhung@gmail.com>
@ameryhung ameryhung force-pushed the obj_relationship_v5 branch from 7b59c04 to 79485ba Compare May 19, 2026 17:12
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 4 times, most recently from 01a7a7f to f1eeafa Compare May 25, 2026 15:55
@kernel-patches-daemon-bpf
Copy link
Copy Markdown

Automatically cleaning up stale PR; feel free to reopen if needed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant