
B4/timer nolock #10629

Closed
mykyta5 wants to merge 9 commits into kernel-patches:bpf-next_base from
mykyta5:b4/timer_nolock

Conversation

@mykyta5

@mykyta5 mykyta5 commented Jan 6, 2026

No description provided.

@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 3 times, most recently from e7b5368 to 1e194b1 Compare January 7, 2026 05:14
@mykyta5 mykyta5 force-pushed the b4/timer_nolock branch 3 times, most recently from 894d24f to f591629 Compare January 7, 2026 17:14
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 10 times, most recently from 0b35e5a to f0ad505 Compare January 14, 2026 03:44
@mykyta5 mykyta5 force-pushed the b4/timer_nolock branch 3 times, most recently from 47c3bc1 to f37bab3 Compare January 14, 2026 18:06
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 2 times, most recently from 5fad8c5 to f61cb76 Compare January 15, 2026 03:15
@mykyta5 mykyta5 force-pushed the b4/timer_nolock branch 3 times, most recently from 8118549 to d3577f3 Compare January 15, 2026 18:01
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 5 times, most recently from be3c2bf to 8709eff Compare January 16, 2026 23:11
@mykyta5 mykyta5 force-pushed the b4/timer_nolock branch 2 times, most recently from ea745e1 to d9ab17a Compare January 22, 2026 17:28
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 11 times, most recently from 91d46f6 to 9624cf2 Compare January 27, 2026 02:57
This series reworks the implementation of the BPF timer and workqueue APIs.
The goal is to make both timers and workqueues non-blocking, enabling their
use in NMI context.
Today this code relies on a bpf_spin_lock embedded in the map element to
serialize:
 * init of the async object,
 * setting/changing the callback and bpf_prog
 * starting/cancelling the timer/work
 * tearing down when the map element is deleted or the map’s user ref is
 dropped

The basic design approach in this series:
 * Use irq_work to offload all blocking work from NMI
 * Introduce a refcount to guarantee the lifetime of bpf_async_cb structs
 whose processing is deferred to potentially multiple irq_work callbacks
 * Keep objects under RCU protection to make sure they are not freed
 while kfuncs/helpers access them (we can't use the refcnt for this, as
 the refcnt itself is part of the bpf_async_cb struct)
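
The refcount part of this design can be illustrated in user space. The following is only a minimal sketch using C11 atomics, not the kernel code: struct async_cb, async_cb_get(), async_cb_put() and deferred_work() are hypothetical stand-ins, and the RCU grace periods that the series waits for before freeing are left out entirely.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for bpf_async_cb; only the refcount matters here. */
struct async_cb {
	atomic_int refcnt;
	int payload;
};

/* Counts frees so the demo can observe the lifetime (single-threaded demo). */
static int freed_count;

static struct async_cb *async_cb_new(int payload)
{
	struct async_cb *cb = malloc(sizeof(*cb));

	if (!cb)
		return NULL;
	atomic_init(&cb->refcnt, 1); /* reference owned by the map element */
	cb->payload = payload;
	return cb;
}

/* Take a reference only if the object is still alive (count > 0). */
static bool async_cb_get(struct async_cb *cb)
{
	int old = atomic_load(&cb->refcnt);

	while (old > 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&cb->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; whoever drops the last one frees the object. */
static void async_cb_put(struct async_cb *cb)
{
	int old = atomic_fetch_sub(&cb->refcnt, 1);

	assert(old > 0); /* a put without a matching get would underflow */
	if (old == 1) {
		freed_count++;
		free(cb);
	}
}

/* Deferred callback: consumes the reference taken when it was queued. */
static int deferred_work(struct async_cb *cb)
{
	int seen = cb->payload;

	async_cb_put(cb);
	return seen;
}

/* The map holds one reference, two queued callbacks hold one each. */
static int lifetime_demo(void)
{
	struct async_cb *cb = async_cb_new(7);

	if (!cb || !async_cb_get(cb) || !async_cb_get(cb))
		return -1;
	async_cb_put(cb);            /* map element deleted: drops its ref */
	if (freed_count != 0)        /* callbacks still pin the object */
		return -1;
	if (deferred_work(cb) != 7)  /* first callback, object still alive */
		return -1;
	if (deferred_work(cb) != 7)  /* second callback drops the last ref */
		return -1;
	return freed_count;
}
```

In this sketch the object outlives the map's reference and is freed exactly once, by the last deferred callback to finish.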

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>

---
Changes in v8:
- Return -EBUSY in bpf_async_read_op() if last_seq fails to be set
- In bpf_async_cancel_and_free() drop bpf_async_cb ref after calling bpf_async_process()
- Link to v7: https://lore.kernel.org/r/20260122-timer_nolock-v7-0-04a45c55c2e2@meta.com

Changes in v7:
- Addressed Andrii's review points from the previous version - nothing
  very significant.
- Added NMI stress tests for bpf_timer - hit a few failing verifier checks
  and removed them.
- Addressed a sparse warning in bpf_async_update_prog_callback()
- Link to v6: https://lore.kernel.org/r/20260120-timer_nolock-v6-0-670ffdd787b4@meta.com

Changes in v6:
- Reworked destruction and refcnt use:
  - On cancel_and_free() set last_seq to BPF_ASYNC_DESTROY value, drop
    map's reference
  - In irq work callback, atomically switch DESTROY to DESTROYED, cancel
    timer/wq
  - Free bpf_async_cb on refcnt going to 0.
- Link to v5: https://lore.kernel.org/r/20260115-timer_nolock-v5-0-15e3aef2703d@meta.com
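
The DESTROY to DESTROYED handoff described for v6 can be sketched with a single compare-and-exchange. This is an illustrative user-space sketch under stated assumptions: the sentinel values, struct async_state, request_destroy() and process_destroy() are hypothetical, and the actual encoding of BPF_ASYNC_DESTROY is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel values; the real encoding may differ. */
#define ASYNC_DESTROY   UINT64_MAX
#define ASYNC_DESTROYED (UINT64_MAX - 1)

struct async_state {
	_Atomic uint64_t last_seq;
	int cancel_calls; /* stands in for cancelling the timer/wq */
};

/* cancel_and_free() side: request teardown by publishing the sentinel. */
static void request_destroy(struct async_state *s)
{
	atomic_store(&s->last_seq, ASYNC_DESTROY);
}

/*
 * Deferred-callback side: switch DESTROY to DESTROYED atomically.
 * Only the caller that wins the exchange performs the cancel, so
 * several callbacks racing here cancel at most once.
 */
static bool process_destroy(struct async_state *s)
{
	uint64_t expected = ASYNC_DESTROY;

	if (!atomic_compare_exchange_strong(&s->last_seq, &expected,
					    ASYNC_DESTROYED))
		return false;
	s->cancel_calls++;
	return true;
}

static int destroy_demo(void)
{
	struct async_state s;
	bool won;

	atomic_init(&s.last_seq, 0);
	s.cancel_calls = 0;

	won = process_destroy(&s); /* nothing requested yet */
	assert(!won);

	request_destroy(&s);
	won = process_destroy(&s); /* first callback wins and cancels */
	assert(won);
	won = process_destroy(&s); /* second callback sees DESTROYED */
	assert(!won);

	return s.cancel_calls;
}
```

The point of the two-state scheme in this sketch is that the teardown request is a plain store, which is safe from any context, while the blocking cancel happens exactly once on the deferred side.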

Changes in v5:
- Extracted the lock-free algorithm for updating cb->prog and
cb->callback_fn into a function, bpf_async_update_prog_callback(), and
added a new commit that introduces this function and uses it in
__bpf_async_set_callback(), bpf_timer_cancel() and
bpf_async_cancel_and_free().
This allows moving the change into a separate commit without breaking
correctness.
- Handle NULL prog in bpf_async_update_prog_callback().
- Link to v4: https://lore.kernel.org/r/20260114-timer_nolock-v4-0-fa6355f51fa7@meta.com

Changes in v4:
- Handle irq_work_queue failures in both the schedule and cancel_and_free
paths: introduced bpf_async_refcnt_dec_cleanup(), which decrements the
refcnt and makes sure that, if the last reference is put, at least one
irq_work is scheduled to execute the final cleanup.
- Additional refcnt inc/dec in set_callback() plus an RCU lock to make sure
cleanup is not running at the same time as set_callback().
- Added READ_ONCE where it was needed.
- Squashed the 'bpf: Refactor __bpf_async_set_callback()' commit into 'bpf:
Add lock-free cell for NMI-safe async operations'
- Removed mpmc_cell, use seqcount_latch_t instead.
- Link to v3: https://lore.kernel.org/r/20260107-timer_nolock-v3-0-740d3ec3e5f9@meta.com

Changes in v3:
- Major rework
- Introduce mpmc_cell, allowing concurrent writes and reads
- Implement irq_work deferring
- Add selftests
- Introduce bpf_timer_cancel_async kfunc
- Link to v2: https://lore.kernel.org/r/20251105-timer_nolock-v2-0-32698db08bfa@meta.com

Changes in v2:
- Move refcnt initialization and put (from cancel_and_free())
from patch 5 into patch 4, so that patch 4 has a clearer and more complete
implementation and use of refcnt
- Link to v1: https://lore.kernel.org/r/20251031-timer_nolock-v1-0-b064ae403bfb@meta.com

--- b4-submit-tracking ---
{
  "series": {
    "revision": 8,
    "change-id": "20251028-timer_nolock-457f5b9daace",
    "prefixes": [
      "bpf-next"
    ],
    "history": {
      "v1": [
        "20251031-timer_nolock-v1-0-b064ae403bfb@meta.com"
      ],
      "v2": [
        "20251105-timer_nolock-v2-0-32698db08bfa@meta.com"
      ],
      "v3": [
        "20260107-timer_nolock-v3-0-740d3ec3e5f9@meta.com"
      ],
      "v4": [
        "20260114-timer_nolock-v4-0-fa6355f51fa7@meta.com"
      ],
      "v5": [
        "20260115-timer_nolock-v5-0-15e3aef2703d@meta.com"
      ],
      "v6": [
        "20260120-timer_nolock-v6-0-670ffdd787b4@meta.com"
      ],
      "v7": [
        "20260122-timer_nolock-v7-0-04a45c55c2e2@meta.com"
      ]
    }
  }
}
Refactor bpf timer and workqueue helpers to allow calling them from NMI
context by making all operations lock-free and deferring NMI-unsafe
work to irq_work.

Previously, bpf_timer_start() and bpf_wq_start()
could not be called from NMI context because they acquired
bpf_spin_lock and called hrtimer/schedule_work APIs directly. This
patch removes these limitations.

Key changes:
 * Remove bpf_spin_lock from struct bpf_async_kern.
 * Initialize/Destroy via setting/unsetting bpf_async_cb pointer
   atomically.
 * Add per-bpf_async_cb irq_work to defer NMI-unsafe
   operations (hrtimer_start, hrtimer_try_to_cancel, schedule_work) from
   NMI to softirq context.
 * Use the lock-free seqcount_latch_t to pass operation
   commands (start/cancel/free) and parameters
   from NMI-safe callers to the irq_work handler.
 * Add reference counting to bpf_async_cb to ensure the object stays
   alive until all scheduled irq_work completes.
 * Move bpf_prog_put() to RCU callback to handle races between
   set_callback() and cancel_and_free().
 * Modify cancel_and_free() path:
   * Detach bpf_async_cb.
   * Signal destruction to irq_work side via setting last_seq to
     BPF_ASYNC_DESTROY.
   * On receiving BPF_ASYNC_DESTROY, cancel timer/wq.
 * Free bpf_async_cb when refcnt reaches 0, after waiting for both RCU and
   RCU tasks trace grace periods. Removed unnecessary RCU locks, as
   kfuncs/helpers always run under an RCU or RCU tasks trace lock.
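
The latch idea behind passing commands to the deferred handler can be sketched in user space as follows. This is a simplified, single-writer illustration and not the kernel's seqcount_latch_t API: struct cmd_latch, latch_write(), latch_read() and the ASYNC_OP_* names are hypothetical, and the sketch relies on sequentially consistent C11 atomics for the counter where the kernel uses explicit barriers. The slots themselves are plain memory, so a genuinely concurrent version would also need tear-free slot accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical operation codes and command layout. */
enum async_op { ASYNC_OP_NONE, ASYNC_OP_START, ASYNC_OP_CANCEL };

struct async_cmd {
	enum async_op op;
	uint64_t nsec;
};

/* Two copies of the data: readers always have one stable copy to use. */
struct cmd_latch {
	atomic_uint seq;
	struct async_cmd slot[2];
};

/* Writer (single writer assumed in this sketch). */
static void latch_write(struct cmd_latch *l, struct async_cmd cmd)
{
	atomic_fetch_add(&l->seq, 1); /* seq odd: readers use slot[1] */
	l->slot[0] = cmd;
	atomic_fetch_add(&l->seq, 1); /* seq even: readers use slot[0] */
	l->slot[1] = cmd;
}

/* Reader: never blocks on the writer, retries if seq moved underneath it. */
static struct async_cmd latch_read(struct cmd_latch *l)
{
	struct async_cmd cmd;
	unsigned int seq;

	do {
		seq = atomic_load(&l->seq);
		cmd = l->slot[seq & 1];
	} while (atomic_load(&l->seq) != seq);
	return cmd;
}

static unsigned int latch_demo(void)
{
	struct cmd_latch l;
	struct async_cmd cmd;

	atomic_init(&l.seq, 0);
	l.slot[0] = l.slot[1] = (struct async_cmd){ ASYNC_OP_NONE, 0 };

	latch_write(&l, (struct async_cmd){ ASYNC_OP_START, 1000 });
	cmd = latch_read(&l);
	assert(cmd.op == ASYNC_OP_START);
	assert(cmd.nsec == 1000);

	latch_write(&l, (struct async_cmd){ ASYNC_OP_CANCEL, 0 });
	cmd = latch_read(&l);
	assert(cmd.op == ASYNC_OP_CANCEL);

	return atomic_load(&l.seq); /* two writes, two increments each */
}
```

In this sketch the handler only ever sees the most recently published command, so a cancel issued after a start overrides it rather than queueing behind it.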

This enables BPF programs attached to NMI-context hooks (perf
events) to use timers and workqueues for deferred processing.

Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Reviewed-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Extend the verifier to recognize struct bpf_timer as a valid kfunc
argument type. Previously, bpf_timer was only supported in BPF helpers.

This prepares for adding timer-related kfuncs in subsequent patches.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Introduce the bpf_timer_cancel_async() kfunc, which attempts to cancel a
timer asynchronously and can therefore be used in NMI context.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Refactor timer selftests, extracting the stress test into a separate test.
This makes test failures easier to debug and the tests easier to extend.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Extend the BPF timer selftest to run a stress test for async cancel.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Add a test that verifies that bpf_timer_cancel_async() works: it can
cancel a callback successfully.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Add stress tests for BPF timers that run in NMI context using perf_event
programs attached to PERF_COUNT_HW_CPU_CYCLES.

The tests cover three scenarios:
- nmi_race: Tests concurrent timer start and async cancel operations
- nmi_update: Tests updating a map element (effectively deleting and
  inserting new for array map) from within a timer callback
- nmi_cancel: Tests timer self-cancellation attempt.

A common test_common() helper is used to share timer setup logic across
all test modes.

The tests spawn multiple threads in a child process to generate
perf events, which trigger the BPF programs in NMI context. Hit counters
verify that the NMI code paths were actually exercised.
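
For reference, a hardware cycles event of the kind mentioned above can be described from user space roughly as follows. This is a sketch, not the selftest code: cycles_attr_init() and cycles_event_open() are hypothetical helper names, and the choice of a fixed sample_period (rather than a sampling frequency) is an assumption. The resulting fd would then be handed to a BPF perf_event program, for example through libbpf's bpf_program__attach_perf_event().

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Describe a CPU-cycles hardware event that fires every `period` cycles. */
void cycles_attr_init(struct perf_event_attr *attr, unsigned long long period)
{
	assert(period > 0); /* a zero period would never sample */

	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_HARDWARE;
	attr->size = sizeof(*attr);
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = period;
}

/*
 * Open the event on one CPU for all tasks (pid == -1). Returns an fd,
 * or -1 with errno set. This usually needs elevated privileges and a
 * PMU that is visible to the machine, so it can fail in VMs/containers.
 */
int cycles_event_open(struct perf_event_attr *attr, int cpu)
{
	return (int)syscall(__NR_perf_event_open, attr,
			    -1 /* pid: any task */, cpu,
			    -1 /* group_fd: no group */, 0 /* flags */);
}
```

A caller would typically open one such event per CPU and treat an open failure as a reason to skip the test rather than fail it.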

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Now bpf_timer can be used in tracepoints, so these tests are no longer
relevant.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 2 times, most recently from 3a73c9c to aa9aae9 Compare January 27, 2026 17:22
@kernel-patches-daemon-bpf kernel-patches-daemon-bpf Bot force-pushed the bpf-next_base branch 3 times, most recently from cd8cbf1 to 358bea9 Compare January 28, 2026 02:41
@kernel-patches-daemon-bpf

Automatically cleaning up stale PR; feel free to reopen if needed
