fix: k_timer_status_sync blocks on an embedded sem, not a poll loop (#28) by swoisz · Pull Request #49 · intercreate/boreas

swoisz · 2026-06-07T18:53:17Z

Summary

Rewrites k_timer_status_sync (both the esp_timer and linux backends) from a k_msleep(1) busy-poll to a true blocking wait on a binary struct k_sem embedded in struct k_timer. This was the original design intent; it is finally safe because the corruption that killed the earlier attempt was root-caused to the pre-#18 k_thread zombies (#21) — those exact crash shapes are now green regression tests — and the semaphore itself is notification-backed (#41) with synchronous reference severance.

Wakes are now immediate instead of quantized to the FreeRTOS tick period, and the busy-poll divergence note on the declaration is removed.

Upstream-verified semantics (kernel/timer.c)

Single-waiter model — upstream's wait_q holds "the (single) thread waiting on this timer", and expiry/stop wake exactly one. A binary sem latch matches this exactly (not an approximation).
Woken by expiry OR stop; the wake fires after the expiry callback / stop function (upstream order).
status is re-read after the wake and reset to 0 on every return path; returns the count immediately if already non-zero, or 0 immediately if the timer is stopped (no block).
The sem is a wake latch, not a counter — the expiry count stays in timer->status. A give latched while no waiter was blocked would satisfy a later take early, so status_sync re-checks and re-blocks in a loop.

k_sem_give is ISR-safe (IRAM), so the expiry-path give works under CONFIG_K_TIMER_DISPATCH_ISR=y, the ESP_TIMER_TASK fallback, and the linux dispatcher. K_TIMER_DEFINE gains a compile-time sem initializer; k_work_init_delayable inherits the sem init through k_timer_init. struct k_timer grows by sizeof(struct k_sem) (~24 B), including inside k_work_delayable.

k_timer_remaining_get needed no change — it already returns uint32_t milliseconds, matching upstream's k_ticks_to_ms_floor32 shape.

Review fan-out (boreas-review + adversarial tracer)

One real semantic gap, fixed: between status_sync's status exchange and its running load, a one-shot expiry could complete entirely (status=1, running=false, give latched) and the waiter returned 0 where upstream's spinlocked read returns 1 — terminally lost for a one-shot in a while (k_timer_status_sync()) drain loop. The !running path now re-reads status instead of returning 0; the callback orders status++ (RELEASE) before running=false (RELEASE), so an ACQUIRE load observing running==false is guaranteed to observe the increment.

Adversarial probes refuted (no change needed):

Zeroed-sem gives before k_timer_init are structurally unreachable (the running guard + handle==NULL guard prevent every give/expiry path).
Stale latches cost at most one extra loop iteration — the binary sem caps at 1, no busy-spin.
SMP ordering is sound (RELEASE-before-give, portMUX barriers in k_sem, aligned 32-bit status).

Doc + hygiene:

@note on k_timer_start: a restart does not wake a blocked status_sync waiter (it waits for the restarted timer's first expiry — upstream parity).
@note on k_timer_init: re-init while a status_sync waiter is blocked is caller error (clobbers the embedded sem's waiter list).
Z_SEM_INITIALIZER single-sources the compile-time k_sem initializer for K_SEM_DEFINE and K_TIMER_DEFINE.

Test plan

linux: 206/0 ×3 (204 + 2 new tests)
ESP32-S3: 235/0 on hardware (233 + 2), CONFIG_K_TIMER_DISPATCH_ISR=y verified in the regenerated sdkconfig
New tests: test_timer_status_sync_after_status_get_blocks (stale-latch regression, made deterministic — one-shot + restart with t0 captured before the restart, so the elapsed bound holds under arbitrary host stall), test_timer_status_sync_stopped_returns_immediately
clang-format 21.1.8 clean

Closes #28

🤖 Generated with Claude Code

Rewrites k_timer_status_sync (both backends) from the k_msleep(1) busy-poll to a true blocking wait on a binary k_sem embedded in struct k_timer -- the original PR-4 design, now safe: the April crash shapes that killed it were root-caused to the pre-#18 k_thread zombies (#21) and are green regression tests, and the sem itself is notification-backed (#41) with synchronous severance. Upstream-verified semantics (kernel/timer.c): - single-waiter model (upstream's wait_q holds "the (single) thread waiting on this timer"; expiry/stop wake exactly one) -> a binary sem latch matches exactly - woken by expiry OR stop; status re-read after wake; status reset to 0 on every path; returns 0 immediately when stopped - the wake fires AFTER the expiry callback / stop_fn (upstream order) The sem is a wake LATCH, not a counter -- the expiry count stays in timer->status. A give latched while nobody waited would satisfy a later take early, so status_sync re-checks and re-blocks in a loop; pinned by the new test_timer_status_sync_after_status_get_blocks regression. Wakes are now immediate instead of quantized to the FreeRTOS tick; the busy-poll divergence note on the declaration is replaced. k_sem_give is ISR-safe (IRAM), so the expiry-path give works under CONFIG_K_TIMER_DISPATCH_ISR=y, the ESP_TIMER_TASK fallback, and the linux dispatcher. K_TIMER_DEFINE gains the compile-time sem initializer; k_work_init_delayable inherits the sem init via k_timer_init. struct k_timer grows by sizeof(struct k_sem) (~24 B), including inside k_work_delayable. k_timer_remaining_get needed no change: it already returns uint32_t milliseconds like upstream's k_ticks_to_ms_floor32 shape. linux suite: 206/0 x3 (204 + 2). Closes #28 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@note

Review fan-out findings folded: - One-shot exchange-vs-running window closed: between status_sync's status exchange and its running load, a one-shot expiry could complete entirely (status=1, running=false, give latched) and the waiter returned 0 where upstream's spinlocked read returns 1 -- terminally lost for a one-shot in a while(status_sync()) drain loop. The !running path now re-reads status instead of returning 0; the callback orders status++ (RELEASE) before running=false (RELEASE), so an ACQUIRE load of running==false is guaranteed to observe the increment. - test_timer_status_sync_after_status_get_blocks made deterministic: one-shot expiries only and t0 captured BEFORE the restart, so the elapsed lower bound holds under arbitrary host stall (the periodic re-arm racing the entry was CI-flaky both directions). - @note on k_timer_start: restart does not wake a blocked status_sync waiter (upstream parity). - @note on k_timer_init: re-init while a status_sync waiter is blocked is caller error (clobbers the embedded sem's waiter list). - Z_SEM_INITIALIZER single-sources the compile-time k_sem initializer body for K_SEM_DEFINE and K_TIMER_DEFINE. Adversarial review refuted (no change needed): zeroed-sem gives are structurally unreachable before k_timer_init (running-guard + handle-guard); stale latches cost at most one extra loop iteration (binary sem caps at 1, no busy-spin); SMP ordering is sound (RELEASE-before-give, portMUX barriers, aligned 32-bit status). linux suite: 206/0 x3. Refs #28 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR rewrites k_timer_status_sync() to block on an embedded binary k_sem latch (instead of k_msleep(1) polling), aligning Boreas’ timer wait behavior more closely with upstream Zephyr semantics while improving wake latency and avoiding tick-quantized busy polling.

Changes:

Implement sem-backed k_timer_status_sync() in both ESP and Linux timer backends and wake waiters on expiry/stop.
Extend struct k_timer with an embedded sync_sem and add Z_SEM_INITIALIZER to support compile-time initialization (including K_TIMER_DEFINE).
Add regression tests covering stale-latch behavior and “stopped timer returns immediately” behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
`test/main/test_k_timer.c`	Adds regression tests for stale latch handling and immediate return when stopped.
`components/zkernel/src/k_timer.c`	ESP backend: gives `sync_sem` on expiry/stop and replaces poll loop with sem-backed wait loop.
`components/zkernel/src/k_timer_linux.c`	Linux backend: mirrors sem-backed status_sync and waiter wakes on expiry/stop.
`components/zkernel/include/boreas/zephyr/kernel.h`	Adds `sync_sem` to `k_timer`, introduces `Z_SEM_INITIALIZER`, updates docs for `k_timer_status_sync`.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Upstream unpends aborted threads from the timer wait queue; Boreas cannot, so this restriction is a divergence, not parity -- matching the wording already on k_sem_take. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

swoisz and others added 2 commits June 7, 2026 12:11

swoisz requested a review from Copilot June 7, 2026 18:54

Copilot started reviewing on behalf of swoisz June 7, 2026 18:54 View session

Copilot AI reviewed Jun 7, 2026

View reviewed changes

Comment thread components/zkernel/include/boreas/zephyr/kernel.h Outdated

swoisz merged commit 0d1045f into main Jun 8, 2026
5 checks passed

swoisz deleted the fix/k-timer-status-sync-blocking branch June 8, 2026 01:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: k_timer_status_sync blocks on an embedded sem, not a poll loop (#28)#49

fix: k_timer_status_sync blocks on an embedded sem, not a poll loop (#28)#49
swoisz merged 3 commits into
mainfrom
fix/k-timer-status-sync-blocking

swoisz commented Jun 7, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

swoisz commented Jun 7, 2026

Summary

Upstream-verified semantics (kernel/timer.c)

Review fan-out (boreas-review + adversarial tracer)

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants