Merged
59 changes: 59 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,59 @@
# Changelog

Notable and breaking changes for downstream projects. Versions are not
yet tagged; entries reference the merge PR.

## Unreleased

### Migration checklist (bumping across the 2026-06 hardening series)

1. **Add `sdkconfig.boreas` to your defaults list** (#41) — in your
project `CMakeLists.txt`, before `project.cmake`:
```cmake
set(SDKCONFIG_DEFAULTS "sdkconfig.defaults;components/boreas/sdkconfig.boreas")
```
Then regenerate config once: `rm -rf build sdkconfig && idf.py
set-target <target>`. A compile-time error names the file if missing.
2. **Audit `k_work_submit*` / `k_work_schedule*` / `k_work_reschedule*`
return checks** (#37) — success is no longer `0`. Upstream-parity
codes: `0` = no-op (already queued/scheduled), `1` = queued/armed,
`2` = was running and queued again, negative = error (`-ENODEV` for
an unstarted queue, previously `-EINVAL`). `< 0` error checks are
unaffected; `== 0` success checks must change.
3. **Audit boolean uses of `k_work_cancel`** (#38) — **polarity
inverted**: it now returns the remaining busy state (`int`), so old
`true` meant *cancelled* while new nonzero means *still busy*.
`if (k_work_cancel(&w))` flips meaning. `k_work_cancel_sync` now
returns `bool` was-pending (was `int 0`).
4. **Collapse triple-cancel workarounds** (#38) — self-rescheduling
delayables are now stopped reliably with one
`k_work_cancel_delayable_sync()` call; submission during a cancel
is rejected with `-EBUSY`.
5. **Do not use task-notification index 1** in application code (#41)
— reserved for zkernel blocking primitives.
6. **Do not abort threads blocked in `k_sem_take`** (#41) — see the
`@note` on `k_sem_take`; upstream unpends aborted threads, Boreas
cannot.
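
Items 2 and 3 of the checklist can be sketched as a few small predicates. This is a host-compilable illustration only: the helper names `work_rc_ok`, `work_rc_queued`, and `work_cancel_finished` are hypothetical and not part of Boreas, and the return-code meanings are taken directly from the checklist above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helpers (not part of Boreas). They only encode the
 * return-code meanings listed in the migration checklist. */

/* Item 2: any non-negative code from the submit/schedule/reschedule
 * family is success; only negative values are errors. */
static inline bool work_rc_ok(int rc)
{
    return rc >= 0;
}

/* Item 2: 1 (queued/armed) and 2 (was running, queued again) mean the
 * call changed state; 0 is the already-queued/scheduled no-op. */
static inline bool work_rc_queued(int rc)
{
    return rc > 0;
}

/* Item 3: k_work_cancel now returns the remaining busy state, so zero
 * means nothing is left pending or running. */
static inline bool work_cancel_finished(int busy_state)
{
    return busy_state == 0;
}

/* Why an old `== 0` success check breaks: code 1 is now the common
 * success value, and -ENODEV replaces -EINVAL for an unstarted queue. */
static inline void work_rc_example(void)
{
    assert(work_rc_ok(1));
    assert(!work_rc_ok(-ENODEV));
}
```

Under these assumptions, a call site that read `if (k_work_submit(&w) == 0)` would become `if (work_rc_ok(k_work_submit(&w)))`, and a branch that relied on `k_work_cancel(&w)` being true for "cancelled" would instead test for a zero result.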

### Changed

- **k_sem is notification-backed** (#41): no FreeRTOS control block;
`K_SEM_DEFINE` is a true compile-time initializer (usable from any
constructor); `k_sem_reset` wakes waiters with `-EAGAIN`;
`k_sem_give` wakes the highest-priority waiter; `k_sem_init` returns
`-EINVAL` for invalid limits. When `k_sem_take` returns, the kernel
holds no references into the caller's struct.
- **k_work cancel family enforces `K_WORK_CANCELING`** (#38) and
removes a queued-again-while-running instance on cancel.
- **k_work return codes match upstream Zephyr** (#37); the schedule
no-op window and schedule-while-running semantics now match upstream.
- **k_thread lifecycle honors the caller-owned-memory contract** (#18):
a returning entry function terminates the thread; `k_thread_join`
reclaims it (codes: `-EBUSY` no-wait, `-EDEADLK` self-join,
`-EAGAIN` timeout) and no longer false-joins suspended threads.
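
The `k_thread_join` codes listed above could be handled along these lines. The helper names are hypothetical, and treating `0` as success is an assumption based on upstream Zephyr rather than something this changelog states.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper (not part of Boreas): describe a k_thread_join
 * result. Assumes 0 means "joined", as in upstream Zephyr. */
static inline const char *join_rc_describe(int rc)
{
    switch (rc) {
    case 0:
        return "joined";
    case -EBUSY:
        return "still running (no-wait)";
    case -EDEADLK:
        return "self-join";
    case -EAGAIN:
        return "timed out";
    default:
        return "unexpected";
    }
}

/* -EBUSY and -EAGAIN only say the thread has not finished yet, so the
 * join may be retried later; -EDEADLK is a caller bug and never clears. */
static inline bool join_rc_is_retryable(int rc)
{
    return rc == -EBUSY || rc == -EAGAIN;
}
```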

### Fixed

- The 2026-04 stack-local `k_sem` scheduler corruption was root-caused
to the pre-#18 k_thread zombie defect and has been fixed since #18; the
trigger shapes are now permanent regression tests (#39, issue #21).
10 changes: 9 additions & 1 deletion README.md
@@ -21,11 +21,19 @@ Add Boreas to your ESP-IDF project as a git submodule or local path:
```bash
# As submodule
git submodule add https://github.com/intercreate/boreas.git components/boreas
```

# In your top-level CMakeLists.txt
```cmake
# In your top-level CMakeLists.txt (before project.cmake is included)
set(EXTRA_COMPONENT_DIRS components/boreas/components)
set(SDKCONFIG_DEFAULTS "sdkconfig.defaults;components/boreas/sdkconfig.boreas")
```

`sdkconfig.boreas` carries the configuration Boreas requires (with the
rationale documented inline); a compile-time error names it if missing.
After adding or updating it, regenerate your config once:
`rm -rf build sdkconfig && idf.py set-target <target>`.

Then include headers:

```c
/* ... */
```
42 changes: 37 additions & 5 deletions components/zkernel/README.md
@@ -25,8 +25,10 @@ this principle, and any divergence belongs in an `@note` on the declaration.

Current status: `k_thread` severs synchronously on silicon and documents the
linux best-effort window; `k_work`/`k_work_sync` unlink synchronously via
their own state machine; the k_timer linux backend dequeues synchronously on
stop; `k_sem`/`k_mutex`/`k_msgq`/`k_event` waiters are unlinked by FreeRTOS
their own state machine; `k_sem` is notification-backed (nothing of the
caller's memory ever enters a kernel list); the k_timer linux backend
dequeues synchronously on stop; `k_mutex`/`k_msgq`/`k_event` waiters are
unlinked by FreeRTOS
before the blocking call returns (a *blocked* caller's frame is necessarily
live, and the control-block updates that wake a waiter complete before the
waiter runs — but the giving/sending context may be preempted by the woken
@@ -80,9 +82,25 @@ k_sem_count_get(&sem);
|--------|---------|
| `0` | Success |
| `-EBUSY` | Not available (K_NO_WAIT) |
| `-EAGAIN` | Timeout expired |

**ISR-safe:** `k_sem_give` (uses `xSemaphoreGiveFromISR` automatically).
| `-EAGAIN` | Timeout expired, or the semaphore was reset while waiting |
| `-EINVAL` | `k_sem_init` with `limit == 0` or `initial > limit` |

**Notification-backed** (no FreeRTOS control block): the count and waiter
list live in the caller-owned struct; blocking rides direct-to-task
notifications on **reserved index 1** (requires
`CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES >= 2`, enforced at compile
time). Consequences:

- `K_SEM_DEFINE` is a true compile-time initializer (usable from any
constructor — upstream parity)
- `k_sem_reset` wakes all waiters with `-EAGAIN` (upstream parity)
- `k_sem_give` wakes the highest-priority waiter (upstream parity)
- when `k_sem_take` returns, the kernel holds zero references into the
caller's struct — the design principle above, by construction

**ISR-safe:** `k_sem_give` (uses `vTaskNotifyGiveIndexedFromISR`
automatically). Do not use task-notification index 1 directly in
application code.
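
The return codes in the table above can be exercised with a small host-side model. This is a sketch, not the Boreas implementation: it has no waiter list, no locking, and no blocking, and every `model_*` name is hypothetical. The saturating give is an assumption taken from upstream Zephyr behaviour.

```c
#include <assert.h>
#include <errno.h>

/* Host-side model only: NOT the Boreas implementation. */
struct model_sem {
    unsigned int count;
    unsigned int limit;
};

/* Mirrors the documented k_sem_init validation. */
static inline int model_sem_init(struct model_sem *sem, unsigned int initial,
                                 unsigned int limit)
{
    assert(sem != 0);
    if (limit == 0U || initial > limit) {
        return -EINVAL;
    }
    sem->count = initial;
    sem->limit = limit;
    return 0;
}

/* Mirrors k_sem_take(..., K_NO_WAIT): never blocks. */
static inline int model_sem_take_no_wait(struct model_sem *sem)
{
    if (sem->count == 0U) {
        return -EBUSY;
    }
    sem->count--;
    return 0;
}

/* Assumption (upstream Zephyr behaviour): a give at the limit is a no-op. */
static inline void model_sem_give(struct model_sem *sem)
{
    if (sem->count < sem->limit) {
        sem->count++;
    }
}

/* Models only the count side of k_sem_reset; waking waiters with
 * -EAGAIN is not modelled here. */
static inline void model_sem_reset(struct model_sem *sem)
{
    sem->count = 0U;
}
```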

## Mutex (`k_mutex`)

@@ -276,3 +294,17 @@ Ordered initialization. Entries are emplaced into the `.sys_init_entries` linker
| `CONFIG_ZKERNEL_SYS_INIT` | y | Enable SYS_INIT framework |
| `CONFIG_ZKERNEL_SYS_INIT_MAX_ENTRIES` | 32 | Max SYS_INIT registrations |
| `CONFIG_ZKERNEL_FATAL_CAPTURE` | n | Save fatal context to NVS |

Required configuration ships in **`sdkconfig.boreas`** at the repo root —
add it to your project's defaults list (before `project.cmake`):

```cmake
set(SDKCONFIG_DEFAULTS "sdkconfig.defaults;path/to/boreas/sdkconfig.boreas")
```

It currently sets `CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES=2` —
zkernel reserves task-notification index 1 for its blocking primitives;
index 0 stays free for ESP-IDF internals. A compile-time `#error` backstops
the requirement (Kconfig cannot set another component's int symbol
automatically: `select` is bool-only and cross-component int defaults lose
the parse-order race).
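
A compile-time backstop of the kind described above might look like the following. This is a sketch under stated assumptions: the error text and the `EXAMPLE_RESERVED_NOTIFY_INDEX` name are illustrative, the fallback `#define` exists only so the fragment builds on a host, and the actual check in Boreas is not shown in this diff.

```c
#include <assert.h>

/* Stand-in so the sketch builds on a host; in an ESP-IDF project the
 * value comes from the generated sdkconfig.h. */
#ifndef CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES
#define CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES 2
#endif

/* Hypothetical name for the reserved index described above. */
#define EXAMPLE_RESERVED_NOTIFY_INDEX 1

/* Preprocessor backstop: fails the build and names the missing file. */
#if CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES < 2
#error "Task-notification index 1 is reserved: add sdkconfig.boreas to SDKCONFIG_DEFAULTS"
#endif

/* The same requirement expressed as a C11 static assertion. */
_Static_assert(EXAMPLE_RESERVED_NOTIFY_INDEX <
                   CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES,
               "reserved notification index must fit in the array");

/* Runtime-visible form of the same condition, for illustration only. */
static inline int example_reserved_index_fits(void)
{
    return EXAMPLE_RESERVED_NOTIFY_INDEX <
           CONFIG_FREERTOS_TASK_NOTIFICATION_ARRAY_ENTRIES;
}
```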
95 changes: 47 additions & 48 deletions components/zkernel/include/boreas/zephyr/kernel.h
@@ -170,64 +170,57 @@ static inline void k_yield(void)
* Semaphore
* ---------------------------------------------------------------- */

/* Notification-backed (no FreeRTOS control block): count/limit/waiter
* list live here, guarded by the lock; blocking rides direct-to-task
* notifications on a reserved index, whose state is kernel-owned. See
* the README design principle -- when k_sem_take returns, the kernel
* holds no references into this struct. */
struct k_sem {
SemaphoreHandle_t handle;
StaticSemaphore_t buffer;
uint32_t count;
uint32_t limit;
sys_dlist_t waiters; /* of z_sem_waiter, caller-stack resident */
portMUX_TYPE lock;
};

/**
* Statically declare a semaphore and auto-initialize it with the
* given @p initial count and @p limit before main() runs.
*
* Implementation note: FreeRTOS counting semaphores require a runtime
* xSemaphoreCreateCountingStatic() call, so true compile-time init
* isn't possible. The macro emits a per-instance constructor that
* runs at startup.
*
* @note ESP-IDF Xtensa caveat: ESP-IDF iterates `.init_array` in
* DESCENDING order on Xtensa (see esp_system/startup.c
* do_global_ctors). Default-priority user constructors run
* BEFORE prioritized ones, and within a single TU the LAST-
* declared constructor runs first. Adding a constructor
* priority does not help -- prioritized constructors run
* AFTER unprioritized ones in the descending iteration. As a
* result the K_SEM_DEFINE'd sem is NOT guaranteed ready
* inside arbitrary user constructors on ESP-IDF Xtensa. It IS
* ready by:
* - app_main()
* - SYS_INIT() callbacks
* - constructors declared earlier in the same TU
* (textually) than the K_SEM_DEFINE
*
* For sems consumed from constructor context, prefer manual
* k_sem_init() in a SYS_INIT callback.
*
* @note Archive-stripping: when this macro expands inside a static
* archive, the constructor can be stripped by the linker
* unless pulled in via WHOLE_ARCHIVE (idf_component_register
* WHOLE_ARCHIVE). User application code typically isn't in
* such an archive; library/component code is.
* Statically define a fully-initialized semaphore. True compile-time
* initializer (matches upstream Zephyr): usable from constructors and
* SYS_INIT callbacks without any runtime init step.
*/
/* Indirection helpers so __LINE__ expands before token-pasting. The
* line-number-based ctor name avoids generating a file-scope
* identifier that begins with underscore (reserved by C11 7.1.3) and
* is robust to caller-supplied names that themselves begin with
* underscore. Caveat: two K_SEM_DEFINE on the same source line will
* collide; not expected in practice. */
#define K_SEM_CONCAT_(a, b) a##b
#define K_SEM_CONCAT(a, b) K_SEM_CONCAT_(a, b)
#define K_SEM_INIT_CTOR_NAME(line) K_SEM_CONCAT(k_sem_init_ctor_, line)

#define K_SEM_DEFINE(name, _initial, _limit) \
struct k_sem name = {0}; \
__attribute__((constructor)) static void K_SEM_INIT_CTOR_NAME(__LINE__)(void) \
{ \
k_sem_init(&name, (_initial), (_limit)); \
}
struct k_sem name = { \
.count = (_initial), \
.limit = (_limit), \
.waiters = SYS_DLIST_STATIC_INIT(&name.waiters), \
.lock = portMUX_INITIALIZER_UNLOCKED, \
}; \
BUILD_ASSERT(((_limit) != 0) && ((_initial) <= (_limit)), \
"K_SEM_DEFINE: limit must be nonzero and >= initial") /* upstream parity */

/**
* @retval 0 on success
* @retval -EINVAL if @p limit is zero or @p initial_count exceeds it
* (matches upstream)
*/
int k_sem_init(struct k_sem *sem, unsigned int initial_count, unsigned int limit);
/**
* @retval 0 on success
* @retval -EBUSY if K_NO_WAIT and the semaphore was unavailable
* @retval -EAGAIN on timeout, or if the semaphore was reset while
* waiting (matches upstream k_sem_reset semantics)
*
* @note Divergence: a thread blocked in k_sem_take must NOT be
* aborted (k_thread_abort / vTaskDelete). Upstream Zephyr
* unpends an aborted thread from any wait queue; Boreas cannot
* reach into the semaphore's waiter list from abort, so the
* dead thread would leave a dangling waiter node. The same
* applies to aborting a thread that is inside k_sem_give.
*/
int k_sem_take(struct k_sem *sem, k_timeout_t timeout);
void k_sem_give(struct k_sem *sem);
/** Zero the count and wake all waiters; their takes return -EAGAIN
* (upstream parity -- the previous FreeRTOS-backed implementation
* could only drain the count). */
void k_sem_reset(struct k_sem *sem);
unsigned int k_sem_count_get(struct k_sem *sem);
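
The reason the new `K_SEM_DEFINE` needs no constructor can be shown with a host-side sketch: an empty circular list points at itself, and the address of a static object is a constant expression. All `model_*` and `MODEL_*` names below are hypothetical stand-ins, the lock field is omitted, and `_Static_assert` stands in for `BUILD_ASSERT`.

```c
#include <assert.h>

/* Minimal stand-ins for the kernel types; not the Boreas definitions. */
struct model_dlist {
    struct model_dlist *head;
    struct model_dlist *tail;
};

struct model_static_sem {
    unsigned int count;
    unsigned int limit;
    struct model_dlist waiters;
};

/* An empty list refers to itself, so it can be initialized statically. */
#define MODEL_DLIST_STATIC_INIT(list_ptr) { (list_ptr), (list_ptr) }

/* Parameter is named `lim` so it does not rewrite the `.limit` designator. */
#define MODEL_SEM_DEFINE(name, initial, lim)                       \
    struct model_static_sem name = {                               \
        .count = (initial),                                        \
        .limit = (lim),                                            \
        .waiters = MODEL_DLIST_STATIC_INIT(&name.waiters),         \
    };                                                             \
    _Static_assert(((lim) != 0) && ((initial) <= (lim)),           \
                   "limit must be nonzero and >= initial")

/* Fully initialized at load time; no startup code runs for it. */
MODEL_SEM_DEFINE(example_sem, 1, 4);

static inline int model_sem_has_no_waiters(const struct model_static_sem *sem)
{
    return sem->waiters.head == &sem->waiters;
}
```

Because the object is complete before any code runs, constructor ordering no longer matters for it, which is the property the replaced doc comment had to warn about.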

@@ -851,6 +844,12 @@ void k_thread_name_set(struct k_thread *thread, const char *name);
* runners, not guaranteed under idle starvation). Upstream
* Zephyr's k_thread_abort does not block this way; on silicon
* reclamation is synchronous and does not block.
*
* @note Divergence: do not abort a thread that is blocked in
* k_sem_take (or inside k_sem_give) -- upstream unpends aborted
* threads from wait queues; Boreas cannot reach into the
* notification-backed semaphore's waiter list from here (see
* the @note on k_sem_take).
*/
void k_thread_abort(struct k_thread *thread);
void k_thread_suspend(struct k_thread *thread);