Describe the bug
vPortRecursiveLock() in portable/ThirdParty/GCC/RP2040/include/portmacro.h (current main)
acquires the SIO task lock on the uncontended fast path via spin_try_lock_unsafe() — a bare
SIO register read (return *lock;) with no memory ordering. The acquire barrier
(__mem_fence_acquire() == __dmb()) lives only inside spin_lock_unsafe_blocking(), i.e. on the
contended path. When spin_try_lock_unsafe() succeeds (the common case), control falls straight
through to the lock-protected bookkeeping with no acquire barrier, so the new holder has no
happens-before edge to the previous holder's critical-section writes and may observe stale state
under weak memory.
portGET_TASK_LOCK drives taskENTER_CRITICAL(), so this affects every kernel structure read after
the lock is taken — it surfaces first on the lock's own ucRecursionCountByLock. This is a C11 data
race (undefined behavior); the access has no happens-before regardless of whether a given core
actually reorders.
Code (current main):
if (uxAcquire) {
if (!spin_try_lock_unsafe(pxSpinLock)) { // fast path: bare register read, NO ordering
if (ucOwnedByCore[xCoreID][ulLockNum]) { ... return; } // recursive
spin_lock_unsafe_blocking(pxSpinLock); // contended: spins, THEN __mem_fence_acquire()
}
/* FAST PATH falls through here with NO acquire barrier */
configASSERT(ucRecursionCountByLock[ulLockNum] == 0); // reads protected metadata
ucRecursionCountByLock[ulLockNum] = 1;
ucOwnedByCore[xCoreID][ulLockNum] = 1;
}
This is a portability/correctness defect against the weak-memory contract the SIO spinlock targets —
present even on a core whose in-order pipeline happens to mask the reorder (see the To Reproduce note).
Target
- Development board: Raspberry Pi Pico (RP2040, dual Cortex-M0+)
- Instruction Set Architecture: Armv6-M
- IDE and version: N/A — analysis against upstream source, not a board build
- Toolchain and version: N/A — defect identified by source review + formal memory-model checking
Host
- Host OS: Linux
- Version: N/A
To Reproduce
Note: this was found and verified at the memory-model level, not by observing a failure on
physical RP2040 hardware. RP2040 is in-order (Cortex-M0+), so the reorder may be masked in practice;
the C11 data race (UB) is present regardless.
Tools: GenMC v0.17.0 (RC11) and herd7 (AArch64 model).
bash harness/run_rq1.sh # GenMC: faithful=race, broken=race, fix=clean (3/3)
herd7 litmus/RQ1_handoff_noacq.litmus # Observation Sometimes 1 3 (stale read allowed)
herd7 litmus/RQ1_handoff_acq.litmus # Observation Never 0 3 (forbidden with the fix)
Minimal handoff litmus — writer does STR data; DMB; STR lock=0, reader does LDR lock; LDR data
with no acquire between, checking exists (reader sees lock==0 /\ data==stale). GenMC
counterexample: core0's acquiring CAS reads-from core1's release (mutual exclusion holds), but with
no trailing acquire fence it does not synchronize-with the release, so core1's critical-section write
is not happens-before core0's read → race.
Expected behavior
The lock holder acquired via the fast path should have an acquire barrier before reading
lock-protected state, establishing happens-before with the previous holder's release — the same
ordering the contended path already gets from __mem_fence_acquire().
Screenshots
N/A.
Additional context
Fix is one line, self-contained in the FreeRTOS port — no pico-sdk change required
(__mem_fence_acquire() is already used in this same file inside spin_lock_unsafe_blocking()):
spin_lock_unsafe_blocking(pxSpinLock);
}
__mem_fence_acquire(); /* fast path carries no acquire ordering */
configASSERT(ucRecursionCountByLock[ulLockNum] == 0);
Unconditional, so the contended path gets one redundant (harmless) __dmb(). A no-redundancy
variant puts the fence in an else of the try-lock if (fast path only). Equivalent root-cause
alternatives, if preferred: give spin_try_lock_unsafe() acquire semantics in pico-sdk, or add a
fencing try-lock variant.
I've verified both the defect and the fix locally with GenMC (RC11) and herd7 (AArch64). Happy to
submit a PR for the portmacro.h fast-path fence — or defer to a pico-sdk fix if you prefer that
layer. Full write-up + counterexample trace available on request.
Describe the bug
vPortRecursiveLock()inportable/ThirdParty/GCC/RP2040/include/portmacro.h(currentmain)acquires the SIO task lock on the uncontended fast path via
spin_try_lock_unsafe()— a bareSIO register read (
return *lock;) with no memory ordering. The acquire barrier(
__mem_fence_acquire()==__dmb()) lives only insidespin_lock_unsafe_blocking(), i.e. on thecontended path. When
spin_try_lock_unsafe()succeeds (the common case), control falls straightthrough to the lock-protected bookkeeping with no acquire barrier, so the new holder has no
happens-before edge to the previous holder's critical-section writes and may observe stale state
under weak memory.
portGET_TASK_LOCKdrivestaskENTER_CRITICAL(), so this affects every kernel structure read afterthe lock is taken — it surfaces first on the lock's own
ucRecursionCountByLock. This is a C11 datarace (undefined behavior); the access has no happens-before regardless of whether a given core
actually reorders.
Code (current
main):This is a portability/correctness defect against the weak-memory contract the SIO spinlock targets —
present even on a core whose in-order pipeline happens to mask the reorder (see the To Reproduce note).
Target
Host
To Reproduce
Tools: GenMC v0.17.0 (RC11) and herd7 (AArch64 model).
Minimal handoff litmus — writer does
STR data; DMB; STR lock=0, reader doesLDR lock; LDR datawith no acquire between, checking
exists (reader sees lock==0 /\ data==stale). GenMCcounterexample: core0's acquiring CAS reads-from core1's release (mutual exclusion holds), but with
no trailing acquire fence it does not synchronize-with the release, so core1's critical-section write
is not happens-before core0's read → race.
Expected behavior
The lock holder acquired via the fast path should have an acquire barrier before reading
lock-protected state, establishing happens-before with the previous holder's release — the same
ordering the contended path already gets from
__mem_fence_acquire().Screenshots
N/A.
Additional context
Fix is one line, self-contained in the FreeRTOS port — no pico-sdk change required
(
__mem_fence_acquire()is already used in this same file insidespin_lock_unsafe_blocking()):Unconditional, so the contended path gets one redundant (harmless)
__dmb(). A no-redundancyvariant puts the fence in an
elseof the try-lockif(fast path only). Equivalent root-causealternatives, if preferred: give
spin_try_lock_unsafe()acquire semantics in pico-sdk, or add afencing try-lock variant.
I've verified both the defect and the fix locally with GenMC (RC11) and herd7 (AArch64). Happy to
submit a PR for the
portmacro.hfast-path fence — or defer to a pico-sdk fix if you prefer thatlayer. Full write-up + counterexample trace available on request.