bench(engine): overhead compensation + SCOPE.md (rigor pass items 2 + 7)#36

Open
avrabe wants to merge 1 commit into main from chore/bench-overhead-compensation-and-scope

@avrabe avrabe commented May 3, 2026

Summary

Two parallel items from the bench-rigor work order, both engine_control-scoped. No Rust source touched outside the bench. No changes to flight_control (separate pass).

Item 2 — overhead compensation

`measure_overhead()` runs at boot under `irq_lock`, takes the median of 1000 empty `k_cycle_get_32()`-pair measurements, and stores it as `bench_overhead_cycles`. Every `algo` and `handoff` count emitted to the CSV is the raw measurement minus that constant (saturating at 0).
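
The arithmetic described above can be modelled off-target. The following is a minimal Python sketch, not the firmware code: `read_cycles` stands in for `k_cycle_get_32()`, the function names and the fake counter are hypothetical, a 32-bit wrapping counter is assumed, and interrupt locking is omitted because it has no host-side equivalent.

```python
import statistics

COUNTER_MASK = 0xFFFFFFFF  # assumption: 32-bit free-running cycle counter


def measure_overhead(read_cycles, samples=1000):
    """Median cost of an empty back-to-back pair of counter reads."""
    deltas = []
    for _ in range(samples):
        start = read_cycles()
        end = read_cycles()
        # Masking keeps the difference correct across a counter wrap.
        deltas.append((end - start) & COUNTER_MASK)
    return int(statistics.median(deltas))


def compensate(raw_cycles, overhead_cycles):
    """Saturating subtraction: never report a negative cycle count."""
    return max(raw_cycles - overhead_cycles, 0)


def make_fake_counter(step=7):
    """Hypothetical stand-in counter that advances `step` cycles per read."""
    state = {"now": 0}

    def read_cycles():
        state["now"] = (state["now"] + step) & COUNTER_MASK
        return state["now"]

    return read_cycles
```

The median is used rather than the mean so that a single outlier pair does not shift the constant.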

The compensation is visible, not silent:

  • Firmware emits `overhead_cycles,<value>` in the CSV header.
  • `tag_events.py` and the synth-workflow inline heredocs tag it through as `M,R<run>,<variant>,overhead_cycles,<value>` (5 inline-tag tuples updated in `engine-bench-renode-synth.yml`).
  • `analyze.py` parses it into `Meta.overhead_cycles[run]` and surfaces it in the report header: "Overhead subtracted (cycles): baseline R1=N; gale R1=M".
  • README has a new "Framework overhead compensation" section explaining the matching upstream `ztest_bench` pattern.
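
To illustrate how the tagged metadata line could flow into the report header, here is a small Python sketch. It assumes the field layout `M,R<run>,<variant>,overhead_cycles,<value>` given in the commit message; the function names and the dictionary shape are hypothetical and are not taken from `analyze.py`.

```python
def parse_overhead_lines(lines):
    """Collect overhead_cycles metadata as {variant: {run: cycles}}."""
    overhead = {}
    for line in lines:
        fields = line.strip().split(",")
        # Expected shape: M,R<run>,<variant>,overhead_cycles,<value>
        if len(fields) != 5 or fields[0] != "M" or fields[3] != "overhead_cycles":
            continue
        run, variant, value = fields[1], fields[2], int(fields[4])
        overhead.setdefault(variant, {})[run] = value
    return overhead


def format_overhead_header(overhead):
    """Render 'Overhead subtracted (cycles): baseline R1=N; gale R1=M'."""
    parts = []
    for variant in sorted(overhead):
        runs = " ".join(
            f"{run}={cycles}" for run, cycles in sorted(overhead[variant].items())
        )
        parts.append(f"{variant} {runs}")
    return "Overhead subtracted (cycles): " + "; ".join(parts)
```

Lines that do not match the metadata shape are skipped, so the same input stream can also carry the per-event rows.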

Reviewers can audit the subtraction and add the constant back to recover the raw number. This is the same idiom as the `ctrl` benchmark in Zephyr 4.4's ztest_bench (`subsys/testsuite/ztest/benchmark/`).
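
Adding the constant back is a single addition. A hedged Python sketch follows (`recover_raw` is a hypothetical helper, not part of the tooling). One caveat follows directly from the saturating subtraction: a compensated value of 0 may have been clamped, in which case the raw count is only known to be at most `overhead_cycles`.

```python
def recover_raw(compensated_cycles, overhead_cycles):
    """Return (raw_estimate, exact) for one compensated cycle count.

    A positive compensated value recovers the raw count exactly. A value of 0
    may have been clamped, so the estimate is only an upper bound and
    `exact` is False.
    """
    raw = compensated_cycles + overhead_cycles
    exact = compensated_cycles > 0
    return raw, exact
```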

Pre-compensation and post-compensation numbers are different measurements. Both the README and SCOPE.md call this out explicitly — do not combine them in the same comparison table.
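
One way a report script could enforce this rule is sketched below in Python. The row shape is an assumption made for illustration: each row carries a hypothetical `overhead_cycles` key that is an integer for compensated runs and `None` for runs that predate compensation.

```python
def check_same_compensation(rows):
    """Raise if a comparison mixes compensated and uncompensated rows."""
    compensated = {row["overhead_cycles"] is not None for row in rows}
    if len(compensated) > 1:
        raise ValueError("comparison mixes pre- and post-compensation runs")
    return rows
```

Calling this before rendering a comparison table turns the convention into a hard failure rather than a review comment.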

Item 7 — `benches/engine_control/SCOPE.md` (new)

Source of truth for what the bench measures, what it does NOT measure, and what kind of evidence its numbers constitute. Subsequent published copy takes its language from there.

Explicit non-claims:

  • Peripheral contention, DMA-driven I/O — none
  • SMP / multi-core — single-CPU only; `gale_spinlock` is shipped but its hazard is not exercised here
  • WCET — explicitly out of scope. Establishing WCET requires AbsInt aiT, Rapita RapiTime, or OTAWA. Worst-case-observed numbers (work item 6, deferred) must be labelled `worst_observed`, never `wcet`. Not negotiable.
  • Power consumption, memory pressure (planned via item 5), fault tolerance, long-duration drift
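
The `worst_observed` versus `wcet` labelling rule lends itself to a mechanical check. The Python sketch below is one possible guard, assuming report columns are plain string labels; `check_column_labels` and `FORBIDDEN_LABELS` are hypothetical names, not part of `analyze.py`.

```python
FORBIDDEN_LABELS = {"wcet"}  # measured maxima are observations, not proofs


def check_column_labels(labels):
    """Reject column labels that would present an observation as WCET."""
    bad = [label for label in labels if label.strip().lower() in FORBIDDEN_LABELS]
    if bad:
        raise ValueError(f"use 'worst_observed' instead of: {', '.join(bad)}")
    return list(labels)
```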

The file ends with the recommended one-paragraph framing for blog posts and a list of when to update SCOPE.md.

Test plan

  • Firmware compiles cleanly (verified by inspection — local Zephyr workspace is in a non-buildable state for unrelated Kconfig reasons)
  • `analyze.py` parses (verified)
  • `tag_events.py` updated for the `overhead_cycles,<value>` line
  • CI produces the new compensation-aware Renode reference baseline — this is what the workflow will produce when this PR's bench runs land. The numbers under this PR are the new reference; do not compare them against pre-compensation numbers in the same table.

Out of scope (deferred per the work order)

  • Item 1 (real-silicon anchor) — blocked on hardware delivery; will be run with item 2 already merged so silicon numbers are compensation-aware from the first measurement.
  • Item 3 (scipy cross-check on MW-U / bootstrap) — deferred to user; statistical-method judgment call.
  • Item 5 (`CONFIG_THREAD_ANALYZER` stack high-water mark) — after item 1.
  • Item 4 (`ZTEST_BENCHMARK` triangulation arm) — after item 5.
  • Item 6 (worst-case-observed paths) — after item 4. Will respect the `worst_observed` ≠ `wcet` labelling rule from SCOPE.md.

🤖 Generated with Claude Code

Item 2 — overhead compensation
==============================

Every algo / handoff cycle count emitted to the CSV stream now has a
constant `bench_overhead_cycles` subtracted. The constant is the
median of 1000 empty `k_cycle_get_32()`-pair measurements taken at
boot under irq_lock, before any per-event timing begins. Saturating
subtraction at 0 — never report a negative cycle count.

The compensation is **visible**, not silent:
- Emitted as `overhead_cycles,<value>` in the CSV header.
- Tagged through `tag_events.py` and the synth-workflow inline
  heredocs as `M,R<run>,<variant>,overhead_cycles,<value>`.
- Parsed by `analyze.py` into `Meta.overhead_cycles[run]` and
  surfaced in the report header as
  "Overhead subtracted (cycles): baseline ...; gale ...".

Reviewers can audit the subtraction step and add the constant back
to recover the raw numbers if needed. Same idiom as Zephyr 4.4
ztest_bench's `ctrl` benchmark pattern (see
subsys/testsuite/ztest/benchmark/).

**Pre-compensation and post-compensation numbers are different
measurements.** The README and SCOPE.md both call this out
explicitly. Do not combine them in the same comparison table.

Item 7 — SCOPE.md
=================

New file `benches/engine_control/SCOPE.md` is the **source of truth**
for what the bench measures, what it does NOT measure, and what
kind of evidence its numbers constitute. Subsequent published copy
imports language from there.

Explicit non-claims include peripheral contention, DMA-driven I/O,
SMP/multi-core, **WCET** (which requires static analysis tooling
like AbsInt aiT, Rapita RapiTime, or OTAWA — explicitly out of
scope), power consumption, memory pressure, fault tolerance, and
long-duration drift.

The WCET distinction is unambiguous and not negotiable in published
copy: an observation is not a proof. Worst-case-observed numbers,
when added later under work item 6, must be labeled as
`worst_observed`, never `wcet`.

Local build verification
========================

The local Zephyr workspace at /Users/r/git/pulseengine/z is in a
state that doesn't currently build (a Kconfig env var for
`hal_espressif` is unset, unrelated to these changes). CI will
produce the new compensation-aware Renode reference baseline when
this branch lands.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov Bot commented May 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
