diff --git a/benches/engine_control/README.md b/benches/engine_control/README.md index 1a3db31..3525655 100644 --- a/benches/engine_control/README.md +++ b/benches/engine_control/README.md @@ -46,6 +46,22 @@ This replaces the in-firmware histogram+mean approach whose mean divisor (reader `count`) diverged from the numerator (ISR event sum) when the sweep truncated early, invalidating the published deltas. +## Silicon-anchor protocol + +Renode is the CI workhorse; **silicon captures are manual**, periodic, +and recorded directly into the repo as immutable evidence. See +[`silicon/README.md`](silicon/README.md) for the procedure, board +notes, and the `capture.sh` wrapper. Per-board configs live under +`silicon/boards/`; recorded captures land under `silicon/runs//` +with a manifest, the firmware ELF, and the tagged events CSV. + +The first supported board is the NUCLEO-G474RE (STM32G474, Cortex-M4F ++ FPU, 170 MHz) — closest production-shape silicon to the +`stm32f4_disco` Renode target. The ratio `silicon_median / +renode_median` per RPM step is what the anchor establishes; once +consistent across multiple captures it can be cited as the +Renode-silicon multiplier. + ## Building ```sh diff --git a/benches/engine_control/silicon/README.md b/benches/engine_control/silicon/README.md new file mode 100644 index 0000000..3428ad9 --- /dev/null +++ b/benches/engine_control/silicon/README.md @@ -0,0 +1,145 @@ +# Silicon-anchor protocol — engine_control + +CI runs Renode (deterministic, parallel-safe). **Silicon runs are +manual**, periodic, and hand-driven on a single shared board. +This directory contains the protocol for taking a silicon capture, +recording it as immutable evidence in the repo, and citing it as +the anchor for Renode-headlined published numbers. + +## Why + +Renode is per-translated-block instruction-cost simulation, not +microarchitectural simulation: no cache, no memory contention, no +pipeline modeling. 
The cross-Renode A/B (1.16.0 vs nightly = 0.0% +drift) ruled out simulator-version drift but did NOT rule out +Renode being systematically off vs real silicon by a fixed +multiplier. The silicon anchor settles that. + +The relationship `silicon_cycles / renode_cycles = R` is what the +silicon anchor establishes. Once `R` is consistent across +multiple silicon captures over time, it can be cited as the +Renode-silicon multiplier for that bench/board combination. + +## Recorded-run-in-git protocol + +Every silicon run lives in `silicon/runs/---/` +and contains: + +- `output.csv` — the raw UART capture (firmware-emitted) +- `events.csv` — same data, tagged through `tag_events.py` +- `manifest.txt` — board, variant, sweep, gale and Zephyr commit + SHAs, host and rustc/cargo/west versions, ELF and CSV sha256, + capture timestamp +- `firmware.elf` — the exact binary that produced the capture +- `firmware.elf.sha256` — checksum file + +These directories are **immutable** once committed. To re-run the +same capture, create a new dated directory; never overwrite an +existing one. This makes any silicon citation in a blog post or +report point to a stable git URL. + +Captures are small (~50–500 KB per run; ~7,750 rows for a long +sweep). At one capture per board per major bench-relevant commit, +the repo growth is modest. 
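
Since committed run directories are meant to be immutable, anyone citing one may want to confirm that `firmware.elf` still matches its recorded checksum. The following is a minimal sketch, assuming the checksum file holds the hex digest as its first whitespace-separated field (the shape `capture.sh` writes). The function name `verify_run_dir` is hypothetical and not part of the repo.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def verify_run_dir(run_dir: str) -> bool:
    """Return True if firmware.elf matches the digest in firmware.elf.sha256.

    Hypothetical helper: assumes the first whitespace-separated field of
    firmware.elf.sha256 is the hex SHA-256 digest of firmware.elf.
    """
    run = Path(run_dir)
    recorded = (run / "firmware.elf.sha256").read_text().split()[0].lower()
    actual = hashlib.sha256((run / "firmware.elf").read_bytes()).hexdigest()
    return recorded == actual
```

The check reads the whole ELF into memory, which is acceptable for firmware images of this size (the target has 512 KB of flash).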
+ +## Boards + +| Board | Status | Anchors | +|---|---|---| +| `nucleo_g474re` (STM32G474RE, Cortex-M4F, 170 MHz) | scaffold ready | the existing Renode `stm32f4_disco` Cortex-M numbers | +| `esp32c3_devkit_rust1` (ESP32-C3, RV32IMC, 160 MHz) | not started | the *future* RISC-V Renode lane (separate work) | + +## Capture procedure (NUCLEO-G474RE) + +Hardware: +- Board: STMicroelectronics NUCLEO-G474RE +- Connection: USB to host (ST-Link integrated, virtual COM port at 115200 8N1) +- Programming: `west flash` via OpenOCD or pyOCD (ST-Link backend) + +Host setup (one-time): +- Zephyr SDK with `arm-zephyr-eabi` toolchain +- `ZEPHYR_BASE` exported in the environment (`capture.sh` aborts without it) +- OpenOCD or pyOCD installed (`brew install open-ocd` on macOS, or `apt install openocd`) +- Python with `pyserial` for the capture script: `pip3 install pyserial` + +To take a baseline capture (stock Zephyr): + +```sh +cd $GALE_ROOT +bash benches/engine_control/silicon/capture.sh \ + --board nucleo_g474re \ + --variant baseline \ + --sweep long +``` + +To take a gale capture: + +```sh +bash benches/engine_control/silicon/capture.sh \ + --board nucleo_g474re \ + --variant gale \ + --sweep long +``` + +Both invocations: + +1. Build the firmware locally (no Bazel; `west build -b `). +2. Compute the firmware ELF sha256. +3. Flash via `west flash`. +4. Open the board's USB CDC serial port and read until `=== END ===` + (timeout: 30 minutes for `--sweep long`, 2 minutes for the + default `--sweep short`). +5. Tag the raw output through `tag_events.py` (run-id auto-derived + from the date + board). +6. Generate `manifest.txt` from the build environment + capture + metadata. +7. Write everything into a new `silicon/runs//`. + +The capture script does not commit. 
After both variants are +captured and you've eyeballed `output.csv` for sanity, commit: + +```sh +git add benches/engine_control/silicon/runs/-nucleo_g474re-*-{baseline,gale}/ +git commit -m "silicon: NUCLEO-G474RE anchor at gale@" +``` + +## Comparing silicon vs Renode + +Once `silicon/runs/-{baseline,gale}/` exist, run: + +```sh +python3 benches/engine_control/analyze.py \ + --baseline silicon/runs//events.csv \ + --gale silicon/runs//events.csv \ + --runs 1 \ + > /tmp/silicon-comparison.md +``` + +The analyzer renders the same baseline-vs-gale tables as for +Renode, but the metadata in the report header carries through the +silicon-run identifiers. Compare side-by-side with the Renode CI +output for the same gale SHA — the **ratio** `silicon_median / +renode_median` per RPM step is the calibration data. + +If you want a single-call Renode-vs-silicon side-by-side rendering, +that's a planned analyzer extension (`--silicon-anchor `) +to be added once the first capture exists to test against. + +## Anchor cadence + +- One silicon capture per board per major bench-relevant gale + commit (e.g., when overhead compensation lands, when synth + pipeline changes, when a primitive's hot-path is rewritten). +- Each Renode-headlined publication cites the most recent matching + anchor by stable git URL. +- Three to four anchor points per board per year is enough to + claim the Renode-silicon relationship is monotonic. + +## Don't + +- Don't overwrite an existing `runs//` — start a new one. +- Don't combine pre-overhead-compensation and post-overhead- + compensation captures in the same comparison table; they're + different measurements (see `../SCOPE.md`). +- Don't claim WCET from silicon captures. Worst-case-observed is + not WCET. Same rule as the synthetic bench (see `../SCOPE.md`). +- Don't run silicon captures from a branch that isn't reproducible + (uncommitted changes). The manifest captures the working-tree + state, not just HEAD. 
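
The calibration ratio described under "Comparing silicon vs Renode" can be sketched as below. This is a hedged illustration only: it assumes the cycle samples have already been grouped by RPM step into plain dictionaries, because the `events.csv` column layout is not specified here. The name `step_ratios` is hypothetical and is not the planned `analyze.py` extension.

```python
from __future__ import annotations

from statistics import median


def step_ratios(silicon: dict[int, list[int]],
                renode: dict[int, list[int]]) -> dict[int, float]:
    """Per-RPM-step ratio silicon_median / renode_median.

    Inputs map an RPM step to its list of cycle samples (assumed shape).
    Only RPM steps present in both captures are compared.
    """
    ratios: dict[int, float] = {}
    for rpm in sorted(silicon.keys() & renode.keys()):
        ratios[rpm] = median(silicon[rpm]) / median(renode[rpm])
    return ratios
```

Medians rather than means are used here because the ratio in the text is defined on medians; a step that appears in only one capture is dropped rather than reported with a placeholder value.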
diff --git a/benches/engine_control/silicon/boards/nucleo_g474re/README.md b/benches/engine_control/silicon/boards/nucleo_g474re/README.md new file mode 100644 index 0000000..a5df2ba --- /dev/null +++ b/benches/engine_control/silicon/boards/nucleo_g474re/README.md @@ -0,0 +1,72 @@ +# NUCLEO-G474RE — silicon-anchor board notes + +## Hardware + +- **Board:** STMicroelectronics NUCLEO-G474RE +- **MCU:** STM32G474RET6 (Cortex-M4F + FPU + DSP, 170 MHz) +- **Memory:** 512 KB Flash, 128 KB RAM +- **Cycle counter:** DWT_CYCCNT (same as Cortex-M4F on `stm32f4_disco`) +- **Programmer:** integrated ST-Link/V3E over USB; exposes virtual + COM port for stdout +- **Upstream Zephyr support:** `nucleo_g474re` (already in the tree) + +## Why this board for the anchor + +Cortex-M4F + FPU at 170 MHz is the closest production-shape silicon +to the simulated `stm32f4_disco` (also Cortex-M4F + FPU at 168 MHz). +The architectural variables held constant between the synthetic and +silicon measurements are: + +- ARMv7E-M instruction set (Thumb-2) +- DWT_CYCCNT cycle counter (same width, same definition) +- 3-stage in-order pipeline +- Single-cycle MUL, hardware DIV, single-precision FPU + +What differs: + +- Real cache effects (none on Cortex-M4 — no D-cache; flash + prefetch buffer behavior visible) +- Real bus arbitration with non-existent peripherals on this bench + (negligible — no DMA, no peripheral activity) +- Clock 170 vs 168 MHz (1.2% — accountable directly) + +So the cycle ratio `silicon / renode` for `algo` and `handoff` +should be near 1.0 in steady state. Anything materially off is +information about Renode's cycle model, not about the silicon. + +## Connection + +USB cable from NUCLEO USB connector (CN1) to host. The ST-Link +virtual COM port appears as: + +- macOS: `/dev/cu.usbmodem*` +- Linux: `/dev/ttyACM0` + +Zephyr's default for this board uses LPUART1 for stdout, exposed +through ST-Link. 
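
The device paths listed above are the same ones `capture.sh` globs for when `--port` is omitted. As a rough sketch of that lookup from Python, assuming only the two path patterns named in this section (the function name `candidate_ports` is hypothetical):

```python
from __future__ import annotations

import glob
import platform


def candidate_ports(system: str | None = None) -> list[str]:
    """List likely ST-Link virtual COM port paths for the given OS name.

    `system` uses platform.system() naming ("Darwin", "Linux"); any other
    value yields an empty list, in which case the port must be passed
    explicitly.
    """
    system = system or platform.system()
    patterns = {
        "Darwin": "/dev/cu.usbmodem*",
        "Linux": "/dev/ttyACM*",
    }
    pattern = patterns.get(system)
    return sorted(glob.glob(pattern)) if pattern else []
```

With several boards attached the list has more than one entry, so picking the first element silently may select the wrong device; passing the port explicitly is the safer choice in that case.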
+ +## Programming + +`west flash` from a build directory works out of the box: + +```sh +west flash -d /tmp/eng-nucleo-baseline +``` + +Default backend is OpenOCD. To force pyOCD: + +```sh +west flash -d /tmp/eng-nucleo-baseline --runner pyocd +``` + +## Clock / cycle counter notes + +On the G4 family, `k_cycle_get_32()` returns `SCB_DWT->CYCCNT` +directly, same as on F4. `sys_clock_hw_cycles_per_sec()` returns +the bus clock the cycle counter ticks at — verify this matches +170 MHz at runtime by reading the boot banner before relying on +absolute ns conversions. + +## Known issues + +None yet — populate as captures happen. diff --git a/benches/engine_control/silicon/boards/nucleo_g474re/prj.conf b/benches/engine_control/silicon/boards/nucleo_g474re/prj.conf new file mode 100644 index 0000000..1cb76bc --- /dev/null +++ b/benches/engine_control/silicon/boards/nucleo_g474re/prj.conf @@ -0,0 +1,15 @@ +# NUCLEO-G474RE — engine_control bench overlay +# +# Empty for now: Zephyr's nucleo_g474re defaults give us: +# - 170 MHz HCLK (PLL'd up) +# - LPUART1 console at 115200 8N1 via ST-Link VCP +# - DWT_CYCCNT enabled (Cortex-M4 default in Zephyr) +# +# Add overlay options here only if a future capture exposes a +# default that biases the measurement (e.g. interrupt priority of +# a peripheral we don't use; tickless idle behavior; etc.). +# +# Anything board-specific that *must* be on for the silicon +# measurement to be valid goes here. Anything project-wide +# (gale module enable, sweep size) stays in the main prj.conf +# overlay or the CMake invocation. diff --git a/benches/engine_control/silicon/capture.py b/benches/engine_control/silicon/capture.py new file mode 100755 index 0000000..2e59421 --- /dev/null +++ b/benches/engine_control/silicon/capture.py @@ -0,0 +1,91 @@ +#!/usr/bin/env python3 +"""Cross-platform UART capture for the silicon-anchor protocol. 
+ +Reads lines from a serial port until either a sentinel line +(default '=== END ===') appears, the byte budget is exhausted, or +the wall-clock timeout fires. Writes the raw stream to stdout (or +to a file with --out). + +Designed to be invoked by capture.sh — keep this script's +dependencies minimal: stdlib + pyserial. + +Usage: + capture.py --port /dev/cu.usbmodem11403 --baud 115200 \\ + --sentinel '=== END ===' --timeout 1800 \\ + --out output.csv +""" +from __future__ import annotations + +import argparse +import sys +import time + +try: + import serial # type: ignore +except ImportError: + sys.stderr.write( + "ERROR: pyserial not installed. Run: pip3 install pyserial\n") + sys.exit(2) + + +def main() -> int: + p = argparse.ArgumentParser() + p.add_argument("--port", required=True, + help="serial device path (e.g. /dev/cu.usbmodem11403)") + p.add_argument("--baud", type=int, default=115200, + help="baud rate (default 115200)") + p.add_argument("--sentinel", default="=== END ===", + help="line marking end-of-capture") + p.add_argument("--timeout", type=int, default=1800, + help="wall-clock timeout in seconds (default 1800)") + p.add_argument("--out", default="-", + help="output path or '-' for stdout (default '-')") + p.add_argument("--max-bytes", type=int, default=64 * 1024 * 1024, + help="byte-budget ceiling (default 64 MiB)") + args = p.parse_args() + + out = sys.stdout if args.out == "-" else open(args.out, "w") + deadline = time.monotonic() + args.timeout + bytes_written = 0 + sentinel_seen = False + + try: + # serial timeout = 1s so we wake periodically to check the + # wall-clock budget even if the firmware is silent. 
+ ser = serial.Serial(args.port, args.baud, timeout=1) + except serial.SerialException as e: + sys.stderr.write(f"ERROR opening {args.port}: {e}\n") + return 3 + + try: + while time.monotonic() < deadline and bytes_written < args.max_bytes: + line_bytes = ser.readline() + if not line_bytes: + continue # serial timeout, loop back to check deadline + try: + line = line_bytes.decode("utf-8", errors="replace") + except Exception: + line = line_bytes.decode("latin-1", errors="replace") + out.write(line) + out.flush() + bytes_written += len(line_bytes) + if line.rstrip("\r\n") == args.sentinel: + sentinel_seen = True + break + finally: + ser.close() + if out is not sys.stdout: + out.close() + + if not sentinel_seen: + sys.stderr.write( + f"WARN: sentinel '{args.sentinel}' not seen " + f"(timeout={args.timeout}s, bytes={bytes_written})\n") + return 1 + sys.stderr.write( + f"OK: sentinel seen at {bytes_written} bytes\n") + return 0 + + +if __name__ == "__main__": + raise SystemExit(main()) diff --git a/benches/engine_control/silicon/capture.sh b/benches/engine_control/silicon/capture.sh new file mode 100755 index 0000000..e8b7560 --- /dev/null +++ b/benches/engine_control/silicon/capture.sh @@ -0,0 +1,198 @@ +#!/usr/bin/env bash +# Silicon-anchor capture wrapper for engine_control. +# +# Builds, flashes, and captures one variant on a real board, then +# writes the result + manifest into a dated directory under runs/. +# Manual flow — not invoked from CI. +# +# Usage: +# capture.sh --board nucleo_g474re --variant {baseline,gale} \ +# [--sweep {short,long}] [--port /dev/cu.usbmodem11403] +# +# Defaults: +# --sweep short (use --sweep long for the publication-grade run) +# --port: auto-detect first /dev/cu.usbmodem* (macOS) or +# /dev/ttyACM0 (Linux). Override if multiple boards present. 
+ +set -euo pipefail + +# --------------------------------------------------------------------- args +BOARD="" +VARIANT="" +SWEEP="short" +PORT="" +SILICON_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +GALE_ROOT="$(cd "${SILICON_DIR}/../../.." && pwd)" + +while [[ $# -gt 0 ]]; do + case "$1" in + --board) BOARD="$2"; shift 2 ;; + --variant) VARIANT="$2"; shift 2 ;; + --sweep) SWEEP="$2"; shift 2 ;; + --port) PORT="$2"; shift 2 ;; + -h|--help) + sed -n '1,/^set -/p' "$0" | sed -n 's/^# \?//p'; exit 0 ;; + *) + echo "unknown arg: $1" >&2; exit 2 ;; + esac +done + +[[ -z "$BOARD" ]] && { echo "missing --board" >&2; exit 2; } +[[ -z "$VARIANT" ]] && { echo "missing --variant" >&2; exit 2; } +case "$VARIANT" in baseline|gale) ;; *) + echo "--variant must be 'baseline' or 'gale'" >&2; exit 2 ;; +esac +case "$SWEEP" in short|long) ;; *) + echo "--sweep must be 'short' or 'long'" >&2; exit 2 ;; +esac + +# Verify board overlay exists in our silicon/boards/ tree +BOARD_DIR="${SILICON_DIR}/boards/${BOARD}" +if [[ ! -d "$BOARD_DIR" ]]; then + echo "no silicon overlay for board '$BOARD' at $BOARD_DIR" >&2 + echo "supported: $(ls "${SILICON_DIR}/boards/" 2>/dev/null | tr '\n' ' ')" >&2 + exit 2 +fi + +# --------------------------------------------------------------------- env +: "${ZEPHYR_BASE:?need ZEPHYR_BASE in env}" +: "${ZEPHYR_SDK_INSTALL_DIR:=}" # optional; west picks one up if unset +GALE_SHA_FULL="$(git -C "$GALE_ROOT" rev-parse HEAD)" +GALE_SHA="${GALE_SHA_FULL:0:8}" +DATE="$(date -u +%Y-%m-%d)" +RUNS_DIR_BASE="${SILICON_DIR}/runs" +RUN_DIR="${RUNS_DIR_BASE}/${DATE}-${BOARD}-${GALE_SHA}-${VARIANT}" +BUILD_DIR="/tmp/silicon-${BOARD}-${VARIANT}" + +if [[ -d "$RUN_DIR" ]]; then + echo "ERROR: run dir already exists: $RUN_DIR" >&2 + echo "Per protocol, never overwrite. Start a new dated dir or delete the old one." 
>&2 + exit 3 +fi + +# --------------------------------------------------------------------- port autodetect +if [[ -z "$PORT" ]]; then + case "$(uname -s)" in + Darwin) + PORT="$(ls /dev/cu.usbmodem* 2>/dev/null | head -1 || true)" ;; + Linux) + PORT="$(ls /dev/ttyACM* 2>/dev/null | head -1 || true)" ;; + esac + [[ -z "$PORT" ]] && { + echo "could not auto-detect serial port; pass --port" >&2; exit 2; + } + echo "auto-detected port: $PORT" +fi + +# --------------------------------------------------------------------- build +echo "==> Building $VARIANT for $BOARD (sweep=$SWEEP)" +WEST_ARGS=( -b "$BOARD" -d "$BUILD_DIR" -s "${GALE_ROOT}/benches/engine_control" ) +WEST_DEFINES=( -DENGINE_BENCH_SWEEP="$SWEEP" ) + +# Layer the board's silicon-overlay if it has anything. +BOARD_OVERLAY="${BOARD_DIR}/prj.conf" +if [[ -s "$BOARD_OVERLAY" ]]; then + # If gale variant, append after the gale overlay; if baseline, this + # is the only overlay. + if [[ "$VARIANT" == "gale" ]]; then + WEST_DEFINES+=( + -DZEPHYR_EXTRA_MODULES="$GALE_ROOT" + -DOVERLAY_CONFIG="${GALE_ROOT}/benches/engine_control/prj-gale.conf;${BOARD_OVERLAY}" + ) + else + WEST_DEFINES+=( -DOVERLAY_CONFIG="${BOARD_OVERLAY}" ) + fi +elif [[ "$VARIANT" == "gale" ]]; then + WEST_DEFINES+=( + -DZEPHYR_EXTRA_MODULES="$GALE_ROOT" + -DOVERLAY_CONFIG="${GALE_ROOT}/benches/engine_control/prj-gale.conf" + ) +fi + +rm -rf "$BUILD_DIR" +( cd "$GALE_ROOT/.." && west build -p auto "${WEST_ARGS[@]}" -- "${WEST_DEFINES[@]}" ) + +ELF="${BUILD_DIR}/zephyr/zephyr.elf" +[[ ! 
-f "$ELF" ]] && { echo "build did not produce $ELF" >&2; exit 4; } + +# --------------------------------------------------------------------- record +mkdir -p "$RUN_DIR" +cp "$ELF" "$RUN_DIR/firmware.elf" + +if command -v sha256sum >/dev/null 2>&1; then + ELF_SHA="$(sha256sum "$ELF" | awk '{print $1}')" +else + ELF_SHA="$(shasum -a 256 "$ELF" | awk '{print $1}')" # macOS fallback +fi +echo "$ELF_SHA firmware.elf" > "$RUN_DIR/firmware.elf.sha256" + +# --------------------------------------------------------------------- flash +echo "==> Flashing" +( cd "$GALE_ROOT/.." && west flash -d "$BUILD_DIR" ) + +# --------------------------------------------------------------------- capture +# Long sweep can take a few minutes wall-time at 170 MHz; short ~10s. +TIMEOUT=1800 # 30 min +[[ "$SWEEP" == "short" ]] && TIMEOUT=120 + +echo "==> Capturing from $PORT (timeout ${TIMEOUT}s)" +python3 "${SILICON_DIR}/capture.py" \ + --port "$PORT" --baud 115200 \ + --sentinel "=== END ===" \ + --timeout "$TIMEOUT" \ + --out "$RUN_DIR/output.csv" + +# --------------------------------------------------------------------- tag +RUN_ID="silicon-${DATE}-${BOARD}" # deterministic per-day-per-board; tag_events + # prefixes with R, so this becomes R-silicon-... 
+python3 "${GALE_ROOT}/benches/engine_control/tag_events.py" \ + "$RUN_DIR/output.csv" "$RUN_ID" "$VARIANT" \ + > "$RUN_DIR/events.csv" + +# --------------------------------------------------------------------- manifest +MANIFEST="$RUN_DIR/manifest.txt" +{ + echo "# Silicon-anchor manifest" + echo "# Produced by benches/engine_control/silicon/capture.sh" + echo "captured_at: $(date -u +%Y-%m-%dT%H:%M:%SZ)" + echo "board: ${BOARD}" + echo "variant: ${VARIANT}" + echo "sweep: ${SWEEP}" + echo "gale_sha: ${GALE_SHA_FULL}" + echo "gale_status: $(cd "$GALE_ROOT" && git status --porcelain | wc -l | tr -d ' ') uncommitted file(s)" + echo "host: $(uname -srm)" + echo "rustc: $(rustc --version 2>&1 | head -1)" + echo "cargo: $(cargo --version 2>&1 | head -1)" + echo "west: $(west --version 2>&1 | head -1)" + echo "zephyr_base: ${ZEPHYR_BASE}" + echo "zephyr_sha: $(git -C "$ZEPHYR_BASE" rev-parse HEAD 2>/dev/null || echo unknown)" + echo "sdk_dir: ${ZEPHYR_SDK_INSTALL_DIR:-auto-detected by west}" + echo "elf_sha256: ${ELF_SHA}" + # Pipe through awk inside the substitution so only the digest is recorded. + echo "csv_sha256: $( { sha256sum "$RUN_DIR/output.csv" 2>/dev/null \ + || shasum -a 256 "$RUN_DIR/output.csv"; } | awk '{print $1}')" + echo "csv_bytes: $(wc -c < "$RUN_DIR/output.csv" | tr -d ' ')" + # grep -c already prints 0 when nothing matches; '|| true' only masks its exit status. + echo "csv_event_lines: $(grep -c '^E,' "$RUN_DIR/output.csv" || true)" + echo "serial_port: ${PORT}" + echo "capture_timeout_s: ${TIMEOUT}" +} > "$MANIFEST" + +# --------------------------------------------------------------------- summary +echo +echo "==========================================================" +echo " Silicon capture complete" +echo " board: $BOARD" +echo " variant: $VARIANT" +echo " sweep: $SWEEP" +echo " events: $(grep -c '^E,' "$RUN_DIR/output.csv" || true)" +echo " manifest: $MANIFEST" +echo " events.csv: $RUN_DIR/events.csv" +echo "==========================================================" +echo +echo "Next steps:" +echo " 1) sanity-check the output: head -20 $RUN_DIR/output.csv" +echo " 2) commit the run dir:" +echo " git add 
benches/engine_control/silicon/runs/${DATE}-${BOARD}-${GALE_SHA}-${VARIANT}" +echo " 3) (after both variants captured) compare against the matching Renode CI:" +echo " python3 benches/engine_control/analyze.py \\" +echo " --baseline silicon/runs/${DATE}-${BOARD}-${GALE_SHA}-baseline/events.csv \\" +echo " --gale silicon/runs/${DATE}-${BOARD}-${GALE_SHA}-gale/events.csv" diff --git a/benches/engine_control/silicon/runs/.gitkeep b/benches/engine_control/silicon/runs/.gitkeep new file mode 100644 index 0000000..e69de29