hw-preflight is a Linux hardware pre-flight check runner: 24 checks
across CPU, memory, disk, kernel, thermal, serial, network, GPIO, I2C,
systemd, NVMe SMART, USB, RTC drift, IOMMU groups, VM overcommit, and
SELinux mode, each emitting one of pass | fail | skip | unavailable. The
runner produces a JSON and Markdown report with raw measured values,
expected thresholds, and a failure-detail section. Checks read /proc,
/sys, run a few standard binaries (ip, timedatectl, systemctl),
talk to a serial device via pyserial, and (for CPUID feature flags)
call into a small C++ helper compiled via CMake + pybind11.

- The four-state result model. Most validation tools collapse `skip` and `unavailable`. Keeping them distinct means a runner that lacks `/sys/class/thermal` is reported honestly as "unavailable", not fake-passed and not falsely failed. `--exit-on-fail` only triggers on `fail`, so missing optional hardware does not break a CI gate.
- Hermetic CI for hardware code. Real `/proc` and `/sys` reads run on the GitHub Actions Linux runner; pyfakefs simulates kernel surfaces in unit tests; `socat` creates a virtual pty pair so the serial handshake can be reproduced without a USB dongle. Nothing is faked at the result layer: every check has a real implementation that runs against the host or against a documented mock.
- Threshold versus availability. Numeric thresholds (RAM, disk, kernel version) emit `fail` when the host produced a measurement that didn't satisfy the bound. Presence checks (GPIO, I2C, serial) default to `unavailable` when the underlying device file is missing, since the absence of optional peripherals is not a failure.

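The result model and the `--exit-on-fail` rule described above can be sketched in a few lines. This is a minimal illustration, not the project's code: the `Status` enum, `presence_status`, `exit_code`, and the exit value `1` are all hypothetical names and choices made for the example.

```python
from enum import Enum
from pathlib import Path


class Status(str, Enum):
    """The four result states (enum name and layout are assumptions)."""
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"
    UNAVAILABLE = "unavailable"


def presence_status(device_path: str) -> Status:
    """Presence check: a missing device node maps to 'unavailable', not 'fail'."""
    return Status.PASS if Path(device_path).exists() else Status.UNAVAILABLE


def exit_code(statuses, exit_on_fail: bool) -> int:
    """Non-zero only when exit_on_fail is set and at least one check failed.

    skip and unavailable never affect the exit code.
    """
    if exit_on_fail and Status.FAIL in statuses:
        return 1  # assumed non-zero value; the real CLI may use another
    return 0
```

Under these assumptions, a run with only `pass`, `skip`, and `unavailable` results exits 0 even with `exit_on_fail=True`, which is the property that keeps a CI gate green on hosts without optional peripherals.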
The table below is the actual output of hw-preflight run on the
GitHub Actions ubuntu-24.04 runner (the JSON it came from is
examples/sample-run.json, produced by the
e2e job and committed verbatim).

- host: `ubuntu-runner` (kernel `6.17.0-1010-azure`, 4 CPUs)
- summary: 11 pass / 1 fail / 2 skip / 4 unavailable (18 total)


| # | Check | Status | Detail |
|---|---|---|---|
| 1 | `clock_source` | PASS | clocksource=tsc |
| 2 | `cpu_count` | PASS | cpu_count=4 |
| 3 | `cpu_features` | PASS | 102 flags read from `/proc/cpuinfo` (C++ ext built in build-cpp job; the test-py job uses the Python fallback) |
| 4 | `disk_free` | PASS | 88.7 GiB free on `/` |
| 5 | `gpio_chips` | UNAVAIL | no gpiochips exposed |
| 6 | `i2c_bus_present` | UNAVAIL | no `/dev/i2c-*` nodes |
| 7 | `kernel_module_loaded` | FAIL | `loop` not loaded (Azure runner kernel ships it built-in, not as a module; adjust `system.required_modules` for tier) |
| 8 | `kernel_version` | PASS | release 6.17.0-1010-azure |
| 9 | `loadavg_short` | PASS | loadavg=1.28, 4 CPUs |
| 10 | `memory_available` | PASS | 14.6 GiB available |
| 11 | `memory_total` | PASS | 15.6 GiB total |
| 12 | `network_default_route` | PASS | 1 default route present |
| 13 | `serial_handshake` | UNAVAIL | `/dev/ttyS0` exists but EACCES (runner not in dialout); the e2e job exercises the same code through socat with a real round-trip |
| 14 | `serial_port_present` | PASS | path=/dev/ttyS0 |
| 15 | `service_unit_active` | SKIP | no service units configured |
| 16 | `swap_disabled` | SKIP | swap-disabled requirement not enforced |
| 17 | `thermal_max` | UNAVAIL | no thermal zones in `/sys/class/thermal` |
| 18 | `time_sync` | PASS | NTPSynchronized=yes |

The single fail is honest: the default config requires the loop
module, and the Azure runner kernel does not load it as a separate
module. A real deployment changes system.required_modules to match
its tier, or removes the constraint.
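To make the `kernel_module_loaded` failure concrete, here is a rough sketch of comparing a required-module list against the text of `/proc/modules`. The function names are hypothetical and the sample text is invented for illustration; the only facts relied on are that each `/proc/modules` line starts with the module name and that built-in drivers do not appear there.

```python
def loaded_modules(proc_modules_text: str) -> set:
    """Return module names from /proc/modules text (first field of each line)."""
    return {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }


def missing_modules(required, proc_modules_text: str) -> list:
    """List required modules that are not present as loadable modules.

    The kernel reports module names with underscores, so hyphens in the
    configured names are normalized before comparison.
    """
    loaded = loaded_modules(proc_modules_text)
    return [m for m in required if m.replace("-", "_") not in loaded]


# Invented sample in the /proc/modules line format.
SAMPLE = (
    "ext4 1003520 1 - Live 0x0000000000000000\n"
    "nvme_core 212992 2 nvme, Live 0x0000000000000000\n"
)

print(missing_modules(["loop", "ext4", "nvme-core"], SAMPLE))
```

A driver compiled into the kernel image shows up as "missing" under this approach, which matches the fail reported in the sample run.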
The full machine-readable artifact is at
examples/sample-run.json and the rendered
Markdown at examples/sample-run.md.

| # | Check | What it reads |
|---|---|---|
| 1 | `cpu_count` | `os.cpu_count()` against `cpu.min_count` |
| 2 | `cpu_features` | C++ `__builtin_cpu_supports` ∪ `/proc/cpuinfo` `flags:` against `cpu.required_features` |
| 3 | `memory_total` | `/proc/meminfo` `MemTotal` |
| 4 | `memory_available` | `/proc/meminfo` `MemAvailable` |
| 5 | `swap_disabled` | `/proc/meminfo` `SwapTotal == 0` (toggle) |
| 6 | `disk_free` | `os.statvfs(path)` |
| 7 | `loadavg_short` | `/proc/loadavg` 1-minute load against `cpu_count * factor` |
| 8 | `kernel_version` | `os.uname().release` against `system.min_kernel_version` |
| 9 | `kernel_module_loaded` | `/proc/modules` against `system.required_modules` |
| 10 | `clock_source` | `/sys/devices/system/clocksource/clocksource0/current_clocksource` against allowlist |
| 11 | `time_sync` | `timedatectl show -p NTPSynchronized --value` |
| 12 | `thermal_max` | max of `/sys/class/thermal/thermal_zone*/temp` |
| 13 | `serial_port_present` | `HW_PREFLIGHT_SERIAL_PATH` -> `serial.by_id_glob` -> `serial.candidate_paths` |
| 14 | `serial_handshake` | open at 115200, write `AT\r\n`, regex-match response |
| 15 | `network_default_route` | `ip route show default` |
| 16 | `gpio_chips` | count of `gpiochip*` in `/sys/class/gpio` |
| 17 | `i2c_bus_present` | count of `/dev/i2c-*` |
| 18 | `service_unit_active` | `systemctl is-active <unit>` for each entry in `service.units` |
| 19 | `nvme_smart` | `nvme smart-log /dev/nvme0n1` `critical_warning` field |
| 20 | `usb_device_count` | count of entries under `/sys/bus/usb/devices/` |
| 21 | `rtc_drift` | `/sys/class/rtc/rtc0/since_epoch` vs `time.time()` |
| 22 | `pci_iommu_groups` | count of `/sys/kernel/iommu_groups/*` |
| 23 | `vm_overcommit` | `/proc/sys/vm/overcommit_memory` against allowlist |
| 24 | `selinux_status` | `getenforce` output (Enforcing / Permissive / Disabled) |

See docs/checks.md for the per-check reference,
including how each is mocked in CI.
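As an illustration of how a threshold check from the table might turn a `/proc` read into a result, the sketch below parses `MemAvailable` from `/proc/meminfo` text and compares it with a floor. The helper names, the tuple return shape, and the choice to report a missing field as `unavailable` are assumptions for the example, not the project's actual implementation.

```python
import re


def meminfo_kib(text: str, field: str):
    """Return the value of a /proc/meminfo field in KiB, or None if absent."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB$"
    match = re.search(pattern, text, re.MULTILINE)
    return int(match.group(1)) if match else None


def memory_available_status(text: str, min_gib: float):
    """Map a MemAvailable reading to (status, measured GiB).

    No measurement -> 'unavailable' (assumed); otherwise pass/fail on the bound.
    """
    kib = meminfo_kib(text, "MemAvailable")
    if kib is None:
        return "unavailable", None
    gib = kib / (1024 ** 2)
    return ("pass" if gib >= min_gib else "fail"), round(gib, 1)


# Invented sample in the /proc/meminfo line format.
SAMPLE = "MemTotal:       16357376 kB\nMemAvailable:   15309824 kB\n"

print(memory_available_status(SAMPLE, min_gib=8.0))
```

Taking the file contents as a string rather than opening `/proc/meminfo` inside the function keeps the logic testable without a fake filesystem.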

| Module | Purpose |
|---|---|
| `hw_preflight.cli` | Click CLI: run, list, render-md |
| `hw_preflight.runner` | Discovers checks, runs each with a per-check timeout, aggregates results |
| `hw_preflight.config` | Pydantic v2 settings tree, YAML loader |
| `hw_preflight.reports` | JSON and Markdown emitters |
| `hw_preflight.checks._base` | CheckResult schema + `@register_check` decorator |
| `hw_preflight.checks.*` | One module per check group (cpu, memory, disk, system, thermal, serial, network, gpio, service) |
| `hw_preflight._hwprobe` | Wrapper around the C++ helper with a `/proc/cpuinfo` fallback |
| `hwprobe/` | C++ 20 helper: cpuid, dmi, main (CLI), bindings (pybind11) |


    # from the source tree
    pip install -e ".[dev]"
    # optional: build the C++ helper (CMake + pybind11)
    cmake -S hwprobe -B hwprobe/build -DCMAKE_BUILD_TYPE=Release
    cmake --build hwprobe/build -j
    # run it
    hw-preflight run --json out.json --md out.md
    hw-preflight list                  # list registered check names
    hw-preflight profiles              # list built-in profiles
    hw-preflight render-md out.json    # rerender JSON to stdout Markdown
    # with a custom config
    hw-preflight run --config config/preflight.example.yaml --exit-on-fail
    # with a built-in profile
    hw-preflight run --profile production-server --exit-on-fail
    # parallel execution + webhook delivery
    HW_PREFLIGHT_WEBHOOK_SECRET=topsecret \
      hw-preflight run --profile ci-runner --parallelism 0 \
      --webhook-url https://collector.example/hw-preflight

Three profiles ship in config/profiles/:

| Profile | One-line use case |
|---|---|
| `production-server` | Long-lived bare-metal/VM hosts running services. Strict thresholds, requires NTP sync, allows SELinux Enforcing or Permissive only, gates on NVMe SMART. |
| `edge-device` | Small ARM SBCs and field-deployed appliances with peripherals. Looser CPU/RAM floors, requires GPIO chip + I2C bus + serial handshake; tolerates higher RTC drift and thermal headroom. |
| `ci-runner` | Ephemeral CI workers (GitHub-hosted, GitLab-shared). Permissive thresholds, fast timeouts, peripheral and security checks excluded. |

Select with --profile <name> (mutually exclusive with --config).
--webhook-url <url> POSTs the JSON report with Content-Type: application/json and an HMAC-SHA256 signature derived from the body and
the secret in HW_PREFLIGHT_WEBHOOK_SECRET:

| Header | Value |
|---|---|
| `X-HW-Preflight-Signature` | `sha256=<hex hmac>` of body |
| `X-HW-Preflight-Timestamp` | unix epoch seconds at signing |

A receiver re-computes HMAC_SHA256(body, secret) and compares against
the header to authenticate the payload. The
tests/unit/test_webhook.py suite documents
this round-trip end to end.
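
A receiver-side check along the lines described above might look like this. It is a sketch under the stated assumptions: the signature covers only the raw request body, the header value has the form `sha256=<hex>`, and the function names `sign` and `verify` are made up for the example.

```python
import hashlib
import hmac


def sign(body: bytes, secret: bytes) -> str:
    """Compute the header value: 'sha256=' + hex HMAC-SHA256 of the body."""
    digest = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify(body: bytes, secret: bytes, header_value: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign(body, secret).encode("utf-8")
    return hmac.compare_digest(expected, header_value.encode("utf-8"))


# Illustrative payload and the secret from the usage example above.
body = b'{"summary": {"pass": 11, "fail": 1}}'
secret = b"topsecret"
signature = sign(body, secret)

print(verify(body, secret, signature))         # signature matches the body
print(verify(body + b" ", secret, signature))  # any change to the body breaks it
```

The comparison uses `hmac.compare_digest` rather than `==` to avoid leaking timing information. Verification must run on the exact bytes received, before any JSON parsing or re-serialization. Whether to also reject stale `X-HW-Preflight-Timestamp` values is a receiver-side policy that this sketch does not cover.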
+----------------------+
YAML config ---> | PreflightConfig |
| (pydantic v2) |
+----------+-----------+
|
v
+-----------------+ +--+--+ +------------------+
| CLI (click) |-->|run |-->| reports.to_json /|
+-----------------+ |all | | to_markdown |
+--+--+ +------------------+
|
per-check |
ThreadPool |
+ timeout v
+-----------------------+----------------------+
| checks/ (auto-registered) |
| cpu.py memory.py disk.py system.py |
| thermal.py serial.py network.py gpio.py |
| service.py |
+----------+-------------+---------------------+
| |
v v
/proc, /sys, syscalls pyserial / subprocess
^
|
+----------+----------+
| _hwprobe.py |
| (C++ ext OR |
| /proc/cpuinfo) |
+----------+----------+
|
v
+---------------------+
| hwprobe/ (C++ 20) |
| CMake + pybind11 |
+---------------------+
See ARCHITECTURE.md for the four-state result-model
rationale, the C++ binding decision, and tier-by-tier mocking strategy.
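
The "per-check ThreadPool + timeout" step in the diagram can be approximated with `concurrent.futures`. This is only a sketch: the `run_all` signature, the `(status, detail)` tuples, and the decision to report a timeout or an exception as `fail` are assumptions, and the real runner may map those cases differently.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def run_all(checks, timeout_s: float, max_workers: int = 4) -> dict:
    """Run each check in a worker thread and collect (status, detail) tuples.

    The timeout is measured from when the result is awaited, not from when
    the check started, so it is an upper bound per wait rather than a strict
    per-check budget. A timed-out thread is not killed; it is only abandoned.
    """
    results = {}
    pool = ThreadPoolExecutor(max_workers=max_workers)
    try:
        futures = {name: pool.submit(fn) for name, fn in checks.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout_s)
            except FutureTimeout:
                results[name] = ("fail", f"timed out after {timeout_s}s")
            except Exception as exc:  # a crashing check must not stop the run
                results[name] = ("fail", f"{type(exc).__name__}: {exc}")
    finally:
        # Do not block on stragglers; drop anything still queued (Python 3.9+).
        pool.shutdown(wait=False, cancel_futures=True)
    return results


def slow_check():
    time.sleep(0.5)
    return ("pass", "finished late")


demo = run_all(
    {"fast": lambda: ("pass", "ok"), "slow": slow_check, "boom": lambda: 1 / 0},
    timeout_s=0.1,
)
print(demo)
```

Threads suit this workload because most checks block on file reads or subprocesses, but a hung check still occupies its worker until it returns, which is one reason to keep each check's own I/O bounded.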
- Not a continuous-monitoring agent. Each invocation is a one-shot snapshot.
- Not a remote-agent framework. Runs locally; there is no transport, auth, or scheduler.
- Not a firmware-update or hardware-diagnostic path. Read-only.
- Not a GPU or accelerator validator. Vendor SDKs differ enough that GPU validation deserves its own framework.
- Not a network throughput or latency probe. Only checks the availability of a default route — peripheral presence, not bandwidth.
- Not Windows or macOS compatible. Many checks read `/proc` and `/sys` directly; cross-platform support would require a parallel registry rather than a thin abstraction. Linux-only by design.

MIT — see LICENSE.