Commit e7bd956

WIP: snapshot goldens: split fixtures by hv/cpu/profile + cross-load CI
Signed-off-by: Ludvig Liljenberg <4257730+ludfjig@users.noreply.github.com>
1 parent f1ce942 commit e7bd956

7 files changed

Lines changed: 188 additions & 62 deletions

.github/workflows/RegenSnapshotGoldens.yml

Lines changed: 19 additions & 15 deletions
@@ -15,14 +15,17 @@ defaults:
     shell: bash

 jobs:
-  # Build guests once, upload as artifacts for the regen jobs to
-  # download. Mirrors `build-guests` in `ValidatePullRequest.yml`.
+  # Build guests once per config, upload as artifacts for the regen
+  # jobs to download. Mirrors `build-guests` in
+  # `ValidatePullRequest.yml`. Both debug and release simpleguest
+  # binaries are needed because the captured memory blob differs
+  # between them, so each ends up in a separate fixture cell.
   build-guests:
     if: github.event.label.name == 'regen-goldens'
     strategy:
       fail-fast: false
       matrix:
-        config: [debug]
+        config: [debug, release]
     uses: ./.github/workflows/dep_build_guests.yml
     secrets: inherit
     with:
@@ -38,12 +41,14 @@ jobs:
       matrix:
         hypervisor: ['hyperv-ws2025', mshv3, kvm]
         cpu: [amd, intel]
+        config: [debug, release]
     runs-on: ${{ fromJson(
-      format('["self-hosted", "{0}", "X64", "1ES.Pool=hld-{1}-{2}", "JobId=regen-goldens-{3}-{4}-{5}-{6}"]',
+      format('["self-hosted", "{0}", "X64", "1ES.Pool=hld-{1}-{2}", "JobId=regen-goldens-{3}-{4}-{5}-{6}-{7}"]',
         matrix.hypervisor == 'hyperv-ws2025' && 'Windows' || 'Linux',
         matrix.hypervisor == 'hyperv-ws2025' && 'win2025' || matrix.hypervisor == 'mshv3' && 'azlinux3-mshv' || matrix.hypervisor,
         matrix.cpu,
         matrix.hypervisor,
+        matrix.config,
         github.run_id,
         github.run_number,
         github.run_attempt)) }}
@@ -64,33 +69,32 @@ jobs:
       - name: Rust cache
         uses: Swatinem/rust-cache@v2
         with:
-          shared-key: "${{ runner.os }}-debug"
+          shared-key: "${{ runner.os }}-${{ matrix.config }}"
           cache-on-failure: "true"

       - name: Download Rust guests
         uses: actions/download-artifact@v8
         with:
-          name: rust-guests-debug
-          path: src/tests/rust_guests/bin/debug/
+          name: rust-guests-${{ matrix.config }}
+          path: src/tests/rust_guests/bin/${{ matrix.config }}/

       - name: Run regen
         env:
           HYPERLIGHT_REGEN_GOLDENS: "1"
         run: |
-          cargo test -p hyperlight-host --lib \
+          cargo test --profile=${{ matrix.config == 'debug' && 'dev' || matrix.config }} \
+            -p hyperlight-host --lib \
             sandbox::snapshot::golden_tests::golden_regen \
             -- --nocapture

       - name: Upload regenerated fixtures
         uses: actions/upload-artifact@v4
         with:
-          name: goldens-${{ matrix.hypervisor }}-${{ matrix.cpu }}
-          # Map the runner's hypervisor input to the fixture filename
-          # suffix that golden_regen produces for that runner: mshv3 -> mshv,
-          # hyperv-ws2025 -> whp, kvm -> kvm. Restricting the glob this way
-          # avoids uploading stale fixtures for other HVs that happened to
-          # be checked in alongside this branch.
+          name: goldens-${{ matrix.hypervisor }}-${{ matrix.cpu }}-${{ matrix.config }}
+          # Restrict the glob to fixtures matching this cell's
+          # `{hv}_{cpu}_{config}` suffix. mshv3 maps to the `mshv`
+          # filename suffix, hyperv-ws2025 maps to `whp`.
           path: |
-            src/hyperlight_host/tests/snapshot_goldens/fixtures/*_${{ matrix.hypervisor == 'mshv3' && 'mshv' || matrix.hypervisor == 'hyperv-ws2025' && 'whp' || matrix.hypervisor }}.hls
+            src/hyperlight_host/tests/snapshot_goldens/fixtures/*_${{ matrix.hypervisor == 'mshv3' && 'mshv' || matrix.hypervisor == 'hyperv-ws2025' && 'whp' || matrix.hypervisor }}_${{ matrix.cpu }}_${{ matrix.config }}.hls
           if-no-files-found: error
           retention-days: 14
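
The naming rules this workflow encodes in `${{ ... }}` expressions can be restated as plain functions, which may make the artifact and fixture names easier to check by hand. The sketch below is illustrative only: the helper names (`hv_suffix`, `cargo_profile`, `artifact_name`, `fixture_file`) are hypothetical and do not exist in the repository; only the mappings themselves are taken from the workflow above.

```rust
/// Map a CI runner hypervisor label to the fixture filename suffix.
/// Mirrors the workflow expression: mshv3 -> mshv, hyperv-ws2025 -> whp,
/// any other label (e.g. kvm) passes through unchanged.
fn hv_suffix(runner_hv: &str) -> &str {
    match runner_hv {
        "mshv3" => "mshv",
        "hyperv-ws2025" => "whp",
        other => other,
    }
}

/// Map a workflow `config` value to the cargo profile passed to
/// `--profile`: `debug` selects cargo's `dev` profile, anything else
/// is used as-is.
fn cargo_profile(config: &str) -> &str {
    if config == "debug" {
        "dev"
    } else {
        config
    }
}

/// Hypothetical helper: name of the artifact uploaded by one regen cell.
/// Note that the artifact name uses the runner label, not the suffix.
fn artifact_name(runner_hv: &str, cpu: &str, config: &str) -> String {
    format!("goldens-{runner_hv}-{cpu}-{config}")
}

/// Hypothetical helper: fixture file name produced for one cell.
fn fixture_file(name: &str, runner_hv: &str, cpu: &str, config: &str) -> String {
    format!("{name}_{}_{cpu}_{config}.hls", hv_suffix(runner_hv))
}
```

Under these assumptions, the `mshv3`/`intel`/`release` cell uploads an artifact called `goldens-mshv3-intel-release` whose files end in `_mshv_intel_release.hls`.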

.github/workflows/dep_build_test.yml

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -103,6 +103,48 @@ jobs:
103103
# with default features
104104
just test ${{ inputs.config }}
105105
106+
- name: Cross-load snapshot goldens from every (hv, cpu, profile) cell
107+
# The committed snapshot fixtures are tagged
108+
# `{hv}_{cpu}_{config}`; the regular `just test` step
109+
# above already loaded the fixture matching this runner's
110+
# triple without bypass. This step additionally cross-
111+
# loads every other committed fixture into the local slot
112+
# with `HYPERLIGHT_SNAPSHOT_BYPASS_VALIDATION=1`. That
113+
# exercises that the on-disk format is portable across HVs
114+
# and CPU vendors and that the host can restore a snapshot
115+
# built against a foreign guest profile. Missing fixture
116+
# cells are warned about but skipped so the step is
117+
# forward-compatible while regen catches up.
118+
env:
119+
HYPERLIGHT_SNAPSHOT_BYPASS_VALIDATION: "1"
120+
run: |
121+
case "${{ inputs.hypervisor }}" in
122+
mshv3) LOCAL_HV=mshv ;;
123+
hyperv-ws2025) LOCAL_HV=whp ;;
124+
*) LOCAL_HV=kvm ;;
125+
esac
126+
LOCAL_SUFFIX="${LOCAL_HV}_${{ inputs.cpu }}_${{ inputs.config }}"
127+
FIX=src/hyperlight_host/tests/snapshot_goldens/fixtures
128+
PROFILE=${{ inputs.config == 'debug' && 'dev' || inputs.config }}
129+
for hv in kvm mshv whp; do
130+
for cpu in intel amd; do
131+
for cfg in debug release; do
132+
SRC_SUFFIX="${hv}_${cpu}_${cfg}"
133+
if [ "$SRC_SUFFIX" = "$LOCAL_SUFFIX" ]; then continue; fi
134+
if [ ! -f "$FIX/init_${SRC_SUFFIX}.hls" ] || [ ! -f "$FIX/call_${SRC_SUFFIX}.hls" ]; then
135+
echo "::warning::missing fixture pair for ${SRC_SUFFIX}, skipping cross-load"
136+
continue
137+
fi
138+
echo "::group::cross-load source=${SRC_SUFFIX} into local=${LOCAL_SUFFIX}"
139+
cp -f "$FIX/init_${SRC_SUFFIX}.hls" "$FIX/init_${LOCAL_SUFFIX}.hls"
140+
cp -f "$FIX/call_${SRC_SUFFIX}.hls" "$FIX/call_${LOCAL_SUFFIX}.hls"
141+
cargo test --profile=$PROFILE -p hyperlight-host --lib sandbox::snapshot::golden_tests
142+
echo "::endgroup::"
143+
done
144+
done
145+
done
146+
git checkout -- "$FIX"
147+
106148
- name: Run Rust tests with single driver
107149
if: runner.os == 'Linux'
108150
run: |
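
The triple loop in the cross-load step visits every `{hv}_{cpu}_{config}` cell except the runner's own, so at most 11 of the 12 cells are cross-loaded per job. A minimal sketch of that enumeration follows; the function name `cross_load_sources` is hypothetical, and the missing-fixture check is deliberately left out so the example stays pure.

```rust
/// Enumerate every `{hv}_{cpu}_{cfg}` suffix except the local one,
/// in the same nesting order as the shell loops in the CI step.
/// The check for missing fixture files is omitted in this sketch.
fn cross_load_sources(local_suffix: &str) -> Vec<String> {
    let mut sources = Vec::new();
    for hv in ["kvm", "mshv", "whp"] {
        for cpu in ["intel", "amd"] {
            for cfg in ["debug", "release"] {
                let suffix = format!("{hv}_{cpu}_{cfg}");
                // The local cell was already loaded without bypass by
                // the regular test step, so it is skipped here.
                if suffix != local_suffix {
                    sources.push(suffix);
                }
            }
        }
    }
    sources
}
```

For a local suffix of `kvm_intel_debug`, the first source visited is `kvm_intel_release` and the list has 11 entries.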

Justfile

Lines changed: 20 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -598,21 +598,28 @@ fetch-snapshot-goldens run_id:
598598
gh run download {{run_id}} --dir target/snapshot-goldens-cross-load -p 'goldens-*'
599599
ls target/snapshot-goldens-cross-load/
600600

601-
# Cross-load test: copy the chosen artifact's `.hls` files over the
602-
# committed fixtures (renamed to the local HV's suffix) and run
603-
# goldens with validation bypassed. `source` is the artifact dir
604-
# name (e.g. `goldens-kvm-intel`). `source_hv` is the suffix used
605-
# inside that artifact (`kvm`, `mshv`, or `whp`). `local_hv` is the
606-
# locally-detected hypervisor's suffix (`kvm`, `mshv`, or `whp`).
601+
# Cross-load test: copy the chosen artifact's `.hls` files over
602+
# the locally-named fixtures and run goldens with validation
603+
# bypassed so the HV-tag mismatch (and the resulting hash
604+
# mismatch) does not fail the load.
607605
#
608-
# Example (load AMD-on-KVM CI artifact on a local Intel KVM box):
609-
# just cross-load-snapshot-goldens goldens-kvm-amd kvm kvm
606+
# `source` is the artifact dir name produced by the regen
607+
# workflow, e.g. `goldens-kvm-amd-debug`. `source_suffix` is the
608+
# `{hv}_{cpu}_{config}` suffix of the files inside that artifact
609+
# (e.g. `kvm_amd_debug`; mshv3 -> `mshv`, hyperv-ws2025 -> `whp`).
610+
# `local_suffix` is the `{hv}_{cpu}_{config}` suffix the local
611+
# `golden_tests` detection will look for on this machine.
610612
#
611-
# Example (load Intel-on-MSHV CI artifact on a local KVM box):
612-
# just cross-load-snapshot-goldens goldens-mshv3-intel mshv kvm
613-
cross-load-snapshot-goldens source source_hv local_hv:
614-
{{ if os() == "windows" { "Copy-Item -Force target/snapshot-goldens-cross-load/" + source + "/init_" + source_hv + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/init_" + local_hv + ".hls" } else { "cp target/snapshot-goldens-cross-load/" + source + "/init_" + source_hv + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/init_" + local_hv + ".hls" } }}
615-
{{ if os() == "windows" { "Copy-Item -Force target/snapshot-goldens-cross-load/" + source + "/call_" + source_hv + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/call_" + local_hv + ".hls" } else { "cp target/snapshot-goldens-cross-load/" + source + "/call_" + source_hv + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/call_" + local_hv + ".hls" } }}
613+
# Example (load AMD-on-KVM-debug CI artifact on a local Intel KVM
614+
# debug box):
615+
# just cross-load-snapshot-goldens goldens-kvm-amd-debug kvm_amd_debug kvm_intel_debug
616+
#
617+
# Example (load Intel-on-MSHV-release CI artifact on a local
618+
# KVM-debug box):
619+
# just cross-load-snapshot-goldens goldens-mshv3-intel-release mshv_intel_release kvm_intel_debug
620+
cross-load-snapshot-goldens source source_suffix local_suffix:
621+
{{ if os() == "windows" { "Copy-Item -Force target/snapshot-goldens-cross-load/" + source + "/init_" + source_suffix + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/init_" + local_suffix + ".hls" } else { "cp target/snapshot-goldens-cross-load/" + source + "/init_" + source_suffix + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/init_" + local_suffix + ".hls" } }}
622+
{{ if os() == "windows" { "Copy-Item -Force target/snapshot-goldens-cross-load/" + source + "/call_" + source_suffix + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/call_" + local_suffix + ".hls" } else { "cp target/snapshot-goldens-cross-load/" + source + "/call_" + source_suffix + ".hls src/hyperlight_host/tests/snapshot_goldens/fixtures/call_" + local_suffix + ".hls" } }}
616623
{{ if os() == "windows" { "$env:HYPERLIGHT_SNAPSHOT_BYPASS_VALIDATION = '1'; cargo test -p hyperlight-host --lib sandbox::snapshot::golden_tests" } else { "HYPERLIGHT_SNAPSHOT_BYPASS_VALIDATION=1 cargo test -p hyperlight-host --lib sandbox::snapshot::golden_tests" } }}
617624
git checkout src/hyperlight_host/tests/snapshot_goldens/fixtures/
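
The two copy lines in the recipe differ only in the fixture name (`init` or `call`), so the source and destination paths can be derived from the three recipe arguments. The following sketch shows that derivation; `cross_load_copies` is a hypothetical helper and not part of the Justfile or the crate, and the directory names are copied from the recipe.

```rust
use std::path::PathBuf;

/// Hypothetical helper: compute the (source, destination) path pairs
/// that the `cross-load-snapshot-goldens` recipe copies, one pair per
/// fixture name.
fn cross_load_copies(
    source: &str,
    source_suffix: &str,
    local_suffix: &str,
) -> Vec<(PathBuf, PathBuf)> {
    // Download directory used by `fetch-snapshot-goldens`.
    let src_dir = PathBuf::from("target/snapshot-goldens-cross-load").join(source);
    // Committed fixtures directory.
    let dst_dir = PathBuf::from("src/hyperlight_host/tests/snapshot_goldens/fixtures");
    ["init", "call"]
        .iter()
        .map(|name| {
            (
                src_dir.join(format!("{name}_{source_suffix}.hls")),
                dst_dir.join(format!("{name}_{local_suffix}.hls")),
            )
        })
        .collect()
}
```

Because the recipe's `cargo test` invocation passes no `--profile` flag, the `local_suffix` used with it would normally end in `_debug`, as in both examples in the recipe comment.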

src/hyperlight_host/src/sandbox/snapshot/golden_tests.rs

Lines changed: 89 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -40,17 +40,19 @@ limitations under the License.
4040
//!
4141
//! ## Skipping
4242
//!
43-
//! At test time we detect the locally available hypervisor. If no
44-
//! fixture is committed for that HV (e.g. KVM-only fixtures
45-
//! present, test runs on WHP), the test silently skips with a
46-
//! message. CI matrices ensure each HV is exercised by at least
47-
//! one job.
43+
//! At test time we detect the locally available hypervisor, the
44+
//! CPU vendor, and the build profile (`debug` vs `release`) and
45+
//! resolve the fixture file `{name}_{hv}_{cpu}_{profile}.hls`. If
46+
//! no matching fixture is committed (e.g. only Intel KVM fixtures
47+
//! checked in, test runs on AMD MSHV), the test silently skips
48+
//! with a message. CI matrices ensure every (hv, cpu, profile)
49+
//! combination is exercised by at least one job.
4850
//!
4951
//! ## Regeneration
5052
//!
5153
//! Set `HYPERLIGHT_REGEN_GOLDENS=1` and run `cargo test
52-
//! golden_regen` to overwrite every fixture for the locally
53-
//! available hypervisor. Always overwrites; the test is
54+
//! golden_regen` to overwrite every fixture matching the local
55+
//! `{hv}_{cpu}_{profile}` triple. Always overwrites; the test is
5456
//! deliberate. See `tests/snapshot_goldens/fixtures/README.md`
5557
//! for when this is needed.
5658
@@ -116,16 +118,83 @@ impl LocalHypervisor {
     }
 }

+/// Detected CPU vendor. Used as a fixture-name suffix because the
+/// snapshot byte stream depends on the underlying CPU's CPUID
+/// (folded into sregs and the page tables we relocate on load).
+#[derive(Copy, Clone, Debug, PartialEq, Eq)]
+enum LocalCpu {
+    Intel,
+    Amd,
+}
+
+impl LocalCpu {
+    fn short_name(self) -> &'static str {
+        match self {
+            Self::Intel => "intel",
+            Self::Amd => "amd",
+        }
+    }
+
+    /// Detect the running CPU vendor via CPUID leaf 0. Returns
+    /// `None` for any vendor string we do not recognize so the
+    /// test simply skips on exotic hardware rather than guessing.
+    fn detect() -> Option<Self> {
+        #[cfg(target_arch = "x86_64")]
+        unsafe {
+            let r = std::arch::x86_64::__cpuid(0);
+            let mut vendor = [0u8; 12];
+            vendor[0..4].copy_from_slice(&r.ebx.to_le_bytes());
+            vendor[4..8].copy_from_slice(&r.edx.to_le_bytes());
+            vendor[8..12].copy_from_slice(&r.ecx.to_le_bytes());
+            match &vendor {
+                b"GenuineIntel" => Some(Self::Intel),
+                b"AuthenticAMD" => Some(Self::Amd),
+                _ => None,
+            }
+        }
+        #[cfg(not(target_arch = "x86_64"))]
+        {
+            None
+        }
+    }
+}
+
+/// Locally selected build profile, detected at test-time. The
+/// profile picks which guest binary `simple_guest_as_string`
+/// resolves to (debug vs release simpleguest), which changes the
+/// memory blob, so it is part of the fixture name.
+fn local_profile() -> &'static str {
+    if cfg!(debug_assertions) {
+        "debug"
+    } else {
+        "release"
+    }
+}
+
 fn fixtures_dir() -> PathBuf {
     PathBuf::from(env!("CARGO_MANIFEST_DIR")).join("tests/snapshot_goldens/fixtures")
 }

-/// Resolve the path to a fixture file for the current hypervisor.
-/// Returns `None` if no hypervisor is available, or if the fixture
-/// is missing from the checked-in set.
-fn fixture_path(name: &str) -> Option<PathBuf> {
+/// Filename suffix used for fixtures matching the current host:
+/// `{hv}_{cpu}_{profile}`. Returns `None` if any dimension is not
+/// detectable.
+fn fixture_suffix() -> Option<String> {
     let hv = LocalHypervisor::detect()?;
-    let path = fixtures_dir().join(format!("{}_{}.hls", name, hv.short_name()));
+    let cpu = LocalCpu::detect()?;
+    Some(format!(
+        "{}_{}_{}",
+        hv.short_name(),
+        cpu.short_name(),
+        local_profile()
+    ))
+}
+
+/// Resolve the path to a fixture file for the current hypervisor,
+/// CPU vendor and build profile. Returns `None` if any dimension is
+/// missing or the fixture is not checked in.
+fn fixture_path(name: &str) -> Option<PathBuf> {
+    let suffix = fixture_suffix()?;
+    let path = fixtures_dir().join(format!("{}_{}.hls", name, suffix));
     path.exists().then_some(path)
 }

@@ -141,7 +210,7 @@ fn load_golden(
         Some(p) => p,
         None => {
             eprintln!(
-                "snapshot_goldens: skipping {} (no fixture for local hypervisor)",
+                "snapshot_goldens: skipping {} (no fixture matching local hv/cpu/profile)",
                 name,
             );
             return None;
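
The CPU half of the "local hv/cpu/profile" triple comes from `LocalCpu::detect`, which assembles the 12-byte vendor string from CPUID leaf 0 in the register order EBX, EDX, ECX. The sketch below isolates that decoding step as a pure function so it can be exercised without executing `cpuid`; the names `Vendor` and `vendor_from_regs` are hypothetical stand-ins, not items from the crate.

```rust
/// Stand-in for the commit's `LocalCpu` enum (hypothetical name).
#[derive(Debug, PartialEq, Eq)]
enum Vendor {
    Intel,
    Amd,
}

/// Decode the CPUID leaf 0 vendor string from raw register values.
/// The string is laid out as EBX, then EDX, then ECX, each register
/// contributing four little-endian bytes.
fn vendor_from_regs(ebx: u32, edx: u32, ecx: u32) -> Option<Vendor> {
    let mut vendor = [0u8; 12];
    vendor[0..4].copy_from_slice(&ebx.to_le_bytes());
    vendor[4..8].copy_from_slice(&edx.to_le_bytes());
    vendor[8..12].copy_from_slice(&ecx.to_le_bytes());
    match &vendor {
        b"GenuineIntel" => Some(Vendor::Intel),
        b"AuthenticAMD" => Some(Vendor::Amd),
        // Unknown vendors yield None so callers can skip, not guess.
        _ => None,
    }
}
```

Passing `u32::from_le_bytes(*b"Genu")`, `u32::from_le_bytes(*b"ineI")` and `u32::from_le_bytes(*b"ntel")` as `ebx`, `edx` and `ecx` yields `Some(Vendor::Intel)`; swapping the last two arguments yields `None`, which is why the EBX, EDX, ECX order matters.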
@@ -267,10 +336,8 @@ fn register_host_echo_fns<R: Registerable>(r: &mut R) {
 type FixtureBuilder = fn() -> Arc<Snapshot>;

 /// Master list of fixtures regenerated by the regen test.
-const FIXTURES: &[(&str, FixtureBuilder)] = &[
-    (INIT_FIXTURE, build_init),
-    (CALL_FIXTURE, build_call),
-];
+const FIXTURES: &[(&str, FixtureBuilder)] =
+    &[(INIT_FIXTURE, build_init), (CALL_FIXTURE, build_call)];

 // ============================================================================
 // Regeneration test (env-var-gated, not run in CI)
@@ -293,21 +360,21 @@ fn golden_regen() {
         }
     }

-    let hv = match LocalHypervisor::detect() {
-        Some(h) => h,
+    let suffix = match fixture_suffix() {
+        Some(s) => s,
         None => {
-            eprintln!("golden_regen: no hypervisor available, nothing to write");
+            eprintln!("golden_regen: skipping (no detected hypervisor / cpu vendor on this host)",);
             return;
         }
     };
-    eprintln!("golden_regen: using hypervisor {}", hv.short_name());
+    eprintln!("golden_regen: writing fixtures for suffix {}", suffix);

     let dir = fixtures_dir();
     std::fs::create_dir_all(&dir).expect("create fixtures dir");

     let mut wrote = 0usize;
     for (name, build) in FIXTURES {
-        let path = dir.join(format!("{}_{}.hls", name, hv.short_name()));
+        let path = dir.join(format!("{}_{}.hls", name, suffix));
         let snap = build();
         snap.to_file(&path).unwrap_or_else(|e| {
             panic!(
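
Both the regen path above and the load path build the same `{name}_{suffix}.hls` file name, and both give up as soon as any dimension of the suffix is undetectable. A minimal sketch of that behaviour follows, with the detection results passed in as parameters so it stays deterministic. The function names match the commit's, but the signatures are simplified assumptions and not the real ones.

```rust
use std::path::{Path, PathBuf};

/// Simplified variant of `fixture_suffix`: the hypervisor and CPU
/// detection results are parameters here (an assumption made for the
/// sketch). `?` returns `None` if either one is missing.
fn fixture_suffix(hv: Option<&str>, cpu: Option<&str>, profile: &str) -> Option<String> {
    Some(format!("{}_{}_{}", hv?, cpu?, profile))
}

/// Simplified variant of `fixture_path`: resolves
/// `{dir}/{name}_{suffix}.hls` and returns it only if the file exists,
/// so a missing fixture becomes a skip rather than a failure.
fn fixture_path(dir: &Path, name: &str, suffix: &str) -> Option<PathBuf> {
    let path = dir.join(format!("{name}_{suffix}.hls"));
    path.exists().then_some(path)
}
```

With `hv = Some("kvm")`, `cpu = Some("amd")` and `profile = "debug"` the suffix is `kvm_amd_debug`; with either option set to `None` no suffix is produced, which corresponds to the "skipping" branch in `golden_regen`.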

src/hyperlight_host/tests/snapshot_goldens/fixtures/README.md

Lines changed: 18 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1,17 +1,21 @@
11
# Snapshot golden fixtures
22

33
These are checked-in `.hls` snapshot files that exhaustively pin
4-
the on-disk snapshot format. Two fixtures per (arch, hypervisor):
4+
the on-disk snapshot format. Two fixtures per
5+
`(hypervisor, cpu, profile)` cell:
56

6-
* `init_{hv}.hls`: Initialise (preinit) snapshot
7+
* `init_{hv}_{cpu}_{profile}.hls`: Initialise (preinit) snapshot
78
built with non-default layout sizes plus an init-data blob.
8-
* `call_{hv}.hls`: Call (mid-execution) snapshot
9+
* `call_{hv}_{cpu}_{profile}.hls`: Call (mid-execution) snapshot
910
built after the guest has bumped a static, allocated and pinned
1011
a heap buffer with a known pattern, and round-tripped a value
1112
through every primitive-typed host function.
1213

13-
`{hv}` is one of `kvm`, `mshv`, `whp`. Tests that don't match the
14-
local hypervisor skip silently.
14+
`{hv}` is one of `kvm`, `mshv`, `whp`. `{cpu}` is `intel` or
15+
`amd`. `{profile}` is `debug` or `release` (it changes which
16+
`simpleguest` binary the fixture was built against, which changes
17+
the captured memory blob). Tests that don't find a fixture
18+
matching the local triple skip silently.
1519

1620
See `docs/snapshot-golden-tests-plan.md` for the full design and
1721
A-P surface enumeration.
@@ -44,9 +48,9 @@ every (HV, CPU) cell yourself.
    relevant assertion. Resolve those first.
 2. Apply the `regen-goldens` label to the PR.
 3. The `Regenerate Snapshot Goldens` workflow runs once per
-   `(hypervisor, cpu)` cell and uploads the produced `.hls`
-   files as artifacts named `goldens-{hv}-{cpu}`. If you re-add
-   the label the workflow runs again from scratch.
+   `(hypervisor, cpu, config)` cell and uploads the produced
+   `.hls` files as artifacts named `goldens-{hv}-{cpu}-{config}`.
+   If you re-add the label the workflow runs again from scratch.
 4. Download the artifacts from the workflow run page (or with
    `gh run download`).
 5. Drop the files into this directory, replacing existing fixtures.
@@ -64,8 +68,8 @@ HYPERLIGHT_REGEN_GOLDENS=1 cargo test -p hyperlight-host \
     --lib sandbox::snapshot::golden_tests::golden_regen -- --nocapture
 ```

-This always overwrites the fixtures for the locally available
-hypervisor.
+This always overwrites the fixtures matching the local
+`(hv, cpu, profile)` triple.

 **Do not commit fixtures generated this way.** Use the CI workflow
 artifacts instead. Local regen exists for ad-hoc debugging.
@@ -77,5 +81,7 @@ artifacts instead. Local regen exists for ad-hoc debugging.
   at the diff and confirm "this is just bytes that match the new
   format" rather than mixing it with logic changes.
 * Do not include unrelated `.hls` files in the same commit.
-* Total fixture size is ~10 MB per HV today. Keep an eye on the
-  size if adding more fixtures.
+* Total fixture size is ~10 MB per (hv, cpu, profile) cell
+  today. With 12 cells (3 hv * 2 cpu * 2 profile) the on-disk
+  total is around 120 MB. Keep an eye on the size if adding more
+  fixtures.
-5.1 MB
Binary file not shown.
-5.18 MB
Binary file not shown.
