feat(scripts): post-mortem snapshot analyzer with replay + motion-history detectors by LightAxe · Pull Request #121 · LightAxe/subterrans

LightAxe · 2026-05-13T03:29:23Z

Summary

Adds scripts/analyze-snapshot.ts — a pure-Node CLI for offline analysis of downloaded debug snapshots. No sim-side changes; no save-format changes; no perf cost in-game. References #120.

Phase A — replay + byte-equality: createScenario(seed) → tick() through the captured inputLog up to snapshot.tick, then JSON-compares serializeWorldState(replay) against the captured snapshot. A divergence is a free SCEN-06 regression signal — exits 1 on mismatch so CI can detect it without parsing stdout. Also reports tile-occupancy clusters (≥5 ants on one tile) and underground ants on non-Open tiles (Solid / Marked / BeingDug — stuck-in-dirt).
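The replay-and-compare step can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in scripts/analyze-snapshot.ts: `ToyWorld`, `toyCreateScenario`, `toyTick`, and `replayAndCompare` are hypothetical stand-ins for the real createScenario, tick(), and serializeWorldState, whose signatures are not shown in this PR.

```typescript
// Toy stand-ins for the real sim; every name and shape here is hypothetical.
interface ToyCommand { issuedAtTick: number; delta: number }
interface ToyWorld { tick: number; value: number }
interface ToySnapshot { seed: number; tick: number; inputLog: ToyCommand[]; state: ToyWorld }

function toyCreateScenario(seed: number): ToyWorld {
  return { tick: 0, value: seed >>> 0 };
}

// Advance one tick: apply commands issued at this tick, then step a
// deterministic integer update so the state depends on the full history.
function toyTick(world: ToyWorld, inputLog: ToyCommand[]): void {
  for (const cmd of inputLog) {
    if (cmd.issuedAtTick === world.tick) world.value += cmd.delta;
  }
  world.value = (Math.imul(world.value, 1664525) + 1013904223) >>> 0;
  world.tick += 1;
}

// Replay from the seed up to the captured tick, then compare serialized state.
// Returns the exit code a CLI could use: 0 on match, 1 on divergence.
function replayAndCompare(snapshot: ToySnapshot): 0 | 1 {
  const world = toyCreateScenario(snapshot.seed);
  while (world.tick < snapshot.tick) toyTick(world, snapshot.inputLog);
  return JSON.stringify(world) === JSON.stringify(snapshot.state) ? 0 : 1;
}
```

Returning the exit code instead of calling an exit function keeps the comparison testable; a real CLI would pass the result to its process-exit call.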

Phase B — motion-history detectors: during replay, samples each live ant's (zone, tile, task, subTask, colony) every 20 ticks (= 1 sim sec) into a 30-sample sliding window (= 30 sim sec). Reports:

Stationary — top tile dominates ≥90% of the window. Grouped by (zone, tile, colony) so a pile-up surfaces as one line.
Oscillating — ≤3 unique tiles AND not stationary. Canonical tile-set key so A↔B and B↔A collapse to one eddy.
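A rough sketch of the two detectors, using the thresholds stated above (30-sample window, ≥90% dominance, ≤3 unique tiles). The function names, the string encoding of tiles, and the choice to skip windows that are not yet full are assumptions made for illustration; the analyzer's actual data structures may differ.

```typescript
type MotionClass = "stationary" | "oscillating" | "moving";

const WINDOW_SIZE = 30;           // samples kept per ant (one every 20 ticks)
const STATIONARY_FRACTION = 0.9;  // top tile must cover at least 90% of the window
const MAX_OSCILLATION_TILES = 3;  // at most 3 unique tiles counts as oscillating

// Append a sample and drop the oldest once the window is full.
// Tiles are encoded as strings such as "underground:104,0" (an assumption).
function pushSample(samples: string[], tileKey: string): void {
  samples.push(tileKey);
  if (samples.length > WINDOW_SIZE) samples.shift();
}

function classifyWindow(samples: string[]): MotionClass {
  // Assumed policy: do not classify until a full window has been collected.
  if (samples.length < WINDOW_SIZE) return "moving";
  const counts = new Map<string, number>();
  for (const key of samples) counts.set(key, (counts.get(key) ?? 0) + 1);
  let top = 0;
  counts.forEach((n) => { if (n > top) top = n; });
  if (top / samples.length >= STATIONARY_FRACTION) return "stationary";
  if (counts.size <= MAX_OSCILLATION_TILES) return "oscillating";
  return "moving";
}

// Canonical key: sort the unique tiles so A↔B and B↔A yield the same string.
function oscillationKey(samples: string[]): string {
  const unique: string[] = [];
  for (const key of samples) if (unique.indexOf(key) === -1) unique.push(key);
  return unique.sort().join("|");
}
```

Checking the stationary rule first is what makes "not stationary" part of the oscillating definition: an ant that leaves its tile for one or two samples out of 30 is still reported as stationary.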

Queens are filtered (they never move by design). Each group is labeled with its dominant (task, subTask) so the reader can tell a real bug ("70 Foraging/CarryingFood ants stationary on (104,0)") from expected stuck cases ("18 Fighting ants at the rally point").

Run

```
node --experimental-transform-types scripts/analyze-snapshot.ts <snapshot.json>
```

--experimental-transform-types (not --experimental-strip-types) is required because src/platform/save.ts uses constructor parameter properties, which type stripping alone rejects. The script installs the same .js→.ts resolve hook as scripts/run-sim.ts.

Verification

Ran against ~/Downloads/subterrans-debug-seed400367819-tick91342.json (the May 11 snapshot that motivated #120):

Replay: 91342 ticks in 173.8s. Byte-equality PASS.
Tile clusters: 75 ants on underground (104,0) colony 2; 18 on surface (104,64) colony 2.
Motion: 70 stationary Foraging/CarryingFood ants on underground (104,0). This is the #120 symptom ("[BUG] Ants get stuck oscillating between adjacent tiles (and stacking on hot tiles) as ant count grows"): carriers wedged at the chamber boundary. Ant 252 oscillating (104,0) ↔ (104,1) ↔ (104,2) Foraging/CarryingFood: same chamber, same bug class. Expected non-bug: 18 Fighting at the rally point, 4 Nurses oscillating near the queen chamber, ~20 Idle young workers.
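The tile-cluster lines above could be produced by a grouping pass like the one below. It is a sketch only: `AntPos`, `Cluster`, and `findClusters` are hypothetical names, and grouping by colony in addition to zone and tile is an assumption inferred from the report format ("75 ants on underground (104,0) colony 2").

```typescript
interface AntPos { id: number; zone: "surface" | "underground"; x: number; y: number; colony: number }
interface Cluster { zone: string; x: number; y: number; colony: number; antIds: number[] }

const CLUSTER_THRESHOLD = 5; // "≥5 ants on one tile", per the PR description

// Group ants by (zone, tile, colony) and keep groups at or above the threshold,
// largest first.
function findClusters(ants: AntPos[]): Cluster[] {
  const byTile = new Map<string, Cluster>();
  for (const ant of ants) {
    const key = `${ant.zone}:${ant.x},${ant.y}:${ant.colony}`;
    let cluster = byTile.get(key);
    if (!cluster) {
      cluster = { zone: ant.zone, x: ant.x, y: ant.y, colony: ant.colony, antIds: [] };
      byTile.set(key, cluster);
    }
    cluster.antIds.push(ant.id);
  }
  const result: Cluster[] = [];
  byTile.forEach((c) => { if (c.antIds.length >= CLUSTER_THRESHOLD) result.push(c); });
  return result.sort((a, b) => b.antIds.length - a.antIds.length);
}
```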

Caveats

Old snapshots (from before PR #113, "feat(sim,render,save): finite food piles with runtime respawn (closes #112)") won't deserialize because foodPiles[].pickupsInitial didn't exist yet. This is a save-compat artifact; the analyzer exits with a clear error in that case.
Replay time is O(captured tick count) — 91k ticks ≈ 3 min. Post-mortem use, not real-time.
Replay determinism check uses JSON.stringify byte-equality; in the unlikely event a future change reorders object keys in serializeWorldState, the analyzer will report FAIL even though state is semantically identical. The FAIL warning calls this out.
scripts/* is excluded from tsconfig.json's include, consistent with scripts/run-sim.ts and scripts/check-foraging-survival.ts. No tests added — this is a diagnostic CLI; src/sim/issue-44-snapshot-replay.test.ts already exercises the deserializeWorldState + tick() path in CI.
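The key-order caveat can be demonstrated in a few lines. The `canonicalStringify` helper below is hypothetical and is not something the analyzer does; it only illustrates one way a comparison could be made insensitive to key order if that ever became a problem.

```typescript
// Recursively rebuild objects with sorted keys so key order cannot affect output.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const src = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(src).sort()) out[key] = canonicalize(src[key]);
    return out;
  }
  return value;
}

function canonicalStringify(value: unknown): string {
  return JSON.stringify(canonicalize(value));
}

// Same data, different insertion order.
const stateA = { tick: 5, ants: [{ id: 1, x: 2 }] };
const stateB = { ants: [{ x: 2, id: 1 }], tick: 5 };

const plainEqual = JSON.stringify(stateA) === JSON.stringify(stateB);         // order-sensitive
const canonicalEqual = canonicalStringify(stateA) === canonicalStringify(stateB); // order-insensitive
```

Plain JSON.stringify emits keys in insertion order, which is why a reordering inside serializeWorldState would show up as FAIL even though the two states hold the same data.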

UAT — quick tests you can run locally

End-to-end on a real snapshot. Download a debug snapshot from the game's debug UI, then:
```
node --experimental-transform-types scripts/analyze-snapshot.ts ~/Downloads/subterrans-debug-seed*.json
```
Expect: prints Snapshot: …, Replaying from seed for N ticks…, then PASS, Live ants: …, Tile-occupancy clusters …, Underground ants on non-Open tiles …, Motion-history analysis …, Done. Exit code 0.

Missing-argument error path.

```
node --experimental-transform-types scripts/analyze-snapshot.ts
echo "exit=$?"
```

Expect: usage message on stderr; exit=2.

Missing file error path.
```
node --experimental-transform-types scripts/analyze-snapshot.ts /tmp/does-not-exist.json
echo "exit=$?"
```
Expect: Invalid snapshot: could not read "..." : ENOENT... on stderr; exit=3. (No stack trace.)

Malformed JSON.

```
echo 'not json' > /tmp/bad.json
node --experimental-transform-types scripts/analyze-snapshot.ts /tmp/bad.json
echo "exit=$?"
```

Expect: Invalid snapshot: not valid JSON (...); exit=3.

Wrong-shape envelope.

```
echo '{}' > /tmp/empty.json
node --experimental-transform-types scripts/analyze-snapshot.ts /tmp/empty.json
echo "exit=$?"
```

Expect: Invalid snapshot: missing or non-numeric "seed"; exit=3.

Determinism regression signal. On a real snapshot, temporarily tweak something deterministic in src/sim/ (any byte-changing edit), re-run the analyzer. Expect replay vs captured byte-equality: FAIL, the SCEN-06 warning printed, then Done., then exit=1. Revert the edit afterwards.
Dominant-task labels render. On a real snapshot, eyeball the Stationary and Oscillating sections — each line should show a Foraging/CarryingFood / Fighting / Nursing / Idle style label between the count and the [ids…] sample. Mixed-task groups should show Foo (n/total, +k others).
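For reference when eyeballing those labels, here is one way the "Foo (n/total, +k others)" format could be computed. This is a guess at the format's semantics, not the analyzer's code: `dominantLabel` is a hypothetical name, k is assumed to count the other distinct labels in the group, and ties are assumed to go to the label seen first.

```typescript
// labels: one "Task/SubTask" string per ant in the group.
function dominantLabel(labels: string[]): string {
  if (labels.length === 0) return "(none)"; // placeholder for empty groups (assumption)
  const counts = new Map<string, number>();
  for (const label of labels) counts.set(label, (counts.get(label) ?? 0) + 1);
  let best = "";
  let bestCount = 0;
  // Map iterates in insertion order, so strict ">" keeps the first label on ties.
  counts.forEach((n, label) => {
    if (n > bestCount) { best = label; bestCount = n; }
  });
  if (counts.size === 1) return best; // unanimous group: bare label
  return `${best} (${bestCount}/${labels.length}, +${counts.size - 1} others)`;
}
```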

🤖 Generated with Claude Code

…tory detectors

Adds scripts/analyze-snapshot.ts, a pure-Node CLI for offline analysis of downloaded debug snapshots (DebugSnapshot envelope from src/platform/debug-snapshot.ts).

Phase A (replay-from-seed byte-equality check): createScenario(seed) → tick() through the captured inputLog up to snapshot.tick, then JSON-compare serializeWorldState(replay) against the captured snapshot. A divergence here is a free SCEN-06 regression signal. Exits 1 on mismatch so CI / scripts can detect a determinism regression without parsing stdout. Also reports tile-occupancy clusters (≥5 ants on one tile) and underground ants standing on non-Open tiles (Solid / Marked / BeingDug, i.e. stuck-in-dirt).

Phase B (motion-history detectors): During replay, samples each live ant's (zone, tile, task, subTask, colony) every 20 ticks into a 30-sample sliding window (= 30 sim sec). Two classes reported, with dominant-task annotation:

- Stationary: top tile dominates ≥90% of the window. Grouped by (zone, tile, colony) so a 75-ant pile-up surfaces as one line.
- Oscillating: ≤3 unique tiles AND not stationary. Canonical tile-set key so A↔B and B↔A collapse to one eddy.

Queens are filtered (they never move by design). The dominant-task labeling is the key UX: a human can scan the report and instantly tell a real bug ("70 Foraging/CarryingFood ants stationary on (104,0)") apart from expected stuck cases ("18 Fighting ants at the rally point").

Run:

  node --experimental-transform-types scripts/analyze-snapshot.ts <snapshot.json>

--transform-types (not --strip-types) is required because src/platform/save.ts uses constructor parameter properties, which strip-types rejects. The script installs the same .js→.ts resolve hook used by scripts/run-sim.ts.

Verified against the May 11 91342-tick snapshot that motivated bug #120: replay PASS in 173s; analyzer flagged 70 carrier ants stationary at the underground chamber boundary plus expected non-bug stationary clusters at the rally point.

Envelope is validated up front (missing/wrong-shape fields exit 3 with a clear message); malformed JSON or unreadable file also exits 3.

scripts/* is excluded from tsconfig.json's include, consistent with scripts/run-sim.ts. No tests: this is a diagnostic CLI; existing replay tests (src/sim/issue-44-snapshot-replay.test.ts) already exercise the deserializeWorldState + tick() path in CI.

References #120 (carriers wedged at chamber boundary).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c28fa91a04

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- Correct usage text: --experimental-transform-types (strip-types fails on src/platform/save.ts parameter properties).
- Validate every inputLog entry up front: non-object entries or missing / non-integer / negative issuedAtTick now bail(3) with a clear error pointing to the offending index instead of TypeErroring mid-replay.
- On replay FAIL, distinguish trustworthy reports (cluster and stuck-in-dirt, built from the captured snapshot) from caveat-laden ones (motion history, built from the divergent replayed trajectory) so a determinism regression doesn't silently misdirect debugging.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
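The up-front inputLog validation described in this commit might look roughly like the sketch below. `validateInputLog` and the message wording are hypothetical; only the rules (object entries, integer non-negative issuedAtTick, error naming the offending index) come from the commit message. The real script reportedly calls bail(3), whereas this version returns the message so it can be checked in isolation.

```typescript
// Returns an error message for the first bad entry, or null when every entry passes.
function validateInputLog(inputLog: unknown): string | null {
  if (!Array.isArray(inputLog)) return 'Invalid snapshot: "inputLog" is not an array';
  for (let i = 0; i < inputLog.length; i++) {
    const entry: unknown = inputLog[i];
    if (entry === null || typeof entry !== "object" || Array.isArray(entry)) {
      return `Invalid snapshot: inputLog[${i}] is not an object`;
    }
    const tick = (entry as { issuedAtTick?: unknown }).issuedAtTick;
    // Rejects missing, non-numeric, NaN, infinite, fractional, and negative values.
    if (typeof tick !== "number" || !Number.isInteger(tick) || tick < 0) {
      return `Invalid snapshot: inputLog[${i}].issuedAtTick must be a non-negative integer (got ${String(tick)})`;
    }
  }
  return null;
}
```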

LightAxe · 2026-05-13T03:35:58Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: abd231ad6b

Per save.ts Pitfall 7, serializeWorldState preserves world.commandQueue: debug snapshots and autosaves can fire between ticks while input handlers have staged pending commands. The replay path only processes commands already drained into inputLog, so replay.commandQueue is always [] at the end. A naive byte-equality compare therefore false-positives FAIL on any snapshot captured between input and drain, even though sim determinism is intact.

Strip commandQueue from both sides before comparing. Surface the captured pending-command count in the result line so a non-zero queue is visible when reading the report.

Addresses codex P1 on PR #121.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
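A minimal sketch of the fix this commit describes. It assumes commandQueue is a top-level array field of the serialized state; `SerializedWorld`, `withoutCommandQueue`, and `compareIgnoringQueue` are hypothetical names, not identifiers from the script.

```typescript
interface SerializedWorld { commandQueue?: unknown[]; [field: string]: unknown }
interface CompareResult { pass: boolean; pendingCommands: number }

// Shallow copy of the state without commandQueue; the input is left untouched
// and the remaining keys keep their original order.
function withoutCommandQueue(state: SerializedWorld): Record<string, unknown> {
  const copy: Record<string, unknown> = {};
  for (const key of Object.keys(state)) {
    if (key !== "commandQueue") copy[key] = state[key];
  }
  return copy;
}

// Compare everything except the pending queue, and report how many commands
// were staged in the captured snapshot so the report can surface the count.
function compareIgnoringQueue(captured: SerializedWorld, replayed: SerializedWorld): CompareResult {
  const pendingCommands = Array.isArray(captured.commandQueue) ? captured.commandQueue.length : 0;
  const pass =
    JSON.stringify(withoutCommandQueue(captured)) === JSON.stringify(withoutCommandQueue(replayed));
  return { pass, pendingCommands };
}
```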

LightAxe · 2026-05-13T03:54:47Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61eb1e42f0

typeof === 'number' lets NaN, +/-Infinity, and fractional values through. Concrete harm:

- tick = Infinity → unbounded replay loop
- tick = NaN → loop skipped silently, replay reports PASS on no work
- seed = NaN → createScenario coerces via `seed >>> 0` to 0, producing a misleading "wrong scenario" replay instead of a clear error

Switch to Number.isInteger (rejects all of the above plus BigInts) and include the offending value in the error message.

Addresses codex P1 follow-up on PR #121.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
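The difference between the two checks is easy to see in isolation. The helper names and message text below are hypothetical; the behaviour of `typeof`, `Number.isInteger`, and `>>> 0` is standard JavaScript.

```typescript
// The looser check: true for NaN, ±Infinity, and fractional values.
function looseIsNumber(v: unknown): boolean {
  return typeof v === "number";
}

// The stricter check: returns an error string that includes the offending
// value, or null when the value is an integer.
function requireInteger(field: string, v: unknown): string | null {
  return Number.isInteger(v)
    ? null
    : `Invalid snapshot: "${field}" must be an integer (got ${String(v)})`;
}

// Why a NaN seed is dangerous: unsigned right shift coerces it to 0.
const coercedSeed = NaN >>> 0;

const samples: unknown[] = [NaN, Infinity, -Infinity, 1.5, "7", 91342];
const passesLoose = samples.filter(looseIsNumber).length;                        // 5 of 6
const passesStrict = samples.filter((v) => requireInteger("tick", v) === null).length; // 1 of 6
```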

LightAxe · 2026-05-13T04:06:56Z

@codex review

chatgpt-codex-connector · 2026-05-13T04:11:18Z

Codex Review: Didn't find any major issues. Swish!

Adds a section under "Building and Running" pointing future contributors (human or AI) at scripts/analyze-snapshot.ts as the standard way to dig into a debug snapshot. Names the F9 download convention, the run command (including the --transform-types flag the script requires), and what the analyzer reports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chatgpt-codex-connector Bot reviewed May 13, 2026

Comment thread scripts/analyze-snapshot.ts Outdated

Comment thread scripts/analyze-snapshot.ts Outdated

Comment thread scripts/analyze-snapshot.ts

chatgpt-codex-connector Bot reviewed May 13, 2026

Comment thread scripts/analyze-snapshot.ts Outdated

chatgpt-codex-connector Bot reviewed May 13, 2026

Comment thread scripts/analyze-snapshot.ts Outdated

LightAxe merged commit 8370950 into main May 13, 2026
1 check passed

LightAxe deleted the feat/analyze-snapshot-cli branch May 13, 2026 04:23

LightAxe mentioned this pull request May 13, 2026

End-of-game survey + opt-in play-trace upload #122

Open

LightAxe merged 5 commits into main from feat/analyze-snapshot-cli
