Add software-only visual replay harness #13

Merged
RMANOV merged 1 commit into main from software-sim-visual-harness on Apr 25, 2026
@RMANOV RMANOV commented Apr 25, 2026

This PR adds a public-safe software-only replay layer for STRIX scenarios.

Included:

  • deterministic kinematic scenario replay generator
  • JSON timeline evidence and self-contained HTML canvas visualizer under target/ by default
  • smoke matrix entry for replay generation
  • tests for deterministic output, public-safe paths, HTML generation, and zero-index attrition handling
  • docs for using replay as pre-field behavior evidence, not hardware/RF validation
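One way to read "deterministic output" in the list above is that the same scenario and seed always serialize to byte-identical JSON. The sketch below illustrates that idea only; `build_replay`, `replay_digest`, the frame fields, and the constant-velocity model are hypothetical and are not taken from `scripts/strix_sim_replay.py`.

```python
import hashlib
import json
import math
import random


def build_replay(seed: int, steps: int = 5, dt: float = 0.5) -> dict:
    """Hypothetical constant-velocity replay: one agent, fixed heading."""
    rng = random.Random(seed)  # local RNG, so global random state is never touched
    heading = rng.uniform(0.0, 2.0 * math.pi)
    speed = 10.0  # assumed units: metres per second
    x = y = 0.0
    frames = []
    for i in range(steps):
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
        # Round so that the serialized text is stable and compact.
        frames.append({"t": round(i * dt, 3), "x": round(x, 3), "y": round(y, 3)})
    return {"seed": seed, "frames": frames}


def replay_digest(replay: dict) -> str:
    """Hash a canonical serialization (sorted keys, fixed separators)."""
    payload = json.dumps(replay, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(replay_digest(build_replay(seed=7)))
```

A determinism test under these assumptions can compare digests from two independent runs with the same seed rather than comparing nested structures field by field.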

Validation:

  • python scripts/verify_public_surface.py -> passed
  • python -m pytest -q python/tests/test_strix_sim_replay.py python/tests/test_strix_test_matrix.py -> 13 passed
  • pytest -q -> 147 passed
  • python scripts/strix_test_matrix.py --select smoke --output target/strix-test-reports/smoke.json -> passed, 5/5
  • cargo test -p strix-optimizer doctrine_profile_accepts_neutral_aliases --lib -> passed
  • cargo check -p strix-python -> passed

@RMANOV RMANOV marked this pull request as ready for review April 25, 2026 11:39
Copilot AI review requested due to automatic review settings April 25, 2026 11:39
@RMANOV RMANOV merged commit 37891d9 into main Apr 25, 2026
10 checks passed
@RMANOV RMANOV deleted the software-sim-visual-harness branch April 25, 2026 11:39
@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8174c75686

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +392 to +394
failed = [check for check in checks if check["status"] == "failed"]
return {
    "status": "failed" if failed else "passed",
P1: Fail envelope when required metrics are missing

The envelope verdict only treats checks with status "failed" as failures, so any metric missing from metrics is marked "not_observed" but still produces an overall "passed" result. In practice, a scenario with pass_envelope entries that this replay harness does not compute will exit 0 from main() and look successful, which can hide real regression gaps in automated evidence runs.
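A minimal sketch of the stricter verdict this comment asks for, under stated assumptions: the envelope schema (`min`/`max` bounds per metric), the function name `evaluate_envelope`, the `strict` flag, and the `"incomplete"` status are all hypothetical and may not match the harness. Only the `"failed"`, `"passed"`, and `"not_observed"` statuses come from the review text.

```python
from __future__ import annotations

from typing import Any


def evaluate_envelope(
    metrics: dict[str, float],
    pass_envelope: dict[str, dict[str, float]],
    strict: bool = True,
) -> dict[str, Any]:
    """Check each envelope entry; a missing metric is never a silent pass."""
    checks: list[dict[str, Any]] = []
    for name, bounds in sorted(pass_envelope.items()):
        if name not in metrics:
            # The replay did not compute this metric at all.
            checks.append({"metric": name, "status": "not_observed"})
            continue
        value = metrics[name]
        low = bounds.get("min", float("-inf"))
        high = bounds.get("max", float("inf"))
        status = "passed" if low <= value <= high else "failed"
        checks.append({"metric": name, "status": status, "value": value})

    failed = [c for c in checks if c["status"] == "failed"]
    missing = [c for c in checks if c["status"] == "not_observed"]
    if failed or (strict and missing):
        verdict = "failed"
    elif missing:
        verdict = "incomplete"  # hypothetical status for non-strict runs
    else:
        verdict = "passed"
    return {"status": verdict, "checks": checks}
```

With `strict=True`, a scenario whose `pass_envelope` names a metric the replay never computes yields `"failed"`, so a caller that maps the verdict to an exit code would no longer exit 0 in that case.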

Comment on lines +57 to +58
if path.is_relative_to(ROOT):
    return str(path.relative_to(ROOT))
P2: Canonicalize scenario paths before public-path checks

Using path.is_relative_to(ROOT) on an unnormalized path allows parent traversal paths like --scenario ../secret/scenario.yaml to be treated as in-repo and emitted as ../secret/scenario.yaml instead of <external>/.... This breaks the public-safe path redaction guarantee by leaking parent-directory structure in replay JSON/HTML outputs.
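A hedged sketch of the canonicalization suggested here: resolve the path first, then test containment. The function name `public_path`, the explicit `root` parameter, and the choice to interpret relative inputs against the repository root are assumptions for illustration; the real helper may resolve against the working directory and also handles Windows-style paths, which this sketch omits. `Path.is_relative_to` requires Python 3.9 or later.

```python
from pathlib import Path


def public_path(path_str: str, root: Path) -> str:
    """Return a repo-relative path, or a redacted '<external>/name' form."""
    root = root.resolve()
    candidate = Path(path_str)
    if not candidate.is_absolute():
        # Assumption: relative inputs are interpreted against the repo root.
        candidate = root / candidate
    # resolve() collapses '..' segments and symlinks before the containment test.
    resolved = candidate.resolve()
    if resolved.is_relative_to(root):
        return resolved.relative_to(root).as_posix()
    # Outside the repository: expose only the final component.
    return f"<external>/{resolved.name or '.'}"
```

Under these assumptions, `../secret/scenario.yaml` resolves to a location outside `root` and is emitted as `<external>/scenario.yaml`, so the parent-directory structure does not reach the replay JSON or HTML.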

Copilot AI left a comment

Pull request overview

Adds a deterministic, software-only STRIX scenario replay generator to produce public-safe JSON timeline evidence and an optional self-contained HTML canvas visualizer for pre-field behavior inspection.

Changes:

  • Introduces scripts/strix_sim_replay.py to generate deterministic kinematic replays (JSON + optional HTML).
  • Adds pytest coverage for determinism, public-safe path redaction, HTML embedding, and zero-index attrition handling.
  • Updates docs and the public smoke test matrix to include replay generation as evidence.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| sim/scenarios/README.md | Documents generating deterministic replay evidence from public scenarios. |
| scripts/strix_sim_replay.py | Implements the replay generator, metrics/envelope evaluation, and HTML canvas visualizer. |
| python/tests/test_strix_sim_replay.py | Adds tests for deterministic output, path redaction, HTML generation, and attrition handling. |
| demo/README.md | Shows how to generate and view the replay HTML locally. |
| README.md | Introduces “Software-Only Replay” usage at the top-level documentation. |
| Project_Docs/testing/public_test_matrix.json | Adds a smoke test entry to generate replay artifacts under target/. |
| Project_Docs/testing/EVIDENCE_HARNESS.md | Documents replay generation as part of the public evidence harness workflow. |

Comment on lines +57 to +63
if path.is_relative_to(ROOT):
    return str(path.relative_to(ROOT))
if path_str.startswith("\\\\") or path_str[:3].replace("\\", "/").endswith(":/"):
    return f"<external>/{PureWindowsPath(path_str).name or '.'}"
if path.is_absolute():
    return f"<external>/{path.name or '.'}"
return path_str
Comment on lines +513 to +516
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>STRIX Software Replay - {replay['scenario']['id']}</title>
<style>
Comment on lines +392 to +394
failed = [check for check in checks if check["status"] == "failed"]
return {
    "status": "failed" if failed else "passed",

2 participants