Harden replay evidence safety checks #14

Merged
RMANOV merged 1 commit into main from fix-replay-review-findings on Apr 25, 2026

Conversation

@RMANOV RMANOV commented Apr 25, 2026

This follow-up addresses the actionable PR #13 Copilot/Codex review findings after merge.

Fixes:
- canonicalize scenario paths before public path reporting so parent traversal is redacted
- fail replay envelopes when required metrics are not observed
- HTML-escape the scenario id in the generated replay title
- keep a generic `area_coverage_pct` metric available for public replay contracts

Validation:
- `python -m pytest -q python/tests/test_strix_sim_replay.py` -> 8 passed
- `python scripts/verify_public_surface.py` -> passed
- `python scripts/strix_test_matrix.py --select smoke --output target/strix-test-reports/smoke.json` -> passed
- `pytest -q` -> 150 passed
- `cargo test -p strix-optimizer doctrine_profile_accepts_neutral_aliases --lib` -> passed
- `cargo check -p strix-python` -> passed
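For readers who have not seen the script, the path canonicalization fix can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/strix_sim_replay.py`: the two-argument signature, the `repo_root` parameter, and the repo-relative return format are assumptions. Only the `<external>/` prefix and the use of `resolve(strict=False)` come from the review thread.

```python
from pathlib import Path


def public_path(path: Path, repo_root: Path) -> str:
    """Hypothetical sketch: report repo-relative paths and redact everything else."""
    root = repo_root.resolve(strict=False)
    candidate = path if path.is_absolute() else root / path
    # Canonicalize first so ".." segments are collapsed before any reporting.
    normalized = candidate.resolve(strict=False)
    try:
        # Inside the repository: a relative, shareable path.
        return normalized.relative_to(root).as_posix()
    except ValueError:
        # Outside the repository: expose only the final path component.
        return f"<external>/{normalized.name or '.'}"
```

Under these assumptions, a scenario path such as `scenarios/../../secret/plan.yaml` is reported as `<external>/plan.yaml` instead of leaking the parent directories.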

Copilot AI review requested due to automatic review settings April 25, 2026 11:48

Copilot AI left a comment

Pull request overview

Hardens the STRIX software replay evidence pipeline by strengthening public-safety redactions and tightening pass/fail envelope behavior, ensuring generated JSON/HTML artifacts remain safe to share.

Changes:

  • Canonicalizes scenario paths before applying public_path() redaction to prevent .. traversal leaks.
  • Treats missing required envelope metrics as an envelope failure (not_observed now fails the envelope).
  • HTML-escapes the scenario id in the generated replay page <title> and ensures a generic area_coverage_pct metric is always present.
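The envelope and title changes listed above could look something like the sketch below. The function names, the `(low, high)` envelope shape, the `"pass"`/`"fail"` status strings, and the `STRIX replay:` title prefix are all hypothetical. Only the `not_observed` status and the requirement that it fails the envelope are taken from the summary.

```python
import html


def evaluate_envelope(metrics, envelope):
    """Hypothetical sketch: envelope maps metric name -> (low, high) bounds."""
    results = {}
    for name, (low, high) in envelope.items():
        if name not in metrics:
            # A required metric that was never recorded counts against the envelope.
            results[name] = "not_observed"
        elif low <= metrics[name] <= high:
            results[name] = "pass"
        else:
            results[name] = "fail"
    passed = all(status == "pass" for status in results.values())
    return passed, results


def render_title(scenario_id):
    """Escape the scenario id before embedding it in the page <title>."""
    return f"<title>STRIX replay: {html.escape(scenario_id)}</title>"
```

With this shape, an envelope that requires `area_coverage_pct` fails when the metric is absent rather than passing by omission, and a scenario id containing `<` or `&` cannot inject markup into the generated page.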

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scripts/strix_sim_replay.py | Strengthens path redaction, envelope failure rules, and HTML title escaping; adds a generic coverage metric. |
| python/tests/test_strix_sim_replay.py | Adds regression tests for traversal redaction, HTML title escaping, and missing-metric envelope failure. |

Comment on lines +67 to +69
    if normalized.is_absolute():
        return f"<external>/{normalized.name or '.'}"
    return f"<external>/{path.name or '.'}"

Copilot AI Apr 25, 2026

public_path() ends with a fallback branch that appears unreachable: candidate.resolve(strict=False) returns an absolute path, so normalized.is_absolute() will always be true and the final return f"<external>/{path.name or '.'}" won’t be hit. Consider simplifying the control flow (or adding a comment explaining why this can still happen) to avoid dead code and make the redaction logic easier to reason about.

Suggested change

    -    if normalized.is_absolute():
    -        return f"<external>/{normalized.name or '.'}"
    -    return f"<external>/{path.name or '.'}"
    +    return f"<external>/{normalized.name or '.'}"
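The reachability claim in this comment can be checked with a small standalone script. The sample inputs are illustrative and none of them need to exist on disk. This is a sanity check on POSIX-style behavior and does not cover every platform, which may be one reason to keep a defensive fallback with an explanatory comment.

```python
from pathlib import Path


def resolves_to_absolute(raw: str) -> bool:
    """Return True if resolve(strict=False) yields an absolute path for raw."""
    return Path(raw).resolve(strict=False).is_absolute()


# Illustrative inputs only: relative, parent-traversing, and bare-dot paths.
samples = ["scenarios/demo.yaml", "../outside.yaml", ".", "a/../../b"]
results = {raw: resolves_to_absolute(raw) for raw in samples}
print(results)
```

If every entry is `True` in the target environment, the `normalized.is_absolute()` test never takes the false branch and the final `return` is dead code there.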

Comment on lines 334 to 340
    base_coverage = 100.0 * alive_fraction
    metrics: dict[str, float | int] = {
        "active_agents": active_agents,
        "area_coverage_pct": round(base_coverage, 3),
        "offline_agents": total_agents - active_agents,
        "frame_count": len(frames),
        "mean_energy_remaining_pct": round(

Copilot AI Apr 25, 2026

Now that area_coverage_pct is part of the base metrics dict, the gps_denied_recon branch later in this function still re-adds the same key via metrics.update(...). Consider removing that duplicate assignment to reduce redundancy and avoid future divergence between the two values.
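The divergence risk mentioned here comes from how `dict.update` behaves: a later value for an existing key replaces the earlier one without any warning. The sketch below uses made-up values and a hypothetical `merge_metrics` helper (not part of the script) to show the overwrite and one way to surface it.

```python
def merge_metrics(base, extra):
    """Merge extra into a copy of base and report which keys were overwritten."""
    overwritten = sorted(key for key in extra if key in base)
    merged = dict(base)
    merged.update(extra)  # later values silently replace earlier ones
    return merged, overwritten


# Hypothetical values: the branch recomputes a key the base dict already holds.
base = {"active_agents": 9, "area_coverage_pct": 90.0}
branch = {"area_coverage_pct": 85.0, "hypothetical_branch_metric": 1}
merged, overwritten = merge_metrics(base, branch)
print(merged["area_coverage_pct"], overwritten)
```

If the two computations ever drift apart, the branch value wins and the base value is lost, so dropping the duplicate assignment keeps a single source for the metric.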

@RMANOV RMANOV merged commit c4a27f2 into main Apr 25, 2026
14 checks passed
@RMANOV RMANOV deleted the fix-replay-review-findings branch April 25, 2026 11:54