Skip to content

Keep evidence harness reports public-safe#9

Merged
RMANOV merged 1 commit into
mainfrom
fix-evidence-harness-public-safety
Apr 25, 2026
Merged

Keep evidence harness reports public-safe#9
RMANOV merged 1 commit into
mainfrom
fix-evidence-harness-public-safety

Conversation

@RMANOV
Copy link
Copy Markdown
Owner

@RMANOV RMANOV commented Apr 25, 2026

Follow-up to Copilot review on PR #8.\n\nIncluded:\n- use public-safe path labels in scenario contract reports and error strings\n- avoid leaking external absolute paths in generated evidence reports\n- reject empty matrix command lists during matrix loading\n- make run_entry return a failed result for empty command lists instead of risking a subprocess crash\n- add focused regression tests for both cases\n\nValidation:\n- python scripts/verify_public_surface.py -> passed\n- pytest -q -> 138 passed\n- pytest -q python/tests/test_strix_test_matrix.py python/tests/test_strix_scenario_contract.py -> 15 passed\n- python scripts/strix_scenario_contract.py --scenario-dir /tmp/does-not-exist --output /tmp/strix-missing-scenario-dir-public-safe.json -> exits nonzero with public-safe failure report\n- python scripts/strix_test_matrix.py --select smoke --output /tmp/strix-smoke-public-safe.json -> passed\n- cargo test -p strix-optimizer doctrine_profile_accepts_neutral_aliases --lib -> passed\n- cargo check -p strix-python -> passed

Copilot AI review requested due to automatic review settings April 25, 2026 07:46
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens the evidence-harness Python scripts to keep generated reports “public-safe” by avoiding leakage of local absolute paths and by rejecting/handling empty command lists more robustly.

Changes:

  • Sanitize scenario contract report paths (and directory error strings) using a public-safe path formatter.
  • Reject empty command arrays when loading a test matrix and fail run_entry cleanly for empty commands.
  • Add regression tests for empty-command handling and for path-safety in scenario contract reports.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
scripts/strix_test_matrix.py Rejects empty command lists at load time; makes run_entry fail safely for empty commands.
scripts/strix_scenario_contract.py Introduces public_path() and uses it in scenario/directory reports and directory error strings.
python/tests/test_strix_test_matrix.py Adds regression tests for rejecting empty matrix commands and for run_entry behavior.
python/tests/test_strix_scenario_contract.py Adds assertions that reports/error strings do not contain the local temp directory path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 136 to 138
return {
"path": str(path.relative_to(ROOT) if path.is_relative_to(ROOT) else path),
"path": public_path(path),
"scenario_id": scenario_id,
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b962b10d22

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

status = "failed" if errors else "passed"
return {
"path": str(path.relative_to(ROOT) if path.is_relative_to(ROOT) else path),
"path": public_path(path),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Sanitize parse-error paths in scenario reports

The new public-safe path handling is only applied in the normal return path, so malformed/unreadable scenario files can still leak absolute local paths via the early exception return in validate_scenario ("path": str(path)) and its raw exception text. In practice, running the contract against external directories with a bad YAML or read error will still publish host-specific filesystem details, which contradicts the public-safe reporting goal of this change. Please ensure the exception branch also uses public_path(path) (and avoid path-bearing raw exception strings).

Useful? React with 👍 / 👎.

@RMANOV RMANOV merged commit b745370 into main Apr 25, 2026
14 checks passed
@RMANOV RMANOV deleted the fix-evidence-harness-public-safety branch April 25, 2026 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants