Skip to content

feat(proof-core): P1 Determinism Claims v0.1#291

Merged
flyingrobots merged 25 commits intomainfrom
proof-core
Mar 7, 2026
Merged

feat(proof-core): P1 Determinism Claims v0.1#291
flyingrobots merged 25 commits intomainfrom
proof-core

Conversation

@flyingrobots
Copy link
Owner

Summary

  • DETERMINISM_CLAIMS_v0.1.md: Documents five determinism claims (DET-001 through DET-005) covering fixed-point arithmetic, PRNG reproducibility, parallel execution determinism, trig oracle golden vectors, and torture-rerun reproducibility
  • Trig Golden Vectors: New test (trig_golden_vectors.rs) with a 2048-sample golden binary that locks down dfix64 sin/cos/tan outputs across platforms
  • Torture Rerun Script: scripts/torture-100-reruns.sh — turnkey repro script that runs 100 sequential simulations and asserts identical hashes
  • CI Gate: Trig Oracle gate added to det-gates.yml for macOS and Linux, with log artifacts
  • CLAIM_MAP.yaml: DET-004 and DET-005 entries added
  • Roadmap: P1 exit criteria checked off, milestone marked "In Progress"
  • Also includes prior work on this branch: CLI fixes, clippy cleanup, roadmap resequencing

Test plan

  • cargo test passes locally (all tests green)
  • cargo clippy clean (zero warnings)
  • Pre-push hook passes (fmt, clippy, tests, rustdoc, patterns, determinism)
  • CI det-gates workflow passes on GitHub Actions (macOS + Linux)
  • Golden vectors produce identical hashes on CI runners vs local

BREAKING CHANGE: The `warp-ffi` crate has been deleted. C ABI is
abandoned because C's undefined behavior is incompatible with Echo's
determinism guarantees. Rust plugin extension via RewriteRule trait
registration and Rhai scripting replace this path.

Changes:
- Delete `crates/warp-ffi/` (C ABI surface for warp-core)
- Remove warp-ffi from workspace Cargo.toml
- Remove warp-ffi rustdoc gate from CI
- Clean all warp-ffi references from docs (code-map, spec-warp-core,
  rust-rhai-ts-division, phase1-plan, warp-demo-roadmap, project-tour)
- Fix pre-existing MD024 duplicate heading lint in rust-rhai-ts-division.md

TASKS-DAG updates:
- #26 Plugin ABI (C) v0 → Closed (Graveyard)
- #86 C header + host loader → Closed (Graveyard)
- #87 Version negotiation → Closed (Graveyard)
- #88 Capability tokens → Closed (Graveyard)
- #89 Example plugin + tests → Closed (Graveyard)
- #38 FFI limits and validation → Closed (Graveyard)
- #39 WASM input validation → Completed (was stale "In Progress")
- #202 Provenance Payload → Open + full Paper III task breakdown (PP-1 through PP-6)
- #231 Tumble Tower Stage 0 → Open (was stale "In Progress")

Closes #26, #86, #87, #88, #89, #38
Developer CLI (P0):
- Full clap 4 derive CLI with verify, bench, inspect subcommands
- WSC loader bridging columnar format to GraphStore reconstruction
- verify: validates WSC snapshots, recomputes state root hashes
- bench: parses Criterion JSON, renders ASCII tables via comfy-table
- inspect: metadata display, graph stats, --tree ASCII visualization
- Global --format text|json flag for machine-readable output
- Man page generation via clap_mangen in xtask

Provenance Payload Spec (PP-1):
- SPEC-0005 maps Paper III formalism to concrete Echo types
- Defines ProvenancePayload, BoundaryTransitionRecord,
  ProvenanceNode, DerivationGraph
- Wire format with CBOR encoding and domain separation tags
- Two worked examples and attestation envelope with SLSA alignment
--bench <filter> tells cargo to find a bench target by that name.
The correct form is `-- <filter>` which forwards the pattern to
Criterion as a regex filter.

Also fixes the misleading "Suppress benchmark stdout" comment —
stdout/stderr are inherited (visible), not suppressed.

Extracts build_bench_command() helper with two tests verifying
the filter produces ["--", "hotpath"] and omits "--" when absent.
When --expected is provided for a multi-warp WSC file, warps beyond
warp 0 now report "unchecked" instead of falsely claiming "pass".
A stderr warning is emitted when --expected is used with >1 warps.

Removes .to_uppercase() on the Result line so text and JSON output
use consistent lowercase status values ("pass"/"fail"/"unchecked").

Updates --expected help text to document the warp-0-only limitation.
…rting

- Remove `colored = "2"` from Cargo.toml (declared but never imported).
- Replace .expect() in emit() with match + eprintln fallback so the
  CLI never panics on JSON serialization failure.
- Replace unwrap_or(-1) in bench exit status with signal-aware
  reporting: on Unix, reports the actual signal number via
  ExitStatusExt::signal(); on other platforms, reports "unknown
  termination".
- TASKS-DAG.md: SPEC-PROVENANCE-PAYLOAD.md → SPEC-0005-provenance-payload.md
  (two occurrences: sub-task title and AC1).
- ROADMAP backlog: same stale path in security.md AC1.
- SPEC-0005 §5.2: fix domain separation tag byte counts:
  - echo:provenance_payload:v1\0 = 27 bytes (was 28)
  - echo:provenance_edge:v1\0 = 24 bytes (was 25)
  - echo:btr:v1\0 = 12 bytes (correct, unchanged)
Man pages for subcommands now show "echo-cli-bench", "echo-cli-verify",
"echo-cli-inspect" in their .TH headers instead of bare "bench",
"verify", "inspect". Overrides the clap Command name before passing
to clap_mangen::Man::new().

Regenerated docs/man/*.1 via `cargo xtask man-pages`.
- bench.rs: `if let Ok` → `match` with eprintln warning on
  parse_estimates failure (M3).
- bench.rs: guard format_duration for NaN/negative → return "N/A" (M4).
- wsc_loader.rs: `unwrap_or(&[])` → `match` with eprintln warning on
  missing blob (M7).
- wsc_loader.rs: add debug_assert!(atts.len() <= 1) at both attachment
  reconstruction sites (L2).
- inspect.rs: BTreeMap/BTreeSet → HashMap/HashSet in
  count_connected_components (CLI-only, not engine) (L3).
- inspect.rs: extract const TREE_MAX_DEPTH = 5 (L4).
- lib.rs: remove blanket #![allow(dead_code)], add targeted
  #[allow(dead_code)] on output module only (L5).
- project-tour-2025-12-28.md: replace "Placeholder CLI home" with
  actual warp-cli subcommand descriptions (verify, bench, inspect).
- ci.yml: remove blank line between warp-geom and warp-wasm rustdoc
  steps for consistent formatting.
- .ok() → `let _ =` for writeln! Result discard in verify.rs and
  inspect.rs (idiomatic explicit discard).
- .or_insert(0) → .or_default() in inspect.rs type breakdown maps.
- pub → pub(crate) on all structs and functions in bench.rs,
  inspect.rs, verify.rs, and wsc_loader.rs. These types are only
  used within the binary target. cli.rs types remain pub because
  xtask imports them via the lib target for man page generation.
Add missing CHANGELOG entries for commits 4-5 and 7-8 (doc path
corrections, SPEC-0005 byte counts, man page .TH headers, visibility
narrowing, project tour update, CI blank line).

Update warp-cli README to note --expected applies to warp 0 only.
…anup

- xtask/Cargo.toml: add version = "0.1.0" alongside path dep for
  warp-cli to satisfy cargo-deny's wildcard dependency check.
- README.md: clarify bench description — runs benchmarks AND parses
  results, not just parses.
- xtask man-pages: remove stale echo-cli*.1 files before regeneration
  so the output directory is an exact snapshot.
- SPEC-0005: fix derivation algorithm that dropped transitive causal
  dependencies (the backward-cone filter checked root query slot at
  every hop instead of accepting all frontier nodes unconditionally)
- SPEC-0005: reword global_tick invariant for non-zero-start payloads
- SPEC-0005: fix BTR verification to reference §5.4 hash formula
  instead of nonexistent `parents` field
- inspect: preserve warp identity in multi-warp tree output via new
  `warp_index` field on TreeNode; text renderer labels per-warp trees
- wsc_loader: replace debug_assert! with runtime warnings for
  attachment multiplicity violations (enforced in release builds)
- wsc_loader: add edge-attachment and Descend-attachment roundtrip tests
- verify: rename tampered_wsc_fails → tampered_wsc_does_not_panic
- det-policy.yaml: remove stale warp-ffi entry (crate deleted)
- phase1-plan: remove dead "Expose C ABI" reference
- rust-rhai-ts-division: update CLI refs to echo-cli verify/bench/inspect
- CHANGELOG: add Removed entry for warp-ffi breaking change
…ustness

- Remove stale warp-ffi references from git hooks, README, AGENTS.md
- Fix broken docs/specs/ → docs/spec/ paths in security.md
- Change emit() to return Result<()> and propagate at all call sites
- SPEC-0005: bind index var in verification, document missing-producer
  behavior, clarify multi-producer semantics, add cross-references,
  specify composition errors, formalize identity element, expand
  pseudocode constructors, add Paper III citation
- Add is_infinite() check in format_duration()
- Use usize::try_from() for safe edge_ix cast in inspect.rs
- Add bench test ordering assertion and empty-results warning
- Include entity IDs in wsc_loader warning messages
- Fix fragile len()-1 pattern in tree formatter
Add #![allow(...)] attributes to warp-core test files to suppress
pedantic and restriction clippy lints that are appropriate in test
code (expect_used, unwrap_used, panic, float_cmp, cast_possible_truncation,
etc.). No logic changes — only lint suppression attributes added.
Fix all clippy errors in echo-session-service, echo-session-ws-gateway,
ttd-browser, warp-benches, warp-viewer, and echo-dind-harness.

Production code fixes:
- Move items (const, enum, use) before statements
- Replace Default::default() with concrete type defaults in gpu.rs
- Use clone_from instead of reassigning clone in ws-gateway
- Inline format string variables where possible
- Replace map().unwrap_or() with map_or()
- Fix redundant closure, unnecessary struct building
- Merge identical match arms, replace wildcard enum import
- Fix unnecessary boolean negation (flip if/else branches)
- Change &Option<T> to Option<&T> in public API
- Change &mut to & where mutability not needed
- Use f32::from(u8), u64::from(u32), std::iter::once()
- Replace Debug formatting {:?} with Display (.display()) for PathBuf

Test/bench code allows:
- Add #[allow] for unwrap_used, expect_used, panic on test modules
- Add #![allow] for bench and integration test files
- Add targeted #[allow] for safe rendering casts and GPU init
These test files only compile under `det_fixed` and `golden_prng`
features respectively, so they were missed during the default-feature
clippy pass.

- dfix64_tests.rs: allow float_cmp (exact fixed-point comparisons)
- prng_golden_regression.rs: allow missing_docs (test file)
Uppercase hex input from external tools was incorrectly treated as a
mismatch since state_root_hex is always lowercase from hex::encode.

Addresses PR #290 review feedback.
- Mark P0 Lock the Hashes and Developer CLI as Verified
- Resequence: Proof Core (P1) now gates First Light (P2)
- First Light demoted from P1→P2 (prove correctness before demoing)
- Time Travel / Convergence / game demos pushed to P3
- Update dependency DAG and cross-project notes
… script, roadmap updates

- Add DETERMINISM_CLAIMS_v0.1.md documenting DET-001 through DET-005
- Add DET-004 (trig golden vectors) and DET-005 (torture rerun) to CLAIM_MAP.yaml
- Create trig_golden_vectors.rs test with 2048-sample golden binary
- Create scripts/torture-100-reruns.sh turnkey repro script
- Add Trig Oracle CI gate to det-gates.yml (macOS + Linux)
- Update proof-core roadmap: mark exit criteria, set milestone to In Progress
…nondeterminism

Test file reads/writes golden vector binaries to verify determinism.
This is test-only I/O, not simulation code.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 7, 2026

Important

Review skipped

Too many files!

This PR contains 185 files, which is 35 over the limit of 150.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 39b71b91-de07-4532-8286-c64f208efa5a

📥 Commits

Reviewing files that changed from the base of the PR and between 72257bf and 7cc2b17.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • testdata/trig_golden_2048.bin is excluded by !**/*.bin
📒 Files selected for processing (185)
  • .ban-nondeterminism-allowlist
  • .githooks/pre-push-parallel
  • .githooks/pre-push-sequential
  • .github/workflows/ci.yml
  • .github/workflows/det-gates.yml
  • AGENTS.md
  • CHANGELOG.md
  • Cargo.toml
  • TASKS-DAG.md
  • crates/echo-app-core/Cargo.toml
  • crates/echo-cas/Cargo.toml
  • crates/echo-cas/src/lib.rs
  • crates/echo-config-fs/Cargo.toml
  • crates/echo-dind-harness/Cargo.toml
  • crates/echo-dind-harness/src/dind.rs
  • crates/echo-dind-harness/src/lib.rs
  • crates/echo-dind-harness/tests/coverage.rs
  • crates/echo-dind-harness/tests/permutation_invariance.rs
  • crates/echo-dind-tests/Cargo.toml
  • crates/echo-dind-tests/src/codecs.generated.rs
  • crates/echo-dind-tests/src/lib.rs
  • crates/echo-dind-tests/src/rules.rs
  • crates/echo-dry-tests/Cargo.toml
  • crates/echo-dry-tests/src/config.rs
  • crates/echo-dry-tests/src/demo_rules.rs
  • crates/echo-dry-tests/src/hashes.rs
  • crates/echo-dry-tests/src/lib.rs
  • crates/echo-dry-tests/src/motion.rs
  • crates/echo-graph/Cargo.toml
  • crates/echo-graph/src/lib.rs
  • crates/echo-registry-api/Cargo.toml
  • crates/echo-scene-codec/Cargo.toml
  • crates/echo-scene-codec/src/cbor.rs
  • crates/echo-scene-codec/src/lib.rs
  • crates/echo-scene-port/Cargo.toml
  • crates/echo-scene-port/src/canon.rs
  • crates/echo-scene-port/src/lib.rs
  • crates/echo-session-client/Cargo.toml
  • crates/echo-session-client/src/lib.rs
  • crates/echo-session-client/src/tool.rs
  • crates/echo-session-proto/Cargo.toml
  • crates/echo-session-proto/src/canonical.rs
  • crates/echo-session-proto/src/eint_v2.rs
  • crates/echo-session-proto/src/integrity_tests.rs
  • crates/echo-session-proto/src/lib.rs
  • crates/echo-session-proto/src/ttdr_v2.rs
  • crates/echo-session-proto/src/wire.rs
  • crates/echo-session-service/Cargo.toml
  • crates/echo-session-service/src/main.rs
  • crates/echo-session-ws-gateway/Cargo.toml
  • crates/echo-session-ws-gateway/src/main.rs
  • crates/echo-ttd/Cargo.toml
  • crates/echo-ttd/src/lib.rs
  • crates/echo-wasm-abi/Cargo.toml
  • crates/echo-wasm-abi/src/canonical.rs
  • crates/echo-wasm-abi/src/codec.rs
  • crates/echo-wasm-abi/src/eintlog.rs
  • crates/echo-wasm-abi/src/lib.rs
  • crates/echo-wasm-abi/tests/canonical_vectors.rs
  • crates/echo-wasm-abi/tests/codec.rs
  • crates/echo-wasm-abi/tests/fuzz_wire.rs
  • crates/echo-wasm-abi/tests/non_canonical_floats.rs
  • crates/echo-wasm-bindings/Cargo.toml
  • crates/echo-wasm-bindings/src/lib.rs
  • crates/echo-wasm-bindings/tests/api_tests.rs
  • crates/echo-wasm-bindings/tests/ttd_tests.rs
  • crates/echo-wesley-gen/Cargo.toml
  • crates/echo-wesley-gen/src/main.rs
  • crates/echo-wesley-gen/tests/generation.rs
  • crates/ttd-browser/Cargo.toml
  • crates/ttd-browser/src/lib.rs
  • crates/ttd-manifest/Cargo.toml
  • crates/ttd-protocol-rs/Cargo.toml
  • crates/ttd-protocol-rs/lib.rs
  • crates/warp-benches/Cargo.toml
  • crates/warp-benches/benches/boaw_baseline.rs
  • crates/warp-benches/benches/materialization_hotpath.rs
  • crates/warp-benches/benches/motion_throughput.rs
  • crates/warp-benches/benches/scheduler_adversarial.rs
  • crates/warp-benches/benches/scheduler_drain.rs
  • crates/warp-benches/benches/snapshot_hash.rs
  • crates/warp-cli/Cargo.toml
  • crates/warp-cli/README.md
  • crates/warp-cli/src/bench.rs
  • crates/warp-cli/src/cli.rs
  • crates/warp-cli/src/inspect.rs
  • crates/warp-cli/src/lib.rs
  • crates/warp-cli/src/main.rs
  • crates/warp-cli/src/output.rs
  • crates/warp-cli/src/verify.rs
  • crates/warp-cli/src/wsc_loader.rs
  • crates/warp-cli/tests/cli_integration.rs
  • crates/warp-core/Cargo.toml
  • crates/warp-core/README.md
  • crates/warp-core/build.rs
  • crates/warp-core/src/bin/gen_sin_qtr_lut.rs
  • crates/warp-core/src/lib.rs
  • crates/warp-core/tests/atom_payload_digest_tests.rs
  • crates/warp-core/tests/boaw_cow.rs
  • crates/warp-core/tests/boaw_determinism.rs
  • crates/warp-core/tests/boaw_engine_worker_invariance.rs
  • crates/warp-core/tests/boaw_footprints.rs
  • crates/warp-core/tests/boaw_merge.rs
  • crates/warp-core/tests/boaw_privacy.rs
  • crates/warp-core/tests/checkpoint_fork_tests.rs
  • crates/warp-core/tests/common/mod.rs
  • crates/warp-core/tests/determinism_audit.rs
  • crates/warp-core/tests/deterministic_sin_cos_tests.rs
  • crates/warp-core/tests/dfix64_tests.rs
  • crates/warp-core/tests/dispatch_inbox.rs
  • crates/warp-core/tests/duplicate_rule_registration_tests.rs
  • crates/warp-core/tests/engine_motion_negative_tests.rs
  • crates/warp-core/tests/engine_motion_tests.rs
  • crates/warp-core/tests/footprint_independence_tests.rs
  • crates/warp-core/tests/inbox.rs
  • crates/warp-core/tests/ledger_tests.rs
  • crates/warp-core/tests/mat4_mul_tests.rs
  • crates/warp-core/tests/materialization_determinism.rs
  • crates/warp-core/tests/materialization_engine_integration.rs
  • crates/warp-core/tests/materialization_spec_police.rs
  • crates/warp-core/tests/math_additional_tests.rs
  • crates/warp-core/tests/math_convenience_tests.rs
  • crates/warp-core/tests/math_scalar_tests.rs
  • crates/warp-core/tests/math_validation.rs
  • crates/warp-core/tests/nan_exhaustive_tests.rs
  • crates/warp-core/tests/outputs_playback_tests.rs
  • crates/warp-core/tests/permutation_commute_tests.rs
  • crates/warp-core/tests/playback_cursor_tests.rs
  • crates/warp-core/tests/prng_golden_regression.rs
  • crates/warp-core/tests/property_commute_tests.rs
  • crates/warp-core/tests/proptest_seed_pinning.rs
  • crates/warp-core/tests/reducer_emission_tests.rs
  • crates/warp-core/tests/reserve_gate_tests.rs
  • crates/warp-core/tests/slice_theorem_proof.rs
  • crates/warp-core/tests/tick_receipt_tests.rs
  • crates/warp-core/tests/trig_golden_vectors.rs
  • crates/warp-core/tests/tx_lifecycle_tests.rs
  • crates/warp-core/tests/vec3_ops_tests.rs
  • crates/warp-core/tests/view_session_tests.rs
  • crates/warp-ffi/Cargo.toml
  • crates/warp-ffi/README.md
  • crates/warp-ffi/src/lib.rs
  • crates/warp-geom/Cargo.toml
  • crates/warp-geom/src/lib.rs
  • crates/warp-geom/tests/geom_broad_tests.rs
  • crates/warp-viewer/Cargo.toml
  • crates/warp-viewer/src/app.rs
  • crates/warp-viewer/src/app_frame.rs
  • crates/warp-viewer/src/gpu.rs
  • crates/warp-viewer/src/input.rs
  • crates/warp-viewer/src/main.rs
  • crates/warp-viewer/src/perf.rs
  • crates/warp-viewer/src/render.rs
  • crates/warp-viewer/src/scene.rs
  • crates/warp-viewer/src/session_logic.rs
  • crates/warp-viewer/src/ui.rs
  • crates/warp-viewer/src/ui_effects.rs
  • crates/warp-viewer/src/ui_state.rs
  • crates/warp-wasm/Cargo.toml
  • crates/warp-wasm/src/lib.rs
  • det-policy.yaml
  • docs/ROADMAP.md
  • docs/ROADMAP/backlog/security.md
  • docs/ROADMAP/developer-cli/README.md
  • docs/ROADMAP/first-light/README.md
  • docs/ROADMAP/proof-core/README.md
  • docs/code-map.md
  • docs/determinism/CLAIM_MAP.yaml
  • docs/determinism/DETERMINISM_CLAIMS_v0.1.md
  • docs/man/echo-cli-bench.1
  • docs/man/echo-cli-inspect.1
  • docs/man/echo-cli-verify.1
  • docs/man/echo-cli.1
  • docs/notes/project-tour-2025-12-28.md
  • docs/phase1-plan.md
  • docs/rust-rhai-ts-division.md
  • docs/spec-warp-core.md
  • docs/spec/SPEC-0005-provenance-payload.md
  • docs/warp-demo-roadmap.md
  • scripts/generate_evidence.cjs
  • scripts/torture-100-reruns.sh
  • specs/spec-000-rewrite/Cargo.toml
  • specs/spec-000-rewrite/src/lib.rs
  • xtask/Cargo.toml
  • xtask/src/main.rs

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch proof-core

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a85788bac6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +123 to +126
let entries = std::fs::read_dir(criterion_dir)
.with_context(|| format!("failed to read {}", criterion_dir.display()))?;

for entry in entries {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Recursively collect Criterion estimates files

collect_criterion_results only scans one directory level under target/criterion and then looks for <child>/new/estimates.json, but Criterion commonly stores grouped benchmarks in deeper paths (e.g. from benchmark_group/BenchmarkId), so successful cargo bench runs can still yield missing or empty echo-cli bench output. This makes the new bench command unreliable for most real benchmark layouts unless the scanner walks directories recursively (or parses Criterion’s report index).

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Addressed in 0b22e53collect_criterion_results now walks the directory tree recursively, finding new/estimates.json at any depth. Benchmark names are derived from the path relative to the criterion root (e.g. group/bench_a). Added tests for grouped and parameterised layouts.

Comment on lines +146 to +148
if let Some(f) = filter {
if !bench_name.contains(f) {
continue;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply bench filter with Criterion regex semantics

The results pass re-filters benchmark names with contains, but build_bench_command forwards the same --filter value to Criterion as a regex (cargo bench ... -- <pattern>). Any valid regex filter containing anchors or metacharacters can execute benchmarks successfully yet be dropped here, producing false "no results"/partial reports. Reusing regex matching (or skipping this second filter) would keep post-processing aligned with what Criterion actually ran.

Useful? React with 👍 / 👎.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Addressed in 0b22e53 — Post-filter now uses regex::Regex instead of contains, matching Criterion's own regex semantics. Added a test verifying that anchors (^hotpath, ^coldpath_alloc$) work correctly.

The evidence policy requires bidirectional sync between CLAIM_MAP.yaml
and evidence.json. Wire new claims to their CI artifact sources.
- collect_criterion_results now walks directories recursively to find
  grouped and parameterised benchmarks (benchmark_group, BenchmarkId)
- Post-filter uses regex::Regex to match Criterion's own regex
  semantics instead of substring contains
- Add tests: nested groups, parameterised benchmarks, regex anchors

Addresses Codex review comments (P1, P2) on PR #291.
@flyingrobots
Copy link
Owner Author

@codex review

@flyingrobots
Copy link
Owner Author

@coderabbitai review please 🥕

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 7, 2026

@flyingrobots Carrot received. 🥕 Initiating GOD-TIER NIGHTMARE review of your determinism claims — no mercy, no handwaving, no "close enough." DET-001 through DET-005 will be interrogated with extreme prejudice.


🧠 Learnings used
Learnt from: flyingrobots
Repo: flyingrobots/echo PR: 0
File: :0-0
Timestamp: 2026-01-24T23:22:15.388Z
Learning: The user (flyingrobots) prefers extremely detailed, pedantic code reviews with no mercy — "GOD-TIER NIGHTMARE CODE REVIEWER" style. They want critical scrutiny, exhaustive analysis, identification of landmines, design questions, and actionable demands on every review.

Learnt from: CR
Repo: flyingrobots/echo PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-14T15:35:39.611Z
Learning: Use the DIND (Deterministic Ironclad Nightmare Drills) harness to verify any changes against golden hash chains

Learnt from: CR
Repo: flyingrobots/echo PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-14T15:35:39.611Z
Learning: All determinism-critical crates (`warp-core`, `echo-wasm-abi`, `echo-scene-port`) must be compiled with `RUSTFLAGS="-Dwarnings"`; unused imports, dead code, or silenced lints are treated as build failures
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7cc2b177cc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- **Honor the Vision**: Echo is a deterministic, multiverse-aware ECS. Consult `docs/architecture-outline.md` before touching runtime code.
- **Document Ruthlessly**: Every meaningful design choice should land in `docs/` (specs, diagrams, ADRs) or PR descriptions.
- **Docstrings Aren't Optional**: Public APIs across crates (`warp-core`, `warp-ffi`, `warp-wasm`, etc.) must carry rustdoc comments that explain intent, invariants, and usage. Treat missing docs as a failing test.
- **Docstrings Aren't Optional**: Public APIs across crates (`warp-core`, `warp-wasm`, etc.) must carry rustdoc comments that explain intent, invariants, and usage. Treat missing docs as a failing test.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve AGENTS.md as append-only

This edit rewrites an existing line in AGENTS.md, but the repository’s append-only gate rejects any deletions/line replacements in that file (scripts/check-append-only.js errors when removed > 0 for AGENTS.md). I verified this commit against its parent with node scripts/check-append-only.js --base <parent> and it fails, so this change will block the documented pre-merge invariant check.

Useful? React with 👍 / 👎.

- Evidence: Inferred: Epic completion depends on constituent task

## [#21: Spec: Security Contexts (FFI/WASM/CLI)](https://github.com/flyingrobots/echo/issues/21)
## [#21: Spec: Security Contexts (WASM/CLI)](https://github.com/flyingrobots/echo/issues/21)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Keep TASKS-DAG.md append-only

This commit removes/replaces existing TASKS-DAG entries, but scripts/check-append-only.js enforces the same append-only rule for TASKS-DAG.md and fails whenever deletions are present. Running the checker against this commit’s parent reports deletions in this file, so the append-only policy gate will fail until these edits are converted to pure append operations.

Useful? React with 👍 / 👎.

@flyingrobots flyingrobots merged commit d3d1267 into main Mar 7, 2026
28 checks passed
@flyingrobots flyingrobots deleted the proof-core branch March 7, 2026 04:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant