feat(proof-core): P1 Determinism Claims v0.1#291
Conversation
BREAKING CHANGE: The `warp-ffi` crate has been deleted. C ABI is abandoned because C's undefined behavior is incompatible with Echo's determinism guarantees. Rust plugin extension via RewriteRule trait registration and Rhai scripting replace this path. Changes: - Delete `crates/warp-ffi/` (C ABI surface for warp-core) - Remove warp-ffi from workspace Cargo.toml - Remove warp-ffi rustdoc gate from CI - Clean all warp-ffi references from docs (code-map, spec-warp-core, rust-rhai-ts-division, phase1-plan, warp-demo-roadmap, project-tour) - Fix pre-existing MD024 duplicate heading lint in rust-rhai-ts-division.md TASKS-DAG updates: - #26 Plugin ABI (C) v0 → Closed (Graveyard) - #86 C header + host loader → Closed (Graveyard) - #87 Version negotiation → Closed (Graveyard) - #88 Capability tokens → Closed (Graveyard) - #89 Example plugin + tests → Closed (Graveyard) - #38 FFI limits and validation → Closed (Graveyard) - #39 WASM input validation → Completed (was stale "In Progress") - #202 Provenance Payload → Open + full Paper III task breakdown (PP-1 through PP-6) - #231 Tumble Tower Stage 0 → Open (was stale "In Progress") Closes #26, #86, #87, #88, #89, #38
Developer CLI (P0): - Full clap 4 derive CLI with verify, bench, inspect subcommands - WSC loader bridging columnar format to GraphStore reconstruction - verify: validates WSC snapshots, recomputes state root hashes - bench: parses Criterion JSON, renders ASCII tables via comfy-table - inspect: metadata display, graph stats, --tree ASCII visualization - Global --format text|json flag for machine-readable output - Man page generation via clap_mangen in xtask Provenance Payload Spec (PP-1): - SPEC-0005 maps Paper III formalism to concrete Echo types - Defines ProvenancePayload, BoundaryTransitionRecord, ProvenanceNode, DerivationGraph - Wire format with CBOR encoding and domain separation tags - Two worked examples and attestation envelope with SLSA alignment
--bench <filter> tells cargo to find a bench target by that name. The correct form is `-- <filter>` which forwards the pattern to Criterion as a regex filter. Also fixes the misleading "Suppress benchmark stdout" comment — stdout/stderr are inherited (visible), not suppressed. Extracts build_bench_command() helper with two tests verifying the filter produces ["--", "hotpath"] and omits "--" when absent.
When --expected is provided for a multi-warp WSC file, warps beyond
warp 0 now report "unchecked" instead of falsely claiming "pass".
A stderr warning is emitted when --expected is used with >1 warps.
Removes .to_uppercase() on the Result line so text and JSON output
use consistent lowercase status values ("pass"/"fail"/"unchecked").
Updates --expected help text to document the warp-0-only limitation.
…rting - Remove `colored = "2"` from Cargo.toml (declared but never imported). - Replace .expect() in emit() with match + eprintln fallback so the CLI never panics on JSON serialization failure. - Replace unwrap_or(-1) in bench exit status with signal-aware reporting: on Unix, reports the actual signal number via ExitStatusExt::signal(); on other platforms, reports "unknown termination".
- TASKS-DAG.md: SPEC-PROVENANCE-PAYLOAD.md → SPEC-0005-provenance-payload.md (two occurrences: sub-task title and AC1). - ROADMAP backlog: same stale path in security.md AC1. - SPEC-0005 §5.2: fix domain separation tag byte counts: - echo:provenance_payload:v1\0 = 27 bytes (was 28) - echo:provenance_edge:v1\0 = 24 bytes (was 25) - echo:btr:v1\0 = 12 bytes (correct, unchanged)
Man pages for subcommands now show "echo-cli-bench", "echo-cli-verify", "echo-cli-inspect" in their .TH headers instead of bare "bench", "verify", "inspect". Overrides the clap Command name before passing to clap_mangen::Man::new(). Regenerated docs/man/*.1 via `cargo xtask man-pages`.
- bench.rs: `if let Ok` → `match` with eprintln warning on parse_estimates failure (M3). - bench.rs: guard format_duration for NaN/negative → return "N/A" (M4). - wsc_loader.rs: `unwrap_or(&[])` → `match` with eprintln warning on missing blob (M7). - wsc_loader.rs: add debug_assert!(atts.len() <= 1) at both attachment reconstruction sites (L2). - inspect.rs: BTreeMap/BTreeSet → HashMap/HashSet in count_connected_components (CLI-only, not engine) (L3). - inspect.rs: extract const TREE_MAX_DEPTH = 5 (L4). - lib.rs: remove blanket #![allow(dead_code)], add targeted #[allow(dead_code)] on output module only (L5).
- project-tour-2025-12-28.md: replace "Placeholder CLI home" with actual warp-cli subcommand descriptions (verify, bench, inspect). - ci.yml: remove blank line between warp-geom and warp-wasm rustdoc steps for consistent formatting.
- .ok() → `let _ =` for writeln! Result discard in verify.rs and inspect.rs (idiomatic explicit discard). - .or_insert(0) → .or_default() in inspect.rs type breakdown maps. - pub → pub(crate) on all structs and functions in bench.rs, inspect.rs, verify.rs, and wsc_loader.rs. These types are only used within the binary target. cli.rs types remain pub because xtask imports them via the lib target for man page generation.
Add missing CHANGELOG entries for commits 4-5 and 7-8 (doc path corrections, SPEC-0005 byte counts, man page .TH headers, visibility narrowing, project tour update, CI blank line). Update warp-cli README to note --expected applies to warp 0 only.
…anup - xtask/Cargo.toml: add version = "0.1.0" alongside path dep for warp-cli to satisfy cargo-deny's wildcard dependency check. - README.md: clarify bench description — runs benchmarks AND parses results, not just parses. - xtask man-pages: remove stale echo-cli*.1 files before regeneration so the output directory is an exact snapshot.
- SPEC-0005: fix derivation algorithm that dropped transitive causal dependencies (the backward-cone filter checked root query slot at every hop instead of accepting all frontier nodes unconditionally) - SPEC-0005: reword global_tick invariant for non-zero-start payloads - SPEC-0005: fix BTR verification to reference §5.4 hash formula instead of nonexistent `parents` field - inspect: preserve warp identity in multi-warp tree output via new `warp_index` field on TreeNode; text renderer labels per-warp trees - wsc_loader: replace debug_assert! with runtime warnings for attachment multiplicity violations (enforced in release builds) - wsc_loader: add edge-attachment and Descend-attachment roundtrip tests - verify: rename tampered_wsc_fails → tampered_wsc_does_not_panic - det-policy.yaml: remove stale warp-ffi entry (crate deleted) - phase1-plan: remove dead "Expose C ABI" reference - rust-rhai-ts-division: update CLI refs to echo-cli verify/bench/inspect - CHANGELOG: add Removed entry for warp-ffi breaking change
…ustness - Remove stale warp-ffi references from git hooks, README, AGENTS.md - Fix broken docs/specs/ → docs/spec/ paths in security.md - Change emit() to return Result<()> and propagate at all call sites - SPEC-0005: bind index var in verification, document missing-producer behavior, clarify multi-producer semantics, add cross-references, specify composition errors, formalize identity element, expand pseudocode constructors, add Paper III citation - Add is_infinite() check in format_duration() - Use usize::try_from() for safe edge_ix cast in inspect.rs - Add bench test ordering assertion and empty-results warning - Include entity IDs in wsc_loader warning messages - Fix fragile len()-1 pattern in tree formatter
Add #![allow(...)] attributes to warp-core test files to suppress pedantic and restriction clippy lints that are appropriate in test code (expect_used, unwrap_used, panic, float_cmp, cast_possible_truncation, etc.). No logic changes — only lint suppression attributes added.
Fix all clippy errors in echo-session-service, echo-session-ws-gateway,
ttd-browser, warp-benches, warp-viewer, and echo-dind-harness.
Production code fixes:
- Move items (const, enum, use) before statements
- Replace Default::default() with concrete type defaults in gpu.rs
- Use clone_from instead of reassigning clone in ws-gateway
- Inline format string variables where possible
- Replace map().unwrap_or() with map_or()
- Fix redundant closure, unnecessary struct building
- Merge identical match arms, replace wildcard enum import
- Fix unnecessary boolean negation (flip if/else branches)
- Change &Option<T> to Option<&T> in public API
- Change &mut to & where mutability not needed
- Use f32::from(u8), u64::from(u32), std::iter::once()
- Replace Debug formatting {:?} with Display (.display()) for PathBuf
Test/bench code allows:
- Add #[allow] for unwrap_used, expect_used, panic on test modules
- Add #![allow] for bench and integration test files
- Add targeted #[allow] for safe rendering casts and GPU init
These test files only compile under `det_fixed` and `golden_prng` features respectively, so they were missed during the default-feature clippy pass. - dfix64_tests.rs: allow float_cmp (exact fixed-point comparisons) - prng_golden_regression.rs: allow missing_docs (test file)
Uppercase hex input from external tools was incorrectly treated as a mismatch since state_root_hex is always lowercase from hex::encode. Addresses PR #290 review feedback.
- Mark P0 Lock the Hashes and Developer CLI as Verified - Resequence: Proof Core (P1) now gates First Light (P2) - First Light demoted from P1→P2 (prove correctness before demoing) - Time Travel / Convergence / game demos pushed to P3 - Update dependency DAG and cross-project notes
… script, roadmap updates - Add DETERMINISM_CLAIMS_v0.1.md documenting DET-001 through DET-005 - Add DET-004 (trig golden vectors) and DET-005 (torture rerun) to CLAIM_MAP.yaml - Create trig_golden_vectors.rs test with 2048-sample golden binary - Create scripts/torture-100-reruns.sh turnkey repro script - Add Trig Oracle CI gate to det-gates.yml (macOS + Linux) - Update proof-core roadmap: mark exit criteria, set milestone to In Progress
…nondeterminism Test file reads/writes golden vector binaries to verify determinism. This is test-only I/O, not simulation code.
|
Important Review skippedToo many files! This PR contains 185 files, which is 35 over the limit of 150. ⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: ⛔ Files ignored due to path filters (2)
📒 Files selected for processing (185)
You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a85788bac6
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
crates/warp-cli/src/bench.rs
Outdated
| let entries = std::fs::read_dir(criterion_dir) | ||
| .with_context(|| format!("failed to read {}", criterion_dir.display()))?; | ||
|
|
||
| for entry in entries { |
There was a problem hiding this comment.
Recursively collect Criterion estimates files
collect_criterion_results only scans one directory level under target/criterion and then looks for <child>/new/estimates.json, but Criterion commonly stores grouped benchmarks in deeper paths (e.g. from benchmark_group/BenchmarkId), so successful cargo bench runs can still yield missing or empty echo-cli bench output. This makes the new bench command unreliable for most real benchmark layouts unless the scanner walks directories recursively (or parses Criterion’s report index).
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
✅ Addressed in 0b22e53 — collect_criterion_results now walks the directory tree recursively, finding new/estimates.json at any depth. Benchmark names are derived from the path relative to the criterion root (e.g. group/bench_a). Added tests for grouped and parameterised layouts.
crates/warp-cli/src/bench.rs
Outdated
| if let Some(f) = filter { | ||
| if !bench_name.contains(f) { | ||
| continue; |
There was a problem hiding this comment.
Apply bench filter with Criterion regex semantics
The results pass re-filters benchmark names with contains, but build_bench_command forwards the same --filter value to Criterion as a regex (cargo bench ... -- <pattern>). Any valid regex filter containing anchors or metacharacters can execute benchmarks successfully yet be dropped here, producing false "no results"/partial reports. Reusing regex matching (or skipping this second filter) would keep post-processing aligned with what Criterion actually ran.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
✅ Addressed in 0b22e53 — Post-filter now uses regex::Regex instead of contains, matching Criterion's own regex semantics. Added a test verifying that anchors (^hotpath, ^coldpath_alloc$) work correctly.
The evidence policy requires bidirectional sync between CLAIM_MAP.yaml and evidence.json. Wire new claims to their CI artifact sources.
- collect_criterion_results now walks directories recursively to find grouped and parameterised benchmarks (benchmark_group, BenchmarkId) - Post-filter uses regex::Regex to match Criterion's own regex semantics instead of substring contains - Add tests: nested groups, parameterised benchmarks, regex anchors Addresses Codex review comments (P1, P2) on PR #291.
|
@codex review |
|
@coderabbitai review please 🥕 |
|
🧠 Learnings used✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7cc2b177cc
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| - **Honor the Vision**: Echo is a deterministic, multiverse-aware ECS. Consult `docs/architecture-outline.md` before touching runtime code. | ||
| - **Document Ruthlessly**: Every meaningful design choice should land in `docs/` (specs, diagrams, ADRs) or PR descriptions. | ||
| - **Docstrings Aren't Optional**: Public APIs across crates (`warp-core`, `warp-ffi`, `warp-wasm`, etc.) must carry rustdoc comments that explain intent, invariants, and usage. Treat missing docs as a failing test. | ||
| - **Docstrings Aren't Optional**: Public APIs across crates (`warp-core`, `warp-wasm`, etc.) must carry rustdoc comments that explain intent, invariants, and usage. Treat missing docs as a failing test. |
There was a problem hiding this comment.
Preserve AGENTS.md as append-only
This edit rewrites an existing line in AGENTS.md, but the repository’s append-only gate rejects any deletions/line replacements in that file (scripts/check-append-only.js errors when removed > 0 for AGENTS.md). I verified this commit against its parent with node scripts/check-append-only.js --base <parent> and it fails, so this change will block the documented pre-merge invariant check.
Useful? React with 👍 / 👎.
| - Evidence: Inferred: Epic completion depends on constituent task | ||
|
|
||
| ## [#21: Spec: Security Contexts (FFI/WASM/CLI)](https://github.com/flyingrobots/echo/issues/21) | ||
| ## [#21: Spec: Security Contexts (WASM/CLI)](https://github.com/flyingrobots/echo/issues/21) |
There was a problem hiding this comment.
This commit removes/replaces existing TASKS-DAG entries, but scripts/check-append-only.js enforces the same append-only rule for TASKS-DAG.md and fails whenever deletions are present. Running the checker against this commit’s parent reports deletions in this file, so the append-only policy gate will fail until these edits are converted to pure append operations.
Useful? React with 👍 / 👎.
Summary
trig_golden_vectors.rs) with a 2048-sample golden binary that locks downdfix64sin/cos/tan outputs across platformsscripts/torture-100-reruns.sh— turnkey repro script that runs 100 sequential simulations and asserts identical hashesdet-gates.ymlfor macOS and Linux, with log artifactsTest plan
cargo testpasses locally (all tests green)cargo clippyclean (zero warnings)det-gatesworkflow passes on GitHub Actions (macOS + Linux)