feat: TTD Hardening Sprint S1 — Gates and Evidence Integrity #283

flyingrobots merged 26 commits into main
Conversation
- Established det-policy.yaml for path-aware CI gate triggering.
- Implemented .github/workflows/det-gates.yml with G1-G4 hardening gates.
- Added negative security tests to echo-scene-codec for decoder robustness.
- Created materialization_hotpath Criterion benchmark in warp-benches (G3).
- Added evidence generation and validation scripts to enforce artifact-backed VERIFIED claims.
- Published RELEASE_POLICY.md and ROLLBACK_TTD.md for governance.
- Updated ECHO_ROADMAP.md to track active hardening progress.
You have reached your Codex usage limits for code reviews.

Important: Review skipped. This PR was authored by the user configured for CodeRabbit reviews; by default, CodeRabbit skips reviewing PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.
Walkthrough

Adds a multi-stage determinism gating system: a new GitHub Actions workflow classifies changed files against det-policy, conditionally runs gated CI jobs (determinism, static analysis, security, perf, repro, build reproducibility), collects and uploads artifacts, and generates and validates an evidence pack; includes policy, claim maps, scripts, tests, benchmarks, and docs.
Sequence Diagram(s)

sequenceDiagram
participant PR as Pull Request
participant GH as GitHub Actions
participant Classifier as classify-changes.cjs
participant Policy as det-policy.yaml
participant GateJobs as Gated Jobs
participant Artifacts as Artifact Storage
participant Evidence as generate_evidence.cjs
participant Validator as validate_claims.cjs
PR->>GH: trigger det-gates workflow
GH->>Classifier: run classify-changes job
Classifier->>Policy: load policy & match changed files
Policy-->>Classifier: return classification (run_full/run_reduced/run_none)
Classifier-->>GH: set job outputs/conditions
GH->>GateJobs: run enabled gate jobs (determinism, static, security, perf, repro, build-repro)
GateJobs->>Artifacts: upload artifact groups
GH->>Evidence: gather artifacts and invoke generator
Evidence->>Validator: produce and validate evidence.json
Validator-->>GH: PASS / FAIL
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ Passed checks (4 passed)
@coderabbitai review please 🥕
Expect exhaustive analysis on:

No stone unturned, no landmine undetected. 🥕

🧠 Learnings used. ✅ Actions performed: review triggered.
Actionable comments posted: 25
🤖 Fix all issues with AI agents
In @.github/workflows/det-gates.yml:
- Around line 150-167: The build-repro job currently runs "cargo check" which
only type-checks and produces no artifacts (see job name build-repro and
wasm-check.log), so either rename the job to reflect compilation-only or
implement true reproducibility: change the steps to run a deterministic build
(e.g., cargo build --release --target wasm32-unknown-unknown) twice in isolated
steps, capture the built artifacts (the target/.../wasm files) into distinct
logs/artifact paths, and add a comparison step (e.g., sha256sum both outputs or
diff the files) that fails the job if hashes differ; also update the Upload
build artifacts step to upload the actual build outputs rather than
wasm-check.log.
- Around line 169-187: The validate-evidence job (validate-evidence) currently
runs ./scripts/generate_evidence.cjs and ./scripts/validate_claims.cjs but never
downloads upstream artifacts, so referenced artifact names like
det-linux-artifacts and sec-artifacts may be missing; update the job to use
actions/download-artifact to fetch the expected artifacts (e.g.,
det-linux-artifacts, sec-artifacts) before running validation, and add a
verification step that checks the downloaded files/directories exist (return
non-zero or fail the job) so scripts/validate_claims.cjs can assert the
artifacts truly exist rather than only validating JSON structure.
- Around line 38-46: The run block interpolates `${{ github.base_ref }}`
directly into shell commands (`git fetch origin ${{ github.base_ref }}
--depth=1` and `git diff --name-only origin/${{ github.base_ref }}...HEAD >
changed.txt`), which allows script injection via crafted branch names; fix by
exposing the value as an environment variable (e.g., `BASE_REF`) in the workflow
`env` and then reference it inside the run block as a shell variable (e.g.,
"$BASE_REF") so the shell treats it as data, and ensure you always quote the
variable when used in `git fetch`/`git diff` to avoid word-splitting or
globbing.
In `@crates/echo-scene-codec/src/cbor.rs`:
- Around line 1029-1044: The test reject_invalid_enum_tags currently checks
invalid tags for NodeShape and EdgeStyle but misses the identical invalid-tag
paths in decode_projection_kind and decode_label_anchor; update that test (or
add a new one) to also encode an out-of-range u8 (e.g., 2) and assert an Err
from decode_projection_kind(&mut Decoder::new(&buf)) and likewise for
decode_label_anchor(&mut Decoder::new(&buf)) so both decoder functions are
explicitly covered for invalid enum tags.
- Around line 986-1015: The test reject_exceeding_max_ops allocates and encodes
MAX_OPS+1 SceneOp entries unnecessarily; change it to only build the minimal
CBOR header and ops-array-length prefix so the decoder's early check triggers
without encoding the ops body. In the reject_exceeding_max_ops test replace the
loop and heavy Vec usage with writing the version, session, cursor, epoch, and
then Encoder::array((MAX_OPS + 1) as u64) (i.e. the ops array header) and skip
calling encode_scene_op or pushing SceneOp entries, then call decode_scene_delta
and assert the same error; keep references to MAX_OPS, Encoder::array,
decode_scene_delta, and the test name to locate the change.
In `@crates/warp-benches/benches/materialization_hotpath.rs`:
- Around line 19-30: The benchmarks currently clone the 64-byte Vec payload
inside the hot loop (e.g., in the closure for materialization_emit_log_1000
calling bus.emit with payload.clone()), which measures allocation cost rather
than pure MaterializationBus::emit throughput; fix by removing payload.clone()
from the inner loop and either pre-allocating a reusable payload (reuse the same
Vec or pre-create a Vec/Bytes per iteration outside the timing), or switch the
payload type to bytes::Bytes so clone is a cheap refcount bump, and apply the
same change to the other three benchmarks that use payload.clone(); if you
intentionally want to include allocation cost, instead add a comment documenting
that the benchmark measures allocation+emit.
In `@det-policy.yaml`:
- Around line 128-143: The warp-benches entry currently classed as
DET_NONCRITICAL has no gates; update the warp-benches block (the "warp-benches"
YAML mapping) to include required_gates: [G3] so changes to the benchmark crate
trigger a performance validation (e.g., add required_gates: ["G3"] under the
warp-benches key while keeping other fields like class and owner_role intact).
- Around line 7-15: The policy omits G3 (Performance) from all tiers so
classify_changes.cjs will never trigger the G3 gate; either add G3 to the
required_gates for the appropriate classifications (e.g., include G3 in
DET_CRITICAL required_gates) if G3 should be path-triggered, or mark G3 as
always-run by adding an explanatory comment to det-policy.yaml and update the CI
workflow to run the G3 job unconditionally (and update RELEASE_POLICY.md to
reflect which choice); adjust DET_CRITICAL/DET_IMPORTANT/DET_NONCRITICAL
required_gates or CI job configuration and corresponding docs (referencing
DET_CRITICAL, DET_IMPORTANT, DET_NONCRITICAL, G3, classify_changes.cjs, and
RELEASE_POLICY.md) accordingly.
- Around line 145-153: classify_changes.cjs currently ignores
require_full_classification and defaults unmatched files to DET_NONCRITICAL;
update classify_changes.cjs to read require_full_classification from
det-policy.yaml and fail/emit an error (non-zero exit) when a changed file/crate
has no explicit classification (i.e., no matching crates entry) instead of
defaulting to DET_NONCRITICAL; also have validate_det_policy.cjs load
deny_patterns from det-policy.yaml and either validate that
ban-nondeterminism.sh and ban-unordered-abi.sh consume those patterns or fail
the policy check if deny_patterns exist but are not enforced; finally refactor
ban-nondeterminism.sh and ban-unordered-abi.sh to accept deny_patterns (e.g.,
via env var or a generated file) instead of hardcoded crate lists, or remove
deny_patterns from det-policy.yaml if you choose not to support dynamic deny
pattern enforcement.
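The enforcement described in the bullet above can be sketched as follows. This is a minimal, hypothetical sketch: the `classify` function name, the simplified prefix-glob matching, and the policy object shape (a crates map with `class` and `paths`) are assumptions; only `require_full_classification` and the `DET_NONCRITICAL` default come from the review.

```javascript
// Hypothetical sketch: fail on unclassified files when the policy sets
// require_full_classification, instead of defaulting to DET_NONCRITICAL.
function classify(file, crates, requireFull) {
  for (const info of Object.values(crates)) {
    // Treat 'crates/foo/**' as the prefix 'crates/foo/' (simplified glob).
    if (info.paths.some((p) => file.startsWith(p.replace('**', '')))) {
      return info.class;
    }
  }
  if (requireFull) {
    // Non-zero exit in CI comes from the caller catching this error.
    throw new Error(`Unclassified file under require_full_classification: ${file}`);
  }
  return 'DET_NONCRITICAL';
}
```

The caller (the CLI entrypoint) would catch the error, print it, and set a non-zero exit code, so CI fails with a clear message rather than silently under-gating.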
In `@docs/determinism/CLAIM_MAP.yaml`:
- Around line 19-31: CLAIM_MAP.yaml currently defines SEC-001 and SEC-002 but is
missing governance claims for the remaining negative test controls referenced by
sec-claim-map.json; update CLAIM_MAP.yaml to add SEC-003, SEC-004, and SEC-005
(or a consolidated broader claim) that correspond to "truncated payload
rejection", "bad version handling", and "invalid enum tag rejection" so
validate_claims.cjs finds a claim for each mapped test control (use the exact
claim IDs SEC-003..SEC-005 and include statement and required_evidence entries
consistent with SEC-001/SEC-002 formats to match sec-claim-map.json mappings).
- Around line 5-10: DET-001 declares required_evidence: static_inspection but
there is no enforcement producing that artifact; add a CI/static step that runs
a deterministic deny-pattern check over the echo-wasm-abi source and emits a
static_inspection artifact. Concretely, implement a script or CI job referenced
by det-policy.yaml's deny_patterns for HashMap/HashSet that greps or runs
clippy/rs-lint over the echo-wasm-abi codebase, fail the job on matches, and
upload a machine-readable report (e.g., JSON) as the static_inspection artifact
so DET-001 can be VERIFIED; tie the job to the policy entry for DET-001 so the
pipeline knows to produce the required artifact.
In `@docs/RELEASE_POLICY.md`:
- Around line 39-41: Add explicit definitions in RELEASE_POLICY.md for the gate
states "INFERRED" and "UNVERIFIED" and a clear definition of what an "approved
closeout plan" is, and specify who can approve it (e.g., reference the
owner_role from det-policy.yaml or name the exact role/team). Either add a brief
glossary section in this file or link to an existing glossary/policy, and ensure
the wording clearly distinguishes INFERRED (evidence-based inference without
direct verification) from UNVERIFIED (no supporting evidence) and states the
approval authority and required contents of a valid closeout plan.
In `@package.json`:
- Line 17: Update the js-yaml dependency floor version to ensure the vulnerable
4.1.0 isn't used; in package.json change the dependency entry "js-yaml":
"^4.1.0" to "js-yaml": "^4.1.1" so installations will require at least 4.1.1
(which contains the CVE-2025-64718 fix).
In `@scripts/classify_changes.cjs`:
- Around line 15-22: Inside classifyChanges, replace the direct process.exit(1)
call with a thrown Error (or return a rejected Promise) so the function is
testable and calling code can decide exit behavior; specifically throw a new
Error that includes policyPath context instead of calling process.exit. Also add
an existence check for changedFilesPath (fs.existsSync or try/catch around
fs.readFileSync) before reading it and throw a descriptive Error (including
changedFilesPath) if it's missing or unreadable so CI gets a clear failure
message; update callers of classifyChanges to handle the thrown errors
appropriately.
- Around line 6-13: The matches function currently corrupts the '**' replacement
when later replacing '*' — update matches(file, pattern) to protect '**' first
by replacing it with a unique placeholder (e.g., '__GLOB_DBL__'), then escape
dots and replace single '*' with '[^/]*', and finally replace the placeholder
with '.*' to restore the recursive directory semantics; ensure you still wrap
the result with '^' and '$' and construct the RegExp from the final string.
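The placeholder approach from the last bullet can be sketched like this. It is a simplified version of the reviewed `matches(file, pattern)` helper; it only handles `*`, `**`, and literal dots, and a real glob library would cover more metacharacters.

```javascript
// Protect '**' with a placeholder BEFORE replacing single '*', so the
// recursive-glob semantics are not corrupted by the single-star rule.
function matches(file, pattern) {
  const regexStr = pattern
    .replace(/\*\*/g, '__GLOB_DBL__') // protect recursive glob first
    .replace(/\./g, '\\.')            // escape literal dots
    .replace(/\*/g, '[^/]*')          // single '*' must not cross '/'
    .replace(/__GLOB_DBL__/g, '.*');  // restore '**' as match-anything
  return new RegExp('^' + regexStr + '$').test(file);
}
```

With this ordering, `crates/**` matches `crates/foo/src/lib.rs`, while `*.md` matches `README.md` but not `docs/a.md`.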
In `@scripts/generate_evidence.cjs`:
- Around line 6-28: The claims array currently unconditionally sets DET-002 and
SEC-001 to 'VERIFIED'; update generate_evidence.cjs so each claim's status is
computed instead: accept gate outcomes (e.g. inputs or environment vars like
determinismOutcome, securityOutcome, performanceOutcome, reproducibilityOutcome)
or probe artifact existence for each artifact_name (e.g. check the artifact via
the CI API or filesystem) and set status to 'VERIFIED' only if the corresponding
gate succeeded or the artifact exists, otherwise set to 'FAILED' or 'SKIPPED';
also add entries for the missing G3 and G4 claims (e.g. IDs for Performance and
Build Reproducibility) using the same conditional status logic and reuse the
existing workflow, runId, commitSha and artifact_name symbols when building each
evidence object.
- Line 3: The import const path = require('path') is unused dead code; either
remove that require statement or actually use the path module when constructing
the output filename (e.g., use path.join to build the path to evidence.json).
Locate the require('path') binding named path in the top of
scripts/generate_evidence.cjs and either delete that line or replace direct
string file paths with path.join(...) when writing evidence.json so the path
module is meaningfully used.
- Around line 5-42: The artifactsDir parameter is accepted by generateEvidence
(and populated from process.argv[2]) but ignored when writing output; update
generateEvidence to use artifactsDir to build the output path (e.g., join
artifactsDir with 'evidence.json') instead of the hardcoded 'evidence.json', and
call fs.writeFileSync with that path; ensure you reference the generateEvidence
function, the artifactsDir parameter, and the fs.writeFileSync invocation when
making the change (optionally ensure the directory exists before writing), or
alternatively remove artifactsDir from the signature and callers if you decide
not to support a custom output directory.
In `@scripts/validate_claims.cjs`:
- Around line 15-25: The validator currently only checks presence of
requiredFields on claim.evidence (in the data.claims loop) but not their
semantic validity; update the validation inside the loop for VERIFIED claims to
(1) enforce commit_sha matches /^[0-9a-f]{40}$/ and reject the literal "local"
when CI/GITHUB_SHA is expected, (2) enforce run_id is numeric in CI contexts (or
reject "local"), (3) ensure workflow and artifact_name are non-empty and not the
default "local"/placeholder, and (4) push descriptive violations to violations
array when these checks fail; use the existing data.claims, claim.evidence and
requiredFields references to locate and extend the checks.
- Around line 15-25: Validate that data.claims is an array before iterating: add
an Array.isArray(data.claims) guard around the loop (or early-return/push a
violation if it's not an array) so that for (const claim of data.claims) only
runs on real arrays; when skipping/flagging malformed input, reference
requiredFields, violations, claim.id and claim.evidence semantics so downstream
checks remain the same.
- Around line 5-8: The current check in scripts/validate_claims.cjs that handles
a missing evidence file (the if (!fs.existsSync(evidenceFile)) block) should
fail hard instead of returning true; change the behavior to log an error
(process.stderr or console.error) and return false so CI fails when
evidence.json is absent. If you want configurability, add a --strict flag to the
validation entrypoint (parse args in the main function) and only allow a
non-fatal warning when strict mode is explicitly disabled; otherwise default to
the hard-fail behavior.
In `@scripts/validate_det_policy.cjs`:
- Around line 55-59: validateDetPolicy is only invoked when the script is run as
main and is not exported, preventing importing for tests or reuse; modify
scripts/validate_det_policy.cjs to export the validator (e.g., attach
validateDetPolicy to module.exports or export it as a named export) so other
modules and unit tests can require/import validateDetPolicy, while keeping the
existing if (require.main === module) CLI invocation intact.
- Around line 32-44: Validate that each crate entry has an explicit class and a
well-formed paths array before using them: inside the for loop iterating over
Object.entries(crates) check that crateInfo.class is defined and non-empty and
if missing emit a clear error like "Crate <name> missing class" and return false
(instead of the current "unknown class undefined" path), then validate
crateInfo.paths exists and Array.isArray(crateInfo.paths) (and optionally
non-empty) and if not emit a clear error like "Crate <name> missing or invalid
paths" and return false; keep the existing DET_CRITICAL owner_role check
(policy.require_owners_for_critical && cls === 'DET_CRITICAL') but only after
the class presence check so you reference a valid cls.
- Around line 24-29: The loop that checks classes for presence of required_gates
(iterating over Object.entries(classes)) does not validate that each gate name
is one of the allowed gates, so typos like G5 slip through; update the
validation inside that loop to iterate each value in classInfo.required_gates
and ensure it exists in a canonical set (e.g., ALLOWED_GATES = new
Set(["G1","G2","G3","G4"])) or by cross-referencing defined gates, and if any
unknown gate is found log an error (include className and invalid gate name) and
return false; keep the existing missing-key check but add this membership check
for required_gates to the same block.
In `@sec-claim-map.json`:
- Around line 1-30: The sec-claim-map.json file is placed at the repo root while
its companion governance file CLAIM_MAP.yaml lives in docs/determinism/, which
makes consumers hunt for paired artifacts and loses the ability to annotate when
using JSON; move sec-claim-map.json into the docs/determinism/ directory
alongside CLAIM_MAP.yaml (or rename/migrate it to sec-claim-map.yaml) and update
any tooling/paths that reference sec-claim-map.json; additionally add a
top-level cross-reference comment/field in the claim map (or a short README)
pointing to det-policy.yaml and CLAIM_MAP.yaml so consumers can find both
artifacts and preserve annotations if you convert to YAML.
- Hardened CI workflow against script injection.
- Implemented actual reproducible WASM build checks (G4).
- Added static inspection job for DET-001 (HashMap usage).
- Optimized materialization benchmarks by removing allocation cost from hot loops.
- Expanded CBOR decoder negative tests for missing enum tags and MAX_OPS check.
- Updated evidence generation to verify artifact existence and support conditional claims.
- Formalized gate states and closeout plan definitions in RELEASE_POLICY.md.
- Improved script validation logic and docstring coverage.
- Moved sec-claim-map.json to docs/determinism/ for better alignment.
@coderabbitai review please
Zero tolerance for handwaving. Every assertion will be challenged. 🔥🥕

🧠 Learnings used. ✅ Actions performed: review triggered.
Actionable comments posted: 23
🤖 Fix all issues with AI agents
In @.github/workflows/det-gates.yml:
- Around line 62-65: Replace inline expression interpolation of github.sha and
github.run_id inside the run: shell blocks with environment variables: add env
entries (e.g., DIGEST_SHA: ${{ github.sha }}, RUN_ID: ${{ github.run_id }}) for
the steps that write artifacts/digest-table.csv and then reference those env
vars in the shell script (use $DIGEST_SHA and $RUN_ID) so both occurrences that
compute the sha256 of dind-report.json use the env-injected values rather than
${{ github.sha }}/${{ github.run_id }} inline; make the same change for the
second occurrence as well.
- Around line 120-122: The current "Create report" step unconditionally writes
{"claim_id":"DET-001","status":"PASSED"}; change it to capture the exit code and
output of ban-nondeterminism.sh (or parse its stdout/stderr) and emit a JSON
report into static-inspection.json that includes the exit_code, raw_output (or
summary), and a computed status (e.g., PASSED if exit_code==0 and no warnings,
WARN if exit_code==0 with warnings/partial matches, FAILED if exit_code!=0).
Update the step that runs ban-nondeterminism.sh to tee or save its output to a
variable/file, inspect that content for warning markers and the tool's exit
code, then write the dynamic JSON (with claim_id "DET-001") reflecting those
values instead of the hardcoded PASSED.
- Around line 115-119: DETERMINISM_PATHS in the Run determinism check step
currently only targets "crates/echo-wasm-abi", leaving other DET_CRITICAL crates
uninspected; update the workflow so DETERMINISM_PATHS includes all crates listed
as DET_CRITICAL in det-policy.yaml (e.g., warp-core, warp-wasm, warp-geom,
warp-ffi, echo-scene-port, echo-scene-codec, echo-graph, echo-ttd, etc.), or
alternatively add separate determinism check steps/claims per crate so DET-001’s
scope aligns with the policy; ensure the DETERMINISM_PATHS environment variable
(or added steps) matches det-policy.yaml entries and that the
ban-nondeterminism.sh invocation will inspect each listed path.
- Around line 213-238: The validate-evidence job currently always runs but
unconditionally fails when full-tier artifacts are absent; update its artifact
presence checks to be conditional on the classification outputs (or gate the
entire job to only run for full runs). Specifically, in the validate-evidence
job adjust the shell step that verifies gathered-artifacts to only test for
det-linux-artifacts, perf-artifacts, static-inspection, and
build-repro-artifacts when needs.classify-changes.outputs.run_full == 'true' (or
conversely use if: needs.classify-changes.outputs.run_full == 'true' on the
whole validate-evidence job) so run_reduced cases (where only decoder-security
and build-repro run) do not hard-fail.
In `@crates/echo-scene-codec/src/cbor.rs`:
- Around line 1007-1017: The test reject_invalid_version currently writes
array(5) but only two CBOR items, relying on decode_scene_delta_inner's current
ordering to catch the unsupported version; instead, after Encoder::new(&mut buf)
and encoder.array(5).unwrap() and encoder.u8(99).unwrap(), write valid dummy
encodings for the remaining four fields expected by
decode_scene_delta/decode_scene_delta_inner (use the correct types expected for
session_id, timestamp, sequence, delta — e.g., empty/zero values or empty
bytes/strings as appropriate) so the buffer contains a full 5-field array and
the decoder will fail strictly on the version check regardless of field-read
ordering. Ensure this change is made in the reject_invalid_version test that
calls decode_scene_delta so the failure is unambiguously due to the version
value.
In `@crates/warp-benches/benches/materialization_hotpath.rs`:
- Around line 27-33: The benchmark currently swallows errors by using "let _ ="
on the bus.emit(...) calls, which can hide emit failures and corrupt benchmark
validity; update the emit invocations in the materialization_hotpath benchmarks
to assert success (e.g., call unwrap() on the Result returned by bus.emit) for
all emits that feed the benchmark (leave finalization calls that are being
measured as "let _ =" if desired), locating the calls that use
bus.emit(black_box(ch), black_box(EmitKey::new(h(i as u64), 1)),
black_box(p.clone())) and replacing the silent discard with an unwrap so any Err
from bus.emit surfaces immediately.
In `@det-policy.yaml`:
- Around line 140-147: The policy mistakenly classifies "tests/**" and "e2e/**"
under the docs entry (docs.paths) as DET_NONCRITICAL which lets changes to
determinism tests bypass gates; remove "tests/**" and "e2e/**" from the
docs.paths array and add a new top-level policy entry (or update the relevant
crate-level entries) that classifies those patterns as DET_IMPORTANT or matches
the DET_CRITICAL level of the code they test (reference the docs entry and its
paths array and the echo-dry-tests entry for structure when adding the new
entry).
- Around line 76-79: The current det-policy entry for the "ci" crate marks a
broad set of paths as DET_CRITICAL (ci.class = DET_CRITICAL) — including
"scripts/**", ".github/workflows/**", "det-policy.yaml", "Makefile", and
"xtask/**" — which forces G1+G2+G3+G4 on trivial script edits; narrow the blast
radius by either (a) changing ci.class from DET_CRITICAL to a less strict class
for non-determinism-sensitive paths, or (b) splitting the single "ci" entry into
two entries (e.g., "ci" for critical items like ".github/workflows/**" and
"ci-docs" or "ci-tools" for non-critical scripts such as "scripts/**" and
"xtask/**") and assigning a lower classification to the non-critical entry;
update the paths array in det-policy.yaml accordingly so only true
determinism-sensitive files remain DET_CRITICAL.
- Around line 148-152: The crate-level required_gates override for
"warp-benches" is ignored because classify_changes.cjs only reads
crateInfo.class; update classify_changes.cjs to also read and merge
crateInfo.required_gates when matching a crate (e.g., where it reads
crateInfo.class), so that required_gates from the det-policy entry for
warp-benches (required_gates: ["G3"]) are applied, or remove the override from
det-policy and set warp-benches to DET_CRITICAL if G3 should always apply;
specifically modify the logic that processes matchedCrates / crateInfo (in
classify_changes.cjs) to append/merge crateInfo.required_gates into the computed
requiredGates instead of ignoring it.
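The merge logic from the last bullet can be sketched as follows; the `gatesFor` name and the shapes of `classes` and `crateInfo` are assumptions based on the review text, and only `required_gates` and the G3 override come from the policy discussion.

```javascript
// Sketch: union the class-level gates with any crate-level
// required_gates override instead of ignoring crateInfo.required_gates.
function gatesFor(crateInfo, classes) {
  const classGates = (classes[crateInfo.class] || {}).required_gates || [];
  const overrides = crateInfo.required_gates || [];
  return [...new Set([...classGates, ...overrides])];
}
```

With this, a `DET_NONCRITICAL` warp-benches entry carrying `required_gates: ["G3"]` still triggers the perf gate.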
In `@docs/determinism/CLAIM_MAP.yaml`:
- Around line 1-59: Add a claim for the build-repro gate and wire it into
evidence generation: create a new claim ID (e.g. REPRO-001) in CLAIM_MAP.yaml
with a clear statement like "Dual WASM build reproducibility verified" including
required_evidence types (ci_artifact, behavior_test or static_inspection as
appropriate) and owner_role, then update generate_evidence.cjs to emit that
claim when det-gates.yml's build-repro job completes (ensure the job's artifact
name build-repro-artifacts is referenced and the hash-comparison result maps to
REPRO-001); alternatively, if reproducibility is intentionally excluded, add a
documented exclusion entry in CLAIM_MAP.yaml referencing
det-gates.yml/build-repro and explain why so the evidence pipeline is explicit.
In `@docs/RELEASE_POLICY.md`:
- Around line 31-35: The release policy currently lists G3 under
production_blockers but omits it from staging_blockers, which allows performance
regressions to pass staging; update the release_policy YAML so G3 is included in
staging_blockers (i.e., change staging_blockers: [G1, G2, G4] to
staging_blockers: [G1, G2, G3, G4]) or, if omission is intentional, add a short
rationale comment in RELEASE_POLICY.md documenting why G3 is staging-optional
and the conditions that justify that trade-off.
In `@docs/ROLLBACK_TTD.md`:
- Around line 29-36: Update the revert sequence instructions to handle merge
commits: after listing the commits (e201c9b, fd98b91, ce98d80, a02ea86, 3187e6a,
6e34a77, f138b8a) add a note that if any listed commit is a merge commit the
operator must run git revert with a mainline parent flag (e.g., "git revert -m 1
<sha>") and explain to use -m 1 to pick the first parent as the mainline; ensure
this guidance is adjacent to the revert list so operators see it before running
git revert.
- Around line 73-79: The Post-Rollback Evidence Packet section lists required
contents but omits where to file them; update the "Post-Rollback Evidence Packet
(required)" section to add explicit filing instructions: specify that the packet
must be attached to the incident ticket and linked in the rollback PR
description, uploaded as a CI artifact for traceability, and added to the
incident runbook/knowledge-base entry; mention required naming conventions
(e.g., incident-<ID>-post-rollback-evidence) and retention/location guidance so
reviewers can find it easily.
In `@ECHO_ROADMAP.md`:
- Around line 6-15: The G4 entry in the roadmap is inconsistent with
RELEASE_POLICY.md: update the G4 line in ECHO_ROADMAP.md (currently "Enforce
artifact-backed VERIFIED claims and path-aware gates") to match the
RELEASE_POLICY.md definition of G4 as "Build Reproducibility" (e.g., "G4 (REP):
Reproducible deterministic build constraints validated in CI"), and move the
artifact-backed VERIFIED claims and path-aware gates text into a cross-cutting
GOV or separate bullet (or note) so evidence enforcement remains clearly
identified as a cross-cutting concern rather than the G4 description.
In `@scripts/classify_changes.cjs`:
- Around line 56-63: The loop that picks the highest class silently treats
unknown crateInfo.class values as lower priority because classPriority[cls] can
be undefined; update the logic in the block using matches(file, pattern) so it
validates cls before comparing (e.g., check that
classPriority.hasOwnProperty(cls) or classPriority[cls] !== undefined), and if
the class is unknown throw or log a hard failure (exit non-zero) mentioning the
crate and invalid class; modify the code paths around classPriority,
crateInfo.class, and maxClass so unknown classifications do not fall through
silently but instead cause an explicit error.
- Around line 67-70: The code in classifyChanges uses process.exit(1) when
requireFull && !matched which prevents the outer try/catch from handling the
error and makes tests impossible; change that process.exit(1) to throw new
Error(...) with the same message so the classifyChanges function consistently
throws on error (matching the behavior at lines where missing files use throw)
and the require.main try/catch around the call can intercept and handle/report
the error; update any unit tests or callers that relied on process.exit if
present.
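The unknown-class hardening from the first bullet can be sketched like this; the `higherClass` helper name is hypothetical, while `classPriority` and the class names come from the review.

```javascript
// Sketch: fail hard on unknown classification values instead of letting
// an undefined priority silently lose the max-class comparison.
const classPriority = { DET_NONCRITICAL: 0, DET_IMPORTANT: 1, DET_CRITICAL: 2 };

function higherClass(current, cls, crateName) {
  if (!Object.prototype.hasOwnProperty.call(classPriority, cls)) {
    throw new Error(`Crate ${crateName} has unknown class '${cls}'`);
  }
  return classPriority[cls] > classPriority[current] ? cls : current;
}
```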
In `@scripts/generate_evidence.cjs`:
- Around line 16-23: The current checkArtifact function only verifies a
directory is non-empty; update it to validate presence of required artifact
files instead: in checkArtifact(name) (using gatheredArtifactsDir and fullPath)
replace the generic readdir/length check with explicit existence checks for a
list of expected filenames (e.g., "sec-tests.log", "static-inspection.json" —
add any other required names) using fs.existsSync or fs.statSync for each target
file, return true only if all required files are present, and keep the try/catch
to return false on any filesystem errors.
- Around line 36-60: The five SEC claims (SEC-001 .. SEC-005) all use the same
directory existence check via checkArtifact('sec-artifacts'), which falsely
marks all controls VERIFIED when any artifact exists; update
generate_evidence.cjs to perform per-claim verification instead: for each claim
ID (SEC-001, SEC-002, SEC-003, SEC-004, SEC-005) replace the generic
checkArtifact call with a specific check that either looks for a dedicated
artifact file (e.g., sec-001.json / sec-001.log) or parses a central
sec-claim-map.json/test results to confirm that that particular control passed,
set status to VERIFIED only if that specific artifact/result indicates success,
and update the evidence.evidence.artifact_name to reference the specific
artifact or include parsed result metadata so each claim proves its own outcome.
- Around line 83-86: The script currently defines generateEvidence and invokes
it when run as main but does not export it, making it inconsistent with sibling
scripts like validate_det_policy.cjs and hard to test; update the file to export
the main function by assigning module.exports = { generateEvidence } (or
module.exports.generateEvidence = generateEvidence) so tests and other modules
can require and call generateEvidence; keep the existing require.main === module
invocation intact so CLI behavior is unchanged.
In `@scripts/validate_claims.cjs`:
- Around line 40-48: The current validation allows evidence.run_id === 'local'
but only flags 'local' for evidence.workflow and evidence.artifact_name in CI;
add CI-time checks for run_id and commit_sha as well. Specifically, after the
existing run_id format check and alongside the workflow/artifact_name checks
that use process.env.GITHUB_ACTIONS and push to violations, add conditions that
if evidence.run_id === 'local' or evidence.commit_sha === 'local' (and
process.env.GITHUB_ACTIONS is truthy) then push a violation like `Claim
${claim.id} has placeholder evidence ('local') in CI environment.` so VERIFIED
claims with run_id or commit_sha='local' are caught. Ensure you reference the
same violations array and use the existing logging/message wording for
consistency.
- Line 30: The filter is using a falsy check which treats 0/""/false as missing;
update the predicate in the requiredFields.filter call to explicitly check for
null/undefined (e.g., requiredFields.filter(f => evidence[f] == null) or
evidence[f] === undefined) so only absent or null values are flagged; target the
expression that constructs missing (the requiredFields.filter callback that
references evidence[f]).
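A small illustration of why the falsy predicate misfires, assuming the field list discussed above: `0`, `""`, and `false` are present values, but `!evidence[f]` reports them as missing, while `== null` flags only `null`/`undefined`.

```javascript
// Falsy check vs. explicit null/undefined check for "missing" fields.
const requiredFields = ['run_id', 'workflow', 'artifact_name', 'commit_sha'];

function missingFields(evidence) {
  // `== null` is true only for null and undefined, not for 0 or "".
  return requiredFields.filter((f) => evidence[f] == null);
}
```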
In `@scripts/validate_det_policy.cjs`:
- Around line 32-42: The loop over classInfo.required_gates assumes it's an
array but doesn't validate types, so string values iterate per-character or
empty string silently; update the validation in the iteration over
Object.entries(classes) to first assert that classInfo.required_gates is an
Array (e.g., using Array.isArray(classInfo.required_gates)); if it's missing or
not an array, log a clear error like "Class <name> required_gates must be an
array" and return false; only after that use for (const gate of
classInfo.required_gates) and keep the existing ALLOWED_GATES membership checks
to reject invalid gates.
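A hedged sketch of that type guard (the `ALLOWED_GATES` values mirror the gate names discussed in this PR; the surrounding iteration is omitted): iterating a string such as `"G1"` would otherwise yield the characters `"G"` and `"1"`.

```javascript
// Guard required_gates before iterating: non-arrays fail loudly instead of
// iterating per-character or passing silently when empty.
const ALLOWED_GATES = new Set(['G1', 'G2', 'G3', 'G4']);

function validateRequiredGates(className, requiredGates) {
  if (!Array.isArray(requiredGates)) {
    console.error(`Class ${className} required_gates must be an array`);
    return false;
  }
  return requiredGates.every((g) => ALLOWED_GATES.has(g));
}
```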
- Around line 21-24: The error/success messages hardcode filenames instead of
using the actual input path; update the console messages to use the filePath
variable: change the version check error in the block that tests data.version
!== 1 (and currently prints 'Error: Invalid version in det-policy.yaml') to
include filePath, and change the final success message (currently
'det-policy.json is valid.') to include filePath as well so both messages
reflect the actual file being validated (refer to the filePath variable and the
data.version check).
scripts/validate_claims.cjs
Outdated
```js
if (!/^\d+$/.test(String(evidence.run_id)) && evidence.run_id !== 'local') {
  violations.push(`Claim ${claim.id} has invalid run_id: ${evidence.run_id}`);
}
if (evidence.workflow === 'local' || evidence.artifact_name === 'local') {
  // Warning or violation depending on CI context
  if (process.env.GITHUB_ACTIONS) {
    violations.push(`Claim ${claim.id} has placeholder evidence ('local') in CI environment.`);
  }
}
```
run_id === 'local' passes format validation but isn't flagged in CI context.
Line 40 explicitly allows 'local' as a valid run_id. Lines 43-47 check for placeholder 'local' values on workflow and artifact_name in CI, but not on run_id or commit_sha. A VERIFIED claim with run_id: "local" and commit_sha: "deadbeef..." in GitHub Actions will pass silently. That's an evidence integrity hole — the very thing this script exists to prevent.
Proposed fix — check all four fields for placeholder values in CI
```diff
- if (evidence.workflow === 'local' || evidence.artifact_name === 'local') {
-   // Warning or violation depending on CI context
-   if (process.env.GITHUB_ACTIONS) {
-     violations.push(`Claim ${claim.id} has placeholder evidence ('local') in CI environment.`);
-   }
- }
+ if (process.env.GITHUB_ACTIONS) {
+   const placeholders = requiredFields.filter(f => evidence[f] === 'local');
+   if (placeholders.length > 0) {
+     violations.push(`Claim ${claim.id} has placeholder evidence ('local') for [${placeholders.join(', ')}] in CI environment.`);
+   }
+ }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```js
if (!/^\d+$/.test(String(evidence.run_id)) && evidence.run_id !== 'local') {
  violations.push(`Claim ${claim.id} has invalid run_id: ${evidence.run_id}`);
}
if (process.env.GITHUB_ACTIONS) {
  const placeholders = requiredFields.filter(f => evidence[f] === 'local');
  if (placeholders.length > 0) {
    violations.push(`Claim ${claim.id} has placeholder evidence ('local') for [${placeholders.join(', ')}] in CI environment.`);
  }
}
```
🤖 Prompt for AI Agents
In `@scripts/validate_claims.cjs` around lines 40 - 48, The current validation
allows evidence.run_id === 'local' but only flags 'local' for evidence.workflow
and evidence.artifact_name in CI; add CI-time checks for run_id and commit_sha
as well. Specifically, after the existing run_id format check and alongside the
workflow/artifact_name checks that use process.env.GITHUB_ACTIONS and push to
violations, add conditions that if evidence.run_id === 'local' or
evidence.commit_sha === 'local' (and process.env.GITHUB_ACTIONS is truthy) then
push a violation like `Claim ${claim.id} has placeholder evidence ('local') in
CI environment.` so VERIFIED claims with run_id or commit_sha='local' are
caught. Ensure you reference the same violations array and use the existing
logging/message wording for consistency.
…ctness
- Complete script injection hardening (event_name, sha, run_id via env vars)
- Expand DET-001 static inspection to all 14 DET_CRITICAL crates
- Make static inspection report conditional on check outcome
- Make validate-evidence artifact checks conditional on classification tier
- Promote warp-benches to DET_IMPORTANT for reduced gate coverage
- Replace process.exit(1) with throw in classify_changes.cjs
- Replace let _ = with .unwrap() on benchmark emit calls
- Update DET-001 claim statement to reflect expanded scope
@coderabbitai review please 🥕
Rate Limit Exceeded
Actionable comments posted: 9
🤖 Fix all issues with AI agents
In @.github/workflows/det-gates.yml:
- Around line 247-260: The checks for gathered-artifacts/sec-artifacts and
gathered-artifacts/build-repro-artifacts are unconditional and will fail in
DET_NONCRITICAL where neither full nor reduced jobs run; update the job to
accept run_reduced output and make those two checks conditional on either
RUN_FULL or RUN_REDUCED being true. Add env: RUN_REDUCED: ${{
needs.classify-changes.outputs.run_reduced }} and change the two directory
checks to run only inside an if [ "$RUN_FULL" = "true" ] || [ "$RUN_REDUCED" =
"true" ] ; then ... fi block so that sec-artifacts and build-repro-artifacts are
only required when full or reduced runs are expected.
- Around line 229-239: The validate-evidence job currently lists
determinism-macos in needs but never checks for det-macos-artifacts; either add
a macOS artifact presence check under the same RUN_FULL conditional that checks
det-linux-artifacts/det-win-artifacts (i.e., mirror the existing artifact
verification logic to include det-macos-artifacts) or remove determinism-macos
from the needs list if those artifacts are not required for validation; update
the validate-evidence job to reference det-macos-artifacts (or drop
determinism-macos) so the workflow correctly gates on macOS determinism outputs.
In `@CHANGELOG.md`:
- Line 29: The changelog entry header "## [0.1.3] — 2026-02-15" should not
include a final release date on an open PR; update that header to either "##
[0.1.3] — Unreleased" or remove the date portion entirely so it reads "##
[0.1.3]" until the release actually lands, ensuring the CHANGELOG.md reflects
the correct unreleased state.
- Around line 6-28: The "Unreleased" section currently lists PR `#283`
review-feedback fixes that belong with the [0.1.3] release entry; merge the
bullet points under "Unreleased" (the items referencing CI Security, Static
Inspection Scope, Static Inspection Report, Evidence Validation, Policy
Classification, Script Quality, Benchmark Correctness, Claim Map) into the
existing [0.1.3] section (or alternatively remove the [0.1.3] header and keep
everything under Unreleased until release), ensuring PR `#283` is cited once under
the correct release entry and removing the duplicate entry to follow Keep a
Changelog conventions.
In `@crates/warp-benches/benches/materialization_hotpath.rs`:
- Around line 46-58: The benchmark uses the deprecated iter_with_setup; replace
the call to b.iter_with_setup in the "materialization_finalize_log_1000"
benchmark with b.iter_batched, passing the same setup closure (the loop emitting
payloads with bus.emit using ch, EmitKey::new(h(i as u64), 1), payloads) and the
same measurement closure (calling black_box(bus.finalize())), and supply
BatchSize::PerIteration as the third argument; also add the criterion::BatchSize
import if not present so the code compiles.
In `@det-policy.yaml`:
- Around line 76-79: The current "ci" policy entry assigns DET_CRITICAL to broad
paths (including scripts/** and xtask/**) causing excessive gate runs; update
the det-policy.yaml by keeping the existing "ci" entry but restrict its paths to
only truly critical items (e.g., ".github/workflows/**", "det-policy.yaml",
"Makefile") and create a new entry (e.g., "ci-tooling") that uses DET_IMPORTANT
for tooling paths like "scripts/**" and "xtask/**" (so edits such as
generate_evidence.cjs only trigger DET_IMPORTANT checks); ensure you reference
and rename the affected policy entries ("ci", new "ci-tooling") and the severity
classes (DET_CRITICAL -> retain for workflows/policy, DET_IMPORTANT -> use for
scripts/tooling).
In `@docs/determinism/CLAIM_MAP.yaml`:
- Around line 40-45: Update the SEC-004 claim entry so the "statement" is
explicit about the enforced behavior: replace the vague "Bad version handling
enforced." with a precise guarantee such as "CBOR payloads with unrecognized
version fields are rejected with an error." Ensure the SEC-004 block (keys:
statement, required_evidence, owner_role) is updated accordingly and that any
tests/artifacts referenced under required_evidence (behavior_test, ci_artifact)
align with this explicit rejection behavior.
- Around line 12-17: Rewrite DET-002's statement to be a concrete, falsifiable
invariant: replace the vague label "Float parity Rust/JS deterministic corpus"
with a clear pass condition such as "Given the canonical deterministic test
corpus, Rust and JS implementations produce bit-identical outputs for all
float-related transformations (e.g., canonicalize_f32) and for final serialized
results"; update required_evidence to require a behavior_test that compares
byte-for-byte outputs and a ci_artifact containing the corpus and diff logs, and
reference DET-002 in the claim text to ensure the test verifies exact parity
between implementations.
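The byte-for-byte comparison that DET-002's pass condition calls for can be sketched minimally (input shapes are illustrative; the real test would feed serialized corpus outputs from both implementations):

```javascript
// Parity holds only if the two serialized outputs are bit-identical.
function bitIdentical(a, b) {
  return Buffer.compare(Buffer.from(a), Buffer.from(b)) === 0;
}
```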
In `@scripts/classify_changes.cjs`:
- Around line 74-77: The current outputs (maxClass, run_full, run_reduced,
run_none) leave run_none=false when maxClass === 'DET_NONCRITICAL', causing
validate-evidence to run while downstream jobs skip; update
scripts/classify_changes.cjs so run_none becomes true when no gates need to fire
(e.g., set run_none = (changedFiles.length === 0) || (maxClass ===
'DET_NONCRITICAL')), or alternatively add a new output (run_gates) and change
det-gates.yml to gate validate-evidence on run_gates instead of run_none; update
references to maxClass, run_full, run_reduced, run_none (or the new run_gates)
and ensure det-gates.yml validate-evidence uses the new semantics.
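A sketch of the suggested output semantics (function name and exact wiring are assumptions; the point is that `run_none` becomes true whenever no gates need to fire):

```javascript
// run_none is true when nothing changed or nothing rises above
// DET_NONCRITICAL, so validate-evidence can skip along with the gate jobs.
function computeOutputs(changedFiles, maxClass) {
  return {
    run_full: maxClass === 'DET_CRITICAL',
    run_reduced: maxClass === 'DET_IMPORTANT',
    run_none: changedFiles.length === 0 || maxClass === 'DET_NONCRITICAL',
  };
}
```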
```yaml
ci:
  class: DET_CRITICAL
  owner_role: "CI Engineer"
  paths: [".github/workflows/**", "scripts/**", "det-policy.yaml", "Makefile", "xtask/**"]
```
🧹 Nitpick | 🔵 Trivial
The ci entry's blast radius is enormous — every script edit triggers G1+G2+G3+G4.
scripts/** is under DET_CRITICAL. Adding a comment to generate_evidence.cjs fires the full determinism + security + performance + reproducibility gate suite. This was flagged as a nitpick previously, but given that this PR adds 6 new scripts under scripts/, this is now a real friction concern. Every future tweak to evidence tooling, classification logic, or documentation helpers triggers 7+ CI jobs across Linux and macOS.
Consider splitting ci into ci-critical (workflows, policy, Makefile) and ci-tooling (scripts, xtask) with DET_IMPORTANT.
🤖 Prompt for AI Agents
In `@det-policy.yaml` around lines 76 - 79, The current "ci" policy entry assigns
DET_CRITICAL to broad paths (including scripts/** and xtask/**) causing
excessive gate runs; update the det-policy.yaml by keeping the existing "ci"
entry but restrict its paths to only truly critical items (e.g.,
".github/workflows/**", "det-policy.yaml", "Makefile") and create a new entry
(e.g., "ci-tooling") that uses DET_IMPORTANT for tooling paths like "scripts/**"
and "xtask/**" (so edits such as generate_evidence.cjs only trigger
DET_IMPORTANT checks); ensure you reference and rename the affected policy
entries ("ci", new "ci-tooling") and the severity classes (DET_CRITICAL ->
retain for workflows/policy, DET_IMPORTANT -> use for scripts/tooling).
Acknowledged — Valid observation. The ci entry's DET_CRITICAL scope for scripts/** is intentionally broad: changes to evidence/classification scripts can undermine gate integrity. However, we've filed #284 (per-crate gate overrides) as a backlog item to enable finer-grained gate triggering. Splitting into ci-workflows (DET_CRITICAL) and ci-tooling (DET_IMPORTANT) is a reasonable future refinement.
@flyingrobots, understood and acknowledged. Your security rationale is sound—evidence generation and classification scripts are gate-integrity-critical, and a rogue classify_changes.cjs edit could silently neuter the entire policy enforcement stack.
The interim friction is measurable (6 new scripts added in this PR alone), but #284 (per-crate gate overrides) provides a clear path forward. One residual question: do you have a rough timeline or trigger condition for prioritizing #284? If script churn becomes acute before then, you could consider a surgical mitigation—e.g., mark only classify_changes.cjs, validate_det_policy.cjs, and validate_claims.cjs as DET_CRITICAL (the policy/gate enforcement scripts), while allowing generate_evidence.cjs and documentation/reporting helpers to drop to DET_IMPORTANT. This would cut the blast radius by ~50% while preserving gate integrity for the true enforcement boundary.
But if #284 is near-term and the current friction is tolerable, the current design is defensible. Noted for future refinement.
🧠 Learnings used
Learnt from: flyingrobots
Repo: flyingrobots/echo PR: 283
File: det-policy.yaml:136-151
Timestamp: 2026-02-15T16:30:15.709Z
Learning: In the det-policy.yaml system, the classify_changes.cjs script only reads class-level `required_gates` definitions; crate-level `required_gates` fields are not consumed by the script and should not be added to individual crate entries.
Learnt from: CR
Repo: flyingrobots/echo PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-14T15:35:39.611Z
Learning: All determinism-critical crates (`warp-core`, `echo-wasm-abi`, `echo-scene-port`) must be compiled with `RUSTFLAGS="-Dwarnings"`; unused imports, dead code, or silenced lints are treated as build failures
Learnt from: CR
Repo: flyingrobots/echo PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-14T15:35:39.611Z
Learning: Applies to crates/{warp-core,warp-ffi,warp-wasm,echo-wasm-abi,echo-scene-port}/**/lib.rs : Public APIs across crates (`warp-core`, `warp-ffi`, `warp-wasm`, etc.) must carry rustdoc comments that explain intent, invariants, and usage
Learnt from: flyingrobots
Repo: flyingrobots/echo PR: 0
File: :0-0
Timestamp: 2026-01-24T23:22:15.388Z
Learning: The user (flyingrobots) prefers extremely detailed, pedantic code reviews with no mercy — "GOD-TIER NIGHTMARE CODE REVIEWER" style. They want critical scrutiny, exhaustive analysis, identification of landmines, design questions, and actionable demands on every review.
- Escape all regex metacharacters before glob-to-regex conversion
- Add zero-test-match guard to cargo test filter steps (G1, G2)
- Tighten all claim statements to be concrete and falsifiable
- Add if: always() to all artifact upload steps for evidence preservation
- Upload both builds in build-repro for failure diagnostics
@coderabbitai review please 🥕
Rate Limit Exceeded
@coderabbitai review please 🥕
Core Attack Surface:
Every assertion scrutinized. Every edge case probed. Every assumption challenged. 🔥🥕
🧠 Learnings used
✅ Actions performed
Review triggered.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In @.github/workflows/det-gates.yml:
- Around line 93-94: The grep command in the macOS gate is matching "0 passed"
as a substring (false-positive for "10 passed" etc.); update the check that
follows the cargo test run (the command that writes det-macos.log) to match the
exact line "0 passed" instead of any occurrence — e.g., change the grep
invocation used after det-macos.log to anchor the pattern (match "^0 passed$" or
use a word-boundary/regex) so only a true "0 passed" result triggers the fatal
exit.
- Line 59: The grep guard currently uses the vulnerable substring match grep -q
"0 passed" against det-linux.log; update all occurrences of grep -q "0 passed"
to a safe match such as grep -q -E '\b0 passed\b' or grep -q "running 0 tests"
so it only triggers on an exact zero-test report (apply the change to every
instance of the grep -q "0 passed" command in the workflow, including where
det-linux.log is checked).
In `@scripts/classify_changes.cjs`:
- Line 59: The loop comparing classPriority[cls] to classPriority[maxClass]
should short-circuit when maxClass reaches 'DET_CRITICAL' to avoid wasted work;
inside the outer iteration where you currently do if (classPriority[cls] >
classPriority[maxClass]) { … }, add a check for if (maxClass === 'DET_CRITICAL')
then break/return immediately (or otherwise abort the remaining nested
file×crate×pattern iterations) so the function exits early once the highest
priority class is found.
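The early exit described above can be sketched as follows (the `classPriority` table and matcher callback are illustrative, not the script's exact internals):

```javascript
// Stop scanning files once the top class is reached; no later match can
// outrank DET_CRITICAL.
const classPriority = { DET_NONCRITICAL: 0, DET_IMPORTANT: 1, DET_CRITICAL: 2 };

function classify(files, classForFile) {
  let maxClass = 'DET_NONCRITICAL';
  for (const file of files) {
    const cls = classForFile(file);
    if (classPriority[cls] > classPriority[maxClass]) maxClass = cls;
    if (maxClass === 'DET_CRITICAL') break; // short-circuit: highest class found
  }
  return maxClass;
}
```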
Match " 0 passed" (with leading space) instead of "0 passed" to prevent false positives on "10 passed", "20 passed", etc. in all three cargo test filter guards (G1 linux, G1 macos, G2 codec).
@coderabbitai review please 🥕
Note: CodeRabbit operates as an incremental review system. If commits have already been reviewed, they won't be re-analyzed unless new changes are pushed. The review will cover any new or unreviewed changes in this PR. Given the scope (gates G1-G4, evidence integrity, CBOR security tests, benchmarks, governance docs), the review will examine:
Executing now. 🔥
🧠 Learnings used
✅ Actions performed
Review triggered.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @.github/workflows/det-gates.yml:
- Around line 10-46: The workflow jobs lack explicit timeouts; add a
timeout-minutes field to every job definition (e.g. the classify-changes job) to
avoid 6-hour GitHub default hangs — insert timeout-minutes: 5 under the
classify-changes job, set timeout-minutes: 15–20 for Rust build/test jobs and
WASM builds, timeout-minutes: 30 for Criterion/benchmark jobs, and
timeout-minutes: 5 for validate-evidence (and any short utility jobs); ensure
the timeout-minutes key is placed at the top-level of each job block (same
indentation as name and runs-on) so GitHub Actions will enforce the limits.
…acklog
- Add permissions: contents: read to det-gates.yml (least privilege)
- Make ripgrep install idempotent; gate validate-evidence on classify success
- Invoke CJS scripts via node for cross-platform portability
- Accept 'local' commit_sha sentinel in validate_claims.cjs
- Export functions from evidence scripts for unit testing (#286)
- Sharpen PRF-001 claim statement; add 5 backlog items to TASKS-DAG
- Set 0.1.3 release date; update CHANGELOG with round-4 entries
Script hardening:
- Add Array.isArray guard for required_gates in validate_det_policy.cjs
- Use explicit null/undefined check instead of falsy in validate_claims.cjs
- Fix hardcoded filenames in validate_det_policy.cjs error messages
Evidence completeness:
- Add REPRO-001 claim (G4 build reproducibility) to CLAIM_MAP.yaml
- Wire REPRO-001 into generate_evidence.cjs via build-repro-artifacts
Test robustness:
- Encode all 5 CBOR fields in reject_invalid_version to prevent false passes from decoder field-read reordering
Docs and nits:
- Document G3 staging-optional rationale in RELEASE_POLICY.md
- Add merge-commit revert guidance (-m 1) to ROLLBACK_TTD.md
- Add evidence packet filing instructions to ROLLBACK_TTD.md
- Document tests/**/e2e/** DET_NONCRITICAL rationale in det-policy.yaml
- Add payload.clone() allocation+emit comment in benchmark
Round 5 — Review Feedback Addressed (18ded17)
Fixes applied
False positive / design decision responses
@coderabbitai review please 🥕
Tip: For best results, initiate chat on the files or code changes.
Your false-positive responses contain a critical logic error that undermines the entire gate coverage model, plus several over-classifications that dilute policy precision. Zero mercy mode engaged. 🔥
✅ VERIFIED FIXES
🔥 CRITICAL FLAW: G3 "UNCONDITIONAL RUN" CLAIM IS FALSE
You stated:
This is demonstrably incorrect. Forensic evidence from perf-regression:
name: G3 perf regression (criterion)
needs: classify-changes
if: needs.classify-changes.outputs.run_full == 'true'  # ← CONDITIONAL, NOT UNCONDITIONAL
G3 ONLY runs when
Impact Analysis
Scenario: A developer modifies only
Result: Changes to the benchmark crate itself skip the performance regression gate entirely. 🚨 SeverityCRITICAL DESIGN FLAW. This is precisely the scenario G3 exists to catch: a well-intentioned "optimization" to Remediation DemandsFix det-gates.yml Line 108: - if: needs.classify-changes.outputs.run_full == 'true'
+ if: needs.classify-changes.outputs.run_full == 'true' || needs.classify-changes.outputs.run_reduced == 'true'OR reclassify OR add a Do not merge this PR until G3 coverage for benchmark changes is assured. The current state is a ticking landmine. 🔥 OVER-CLASSIFICATION: scripts/** Blanket DET_CRITICALYou justified:
Partially correct for gate-affecting scripts; wildly over-broad for utility scripts. Evidence from Gate-critical scripts (legitimately DET_CRITICAL):
Non-gate-critical scripts (should be DET_NONCRITICAL or DET_IMPORTANT at most):
Problem: Changing Demand: Split the
This is a precision vs. cost tradeoff. Every false-positive DET_CRITICAL triggers ~15 minutes of CI overhead for zero security gain.
🔥 QUESTIONABLE: xtask and Makefile as DET_CRITICAL
xtask (
Observation: This is a developer convenience CLI, not a CI gatekeeper. The actual gate logic lives in Makefile:
Observation: Developer productivity targets. Zero runtime determinism impact. The git hooks themselves ( Demand: Downgrade
🔥 MISLEADING RATIONALE: tests/** Classification
Your comment in det-policy.yaml Line 151:
Forensic evidence from
Observation: The actual E2E tests are in Demand: Either:
The current rationale is factually incorrect.
🔥 EDGE CASE: validate_det_policy.cjs Paths Validation Incomplete
Line 39 of
if (!crateInfo.paths || !Array.isArray(crateInfo.paths) || crateInfo.paths.length === 0) {
console.error(`Error: Crate ${crateName} missing or invalid paths`);
return false;
}What it validates:
What it does NOT validate:
Attack vector: crates:
malicious:
class: DET_CRITICAL
paths: ["valid/path", null, "", "../../etc/passwd"]
owner_role: "Attacker"This would pass validation because the array is non-empty, but Demand: Add per-element validation: for (const pattern of crateInfo.paths) {
if (typeof pattern !== 'string' || pattern.trim() === '') {
console.error(`Error: Crate ${crateName} contains invalid path pattern: ${JSON.stringify(pattern)}`);
return false;
}
}
🔥 MINOR: classify_changes.cjs Placeholder Collision Risk
Line 14–19 of
const regexPattern = pattern
.replace(/\*\*/g, '___DBL_STAR___')
.replace(/\*/g, '___SGL_STAR___')
.replace(/[.+?^${}()|[\]\\]/g, '\\$&')
.replace(/___SGL_STAR___/g, '[^/]*')
.replace(/___DBL_STAR___/g, '.*');Theoretical issue: If a path pattern contains the literal string
Likelihood: Extremely low (file paths don't contain triple underscores). Severity: Low (pattern matching would behave unexpectedly). Best practice: Use Unicode private-use characters or UUIDs as placeholders to guarantee no collision: const DBL_STAR = '\uE000'; // Unicode private-use area
const SGL_STAR = '\uE001';
Not a blocker, but document this as a known limitation or harden it.
VERDICT
Fixes: ACCEPTED. ✅
Blocking issues for merge:
Non-blocking but strongly recommended: Do not merge until item
🧠 Learnings used
- Make perf-regression (G3) run for all non-run_none paths, not just run_full. Ensures PRF-001 fires for DET_IMPORTANT changes like warp-benches. Corrects a gate coverage gap identified in review. - Move perf-artifacts presence check to always-required in validate-evidence job (matches new G3 unconditional scope). - Carve tests/dind* and testdata/dind/** out of docs DET_NONCRITICAL into dedicated dind-tests-root entry at DET_IMPORTANT, preventing gate evasion for DIND determinism test modifications. - Update classification comment to clarify DIND carve-out.
Round 6 — Fixes Applied (4057812)
Accepted (2 legitimate issues fixed)
Issue #1: G3 gate coverage gap — You were right, and I was wrong in my round-5 rebuttal. G3 was gated on
Fix: Changed
Issue #3: tests/dind.bats misclassification — Confirmed.
Fix: Created dedicated
Deferred to backlog (not blockers)
Issue #2: Over-classification of CI/tooling paths — Valid refinement, but over-classification is the safe direction (triggers extra gates, never fewer). Splitting
Issue #4: validate_det_policy.cjs glob validation — Good hardening idea. Per-element glob validation belongs in a follow-up PR alongside the broader #286 test coverage expansion.
Issue #5: classify_changes.cjs placeholder collision — The class names come from
@coderabbitai review please 🥕
✅ VERIFIED FIXES
Issue
…ass semantics
Address reviewer feedback with documentation fixes:
- Clarify that required_gates defines merge blockers, not workflow execution (G3 runs for all non-NONCRITICAL changes via run_none != 'true')
- Document intentional fail-safe for scripts/** as DET_CRITICAL (#284)
- Explain max-class resolution for overlapping test classifications
Round 6 — Rebuttals & Fixes (f04c529)
Thanks for the thorough review. Three blocking issues addressed below — two are rebuttals with clarifying documentation added, one is acknowledged with rationale.
Issue 1 — G3 gate coverage: By design, already unconditional
G3's
Issue 2 — Over-classification of scripts: Intentional fail-safe
Issue 3 — tests/dind misclassification: Already handled by max-class semantics
Issues 4–5: Acknowledged as nits, deferred per your suggestion.
@coderabbitai review please 🥕
🚨 CONTRADICTION DETECTED: Issue
Address all 13 code-review issues (2H, 3M, 5L, 3N):
- H1: CI cross-check claim IDs between evidence.json and CLAIM_MAP.yaml
- H2: Add DET-003 macOS parity claim + evidence wiring
- M1: Replace 20+ docs paths with ** catch-all (max-class safe)
- M2: Add concurrency block to det-gates.yml
- M3: Dynamic DETERMINISM_PATHS from det-policy.yaml via yq/jq
- L1: Strengthen enum tag test assertions with error message checks
- L2: Remove redundant rustup target add in build-repro
- L3: Fix REPRO-001 evidence type (static_inspection → hash_comparison)
- L4: Push-event empty changelist guard (defaults to full run)
- L5: Verify sec-claim-map test IDs exist in source
- N2: Flatten verbose required_evidence YAML syntax
- N3: Remove trailing whitespace in classify_changes.cjs
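The H1 cross-check in that list can be sketched as a symmetric set difference (input shapes — plain arrays of claim IDs — are assumptions for illustration):

```javascript
// Claim IDs must match both ways: every evidence claim exists in the claim
// map, and every mapped claim has evidence.
function crossCheckClaimIds(evidenceIds, claimMapIds) {
  const ev = new Set(evidenceIds);
  const map = new Set(claimMapIds);
  return {
    missingInMap: evidenceIds.filter((id) => !map.has(id)),
    missingInEvidence: claimMapIds.filter((id) => !ev.has(id)),
  };
}
```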
…timization
Fixes CI blocker and addresses CodeRabbit review feedback:
- Restore `rustup target add wasm32-unknown-unknown` in build-repro Build 1 and Build 2 steps. The dtolnay/rust-toolchain@stable action installs the wasm target for `stable`, but rust-toolchain.toml pins 1.90.0 — a different toolchain — so the target must be added per-build.
- Add timeout-minutes to all 8 det-gates.yml jobs (5-30 min) to prevent the 6-hour GitHub Actions default from burning runner time on hung jobs.
- Add early-exit in classify_changes.cjs when maxClass reaches DET_CRITICAL, guarded by !requireFull to preserve require_full_classification checking.
- Fix CHANGELOG: remove incorrect "removed redundant rustup target add" entry.
Round 8 — CI Fix + Review Feedback
CI Blocker fixed: Restored
New in round 8:
Triage of outstanding CodeRabbit items
Already resolved in prior rounds (false positives in latest triage):
Deferred to backlog (not blocking merge):
G3 clarification (re: contradiction between two "round 6" messages):
@coderabbitai review please 🥕
✅ **VERIFIED: Round 8 CI Fix + Features**

**Critical CI Blocker — RESOLVED ✅**

- rustup target restoration: Confirmed on lines 238 and 249 of det-gates.yml. Both Build 1 and Build 2 explicitly add the wasm32-unknown-unknown target. The round 7 removal was indeed incorrect given that rust-toolchain.toml pins Rust 1.90.0.
- CHANGELOG documentation: Line 87 correctly documents the restoration.

**New Round 8 Features — VERIFIED ✅**

1. timeout-minutes on all 8 jobs: ✅ All 8 jobs are protected against infinite hangs.
2. Early-exit optimization in classify_changes.cjs: ✅ Confirmed on line 59.

   ```js
   if (maxClass === 'DET_CRITICAL' && !requireFull) break;
   ```

3. CHANGELOG correction: ✅

**✅ VERIFIED: "Already Resolved" Triage Claims**

1. Static inspection outcome check — CONFIRMED ✅. Lines 159-162:

   ```yaml
   DET_OUTCOME: ${{ steps.det_check.outcome }}
   run: |
     if [ "$DET_OUTCOME" = "success" ]; then
       echo '{"claim_id": "DET-001", "status": "PASSED"}' > static-inspection.json
   ```

   Observation: The string "PASSED" is still hardcoded, BUT it is gated by the outcome check. If det_check fails, the if-block doesn't execute and PASSED is never written. This is correct by design — "PASSED" is the status value when a check succeeds, not a blind assertion. The original concern (always reporting PASSED regardless of the actual result) is resolved.

2. Upload steps `if: always()` — CONFIRMED ✅. 8 instances found (lines 85, 121, 157, 167, 190, 214, 258, 278). All critical artifact uploads execute even on job failure, ensuring evidence preservation.

3. build-repro uploads both builds — CONFIRMED ✅. Lines 263-266 upload all 4 files: hash1.txt, hash2.txt, build1.wasm, build2.wasm.

4. validate-evidence conditional on RUN_FULL — CONFIRMED ✅. Lines 289-301 use:

   ```sh
   if [ "$RUN_FULL" = "true" ]; then
     [ -d gathered-artifacts/det-linux-artifacts ] || (echo "Missing det-linux-artifacts" && exit 1)
     # ... static-inspection, det-macos-artifacts checks
   fi
   ```

   Job-level condition (line 278) ensures validate-evidence runs for all non-DET_NONCRITICAL changes.

5. Scripts invoked with node — CONFIRMED ✅. All 5 script invocations use an explicit interpreter; no direct JS/shell file execution without an interpreter.

6. Claim statements tightened — ACCEPTED ✅. CLAIM_MAP.yaml statements are falsifiable (e.g., "Reject CBOR payloads with ops.len() > MAX_OPS", "Dual WASM builds produce byte-identical outputs"). Cross-verified against test/benchmark implementations in prior rounds.

**✅ VERIFIED: Deferred Items Rationale**

All deferred items are non-blocking and appropriately tracked in the backlog.

**✅ RESOLVED: G3 Contradiction Clarification**

Your explanation: ACCEPTED. This resolves the apparent contradiction in my previous review. The sequence was: the two "Round 6" messages in the comment chain referred to different commits in the same round. No actual contradiction — just sequential fixes plus docs. My confusion stemmed from treating them as conflicting rather than sequential. Apologies for the confusion.

**🎯 FINAL VERDICT**

MERGE APPROVED. ✅ 🚀

Sprint S1 Deliverables — COMPLETE. No blocking issues remaining. All Round 8 changes verified. All prior-round "already resolved" items confirmed accurate. Deferred items appropriately tracked in the backlog. Technical debt tracked in TASKS-DAG.

Ship it. The TTD Hardening Sprint S1 machinery is production-ready. 🔥🥕
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/det-gates.yml:
- Around line 156-165: generate_evidence.cjs currently treats the presence of
the static-inspection directory as proof of DET-001 success; update it to open
and parse static-inspection.json (the artifact produced by the job that writes
{"claim_id":"DET-001","status":"PASSED"/"FAILED"}) and derive DET-001’s status
from the parsed "status" field (handle missing file, invalid JSON, or missing
field as a FAILED/unverified condition and surface an error), then use that
concrete status when creating the evidence claim instead of only checking
directory existence; ensure the code references the static-inspection.json
filename and DET-001 claim id so reviewers can locate the logic to change.
In `@crates/echo-scene-codec/src/cbor.rs`:
- Around line 1058-1062: The test reject_invalid_enum_tags builds a malformed
CBOR array by calling Encoder::array(2) but only writing the tag
(encoder.u8(2)), so fix it by appending a valid dummy second element after the
invalid tag (e.g., encoder.u8(0) or encoder.null()) to produce a well-formed
2-element array; keep the call to decode_label_anchor(&mut
Decoder::new(&buf)).unwrap_err() and the existing assertion so the failure is
caused by the invalid LabelAnchor tag rather than a truncated CBOR array.
In `@docs/ROLLBACK_TTD.md`:
- Around line 12-38: The rollback instructions use 7‑char SHAs (e.g., e201c9b,
fd98b91, ce98d80, a02ea86, 3187e6a, 6e34a77, f138b8a) which risks collisions;
update the ordered actions under "Scenario A — Full TTD Rollback" to use full
40‑character commit SHAs for each revert, or add a prior step that
expands/validates each short SHA (for example by running git rev-parse
<short‑sha>) and records the full SHA before creating the rollback branch
`rollback/ttd-full-<date>` and reverting commits, ensuring all revert commands
reference the full hashes.
In `@scripts/validate_det_policy.cjs`:
- Around line 57-60: Validate each entry of crateInfo.paths to ensure it's a
non-empty string before returning true; iterate over crateInfo.paths (inside the
same block that checks Array.isArray and length) and if any element is not a
string or is an empty string, call console.error with crateName and the
offending index/value and return false so matches() only ever receives valid
strings. Ensure the error message clearly references crateName and the path
index/value to aid debugging.
---
Duplicate comments:
In @.github/workflows/det-gates.yml:
- Around line 238-239: Remove the redundant rustup target installation calls
that duplicate the dtolnay/rust-toolchain@stable step: delete the extra "rustup
target add wasm32-unknown-unknown" lines that appear immediately before the
cargo build commands (the steps that run "cargo build --release --target
wasm32-unknown-unknown -p ttd-browser"); rely on the toolchain action's targets:
wasm32-unknown-unknown configuration instead to avoid duplicate work and noise
in the workflow.
In `@CHANGELOG.md`:
- Line 8: The changelog entry header "## [0.1.3] — 2026-02-21" is pre-dated;
update it to be undated or mark it as unreleased by replacing that header with
"## [0.1.3] — Unreleased" (or remove the date portion entirely) so the change
remains undated until the release is actually cut; locate the header string "##
[0.1.3] — 2026-02-21" and edit it accordingly.
In `@det-policy.yaml`:
- Around line 152-166: The docs entry currently uses paths: ["**"] and class:
DET_NONCRITICAL which creates an unconditional catch‑all that nullifies
require_full_classification; either remove or narrow the docs paths (remove the
"**" catch‑all or replace with explicit globs for docs files) so unknown files
will fail classification, or alternatively set require_full_classification:
false and add a clear comment documenting the intentional fail‑open behavior;
update the det-policy.yaml entry for docs (and keep DET_NONCRITICAL semantics)
or flip require_full_classification accordingly so classify_changes.cjs no
longer contradicts the policy.
In `@docs/determinism/CLAIM_MAP.yaml`:
- Around line 68-71: PRF-001's statement is too vague for auditability—update
the CLAIM_MAP entry for PRF-001 to name the exact benchmark ID (e.g.,
materialization_hotpath) and the numeric noise threshold or a versioned config
reference instead of "Criterion noise threshold"; modify the keys under PRF-001
(the statement string and/or add a noise_threshold field) so the claim reads
like "MaterializationBus hot-path benchmark (materialization_hotpath) latency
remains within noise_threshold = <numeric_value> (or config://path@vX)" and
ensure required_evidence still includes the benchmark artifact.
In `@scripts/classify_changes.cjs`:
- Around line 59-62: The code currently compares classPriority[cls] without
validating crateInfo.class, so typos like "DET_CRITCAL" yield undefined and
silently lose coverage; modify the loop that uses crateInfo.class and
classPriority to first validate that classPriority[cls] is a defined value
(e.g., if (classPriority[cls] === undefined) throw or process.exit with an error
mentioning the offending crate and its class) before performing the comparison
to update maxClass, ensuring unknown classifications fail loudly with clear
diagnostic info.
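The fail-loud check described above could be sketched as follows. This is a hypothetical stand-alone version for illustration: `classPriority`, `maxClassOf`, and the crate-map shape are assumptions, not the actual classify_changes.cjs internals.

```javascript
// Hypothetical sketch: reject unknown determinism classes instead of letting
// `classPriority[cls]` evaluate to undefined and silently lose coverage.
const classPriority = { DET_NONCRITICAL: 0, DET_SENSITIVE: 1, DET_CRITICAL: 2 };

function maxClassOf(crates) {
  let maxClass = 'DET_NONCRITICAL';
  for (const [crateName, crateInfo] of Object.entries(crates)) {
    const cls = crateInfo.class;
    if (classPriority[cls] === undefined) {
      // Fail loudly with diagnostics naming the offending crate and class.
      throw new Error(`Unknown class "${cls}" for crate "${crateName}"`);
    }
    if (classPriority[cls] > classPriority[maxClass]) maxClass = cls;
  }
  return maxClass;
}
```

With this shape, a typo like `DET_CRITCAL` aborts classification with a pointed error instead of quietly classifying the change as non-critical.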
- Line 38: Wrap the JSON.parse call that builds the policy variable so malformed
JSON errors include the file context: read the string into a variable
(policyRaw), then try JSON.parse(policyRaw) inside a try/catch and, on catch,
throw or log a new Error that includes policyPath and the original error
message/stack (or rethrow with that context). Update the code around JSON.parse
and policyPath to ensure any SyntaxError includes which file failed and the
underlying parse error details.
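The wrapped parse could look like this minimal sketch; `parsePolicy` and its parameters are hypothetical names, not the script's actual structure.

```javascript
// Hypothetical sketch: parse failures surface the offending file path instead
// of a bare SyntaxError with no context.
function parsePolicy(policyRaw, policyPath) {
  try {
    return JSON.parse(policyRaw);
  } catch (err) {
    // Rethrow with file context so CI logs show which file failed to parse.
    throw new Error(`Failed to parse policy JSON at ${policyPath}: ${err.message}`);
  }
}
```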
In `@scripts/generate_evidence.cjs`:
- Around line 16-23: checkArtifact currently treats any non-empty directory as a
valid artifact; change it to verify presence of expected files instead. Update
the checkArtifact function to accept (or use a local list of) required filenames
(e.g., "sec-tests.log", "static-inspection.json") and return true only if each
required file exists and is non-empty inside the artifact directory; still catch
and return false on FS errors. Use path.join(gatheredArtifactsDir, name,
filename) and fs.existsSync / fs.statSync to validate existence and non-zero
size rather than relying on readdirSync().length > 0.
- Around line 41-65: The five SEC claims (SEC-001..SEC-005) all use a single
checkArtifact('sec-artifacts') call and must instead verify per-claim evidence;
update the logic that builds each claim (the objects with id: 'SEC-001' etc.,
evidence: {workflow, run_id: runId, commit_sha: commitSha, artifact_name:
'sec-artifacts'}) so status is determined by a per-claim check rather than the
global directory existence: either parse sec-tests.log or sec-claim-map.json for
the specific test name for each claim (optionally implemented as
checkClaimResult(claimId) or checkArtifactForClaim('sec-001.json')) and set
status to 'VERIFIED' only if that claim's test result is present and passed,
otherwise 'UNVERIFIED', and include the specific per-claim artifact name (e.g.,
sec-001.json) in the evidence object instead of the generic sec-artifacts.
In `@scripts/validate_claims.cjs`:
- Around line 36-43: The semantic validation currently allows
evidence.commit_sha and evidence.run_id to be "local" unconditionally; change
validate logic in scripts/validate_claims.cjs (the block that pushes to
violations for commit_sha/run_id) to treat "local" as invalid when running in CI
by checking process.env.CI (or equivalent CI env var). Specifically, inside the
same validation block, if process.env.CI is truthy, add violations.push(...)
when evidence.commit_sha === 'local' or evidence.run_id === 'local' (using the
same message format as existing pushes) so CI runs fail on placeholder values
while local runs keep current behavior.
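The CI-aware rejection could be sketched like this; `checkPlaceholders` and the violation messages are hypothetical, mirroring the claim/violations shapes described above.

```javascript
// Hypothetical sketch: "local" placeholder evidence is tolerated in local
// runs but recorded as a violation when the CI environment variable is set.
function checkPlaceholders(claim, violations, env = process.env) {
  if (!env.CI) return; // local run: keep current permissive behavior
  if (claim.evidence.commit_sha === 'local') {
    violations.push(`${claim.id}: commit_sha is a local placeholder (forbidden in CI)`);
  }
  if (claim.evidence.run_id === 'local') {
    violations.push(`${claim.id}: run_id is a local placeholder (forbidden in CI)`);
  }
}
```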
```yaml
- name: Create report
  if: always()
  env:
    DET_OUTCOME: ${{ steps.det_check.outcome }}
  run: |
    if [ "$DET_OUTCOME" = "success" ]; then
      echo '{"claim_id": "DET-001", "status": "PASSED"}' > static-inspection.json
    else
      echo '{"claim_id": "DET-001", "status": "FAILED"}' > static-inspection.json
    fi
```
Evidence integrity gap: generate_evidence.cjs ignores the content of static-inspection.json.
The static-inspection job correctly writes PASSED or FAILED based on steps.det_check.outcome (lines 161-165). However, generate_evidence.cjs (invoked at line 304) determines DET-001 status purely by checking if the static-inspection directory exists and is non-empty — it never reads static-inspection.json to verify the actual outcome.
Attack scenario:

1. `ban-nondeterminism.sh` exits non-zero → `static-inspection.json` contains `"status": "FAILED"`.
2. Upload step runs with `if: always()` → the directory is uploaded.
3. `generate_evidence.cjs` sees the directory exists → marks DET-001 as `VERIFIED`.
4. `validate_claims.cjs` only validates structure, not outcome → passes.
Net result: A failed DET-001 check produces a VERIFIED evidence claim. The entire evidence chain becomes untrustworthy.
🔥 Proposed fix — read and validate artifact content
Modify generate_evidence.cjs to read static-inspection.json and set DET-001 status based on its content:
```diff
+ const checkStaticInspection = () => {
+   try {
+     const report = JSON.parse(fs.readFileSync(
+       path.join(gatheredArtifactsDir, 'static-inspection', 'static-inspection.json'), 'utf8'));
+     return report.status === 'PASSED';
+   } catch {
+     return false;
+   }
+ };
+
  const claims = [
    {
      id: 'DET-001',
-     status: checkArtifact('static-inspection') ? 'VERIFIED' : 'UNVERIFIED',
+     status: checkStaticInspection() ? 'VERIFIED' : 'UNVERIFIED',
      evidence: { workflow, run_id: runId, commit_sha: commitSha, artifact_name: 'static-inspection' }
    },
```

Also applies to: 302-307
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/det-gates.yml around lines 156 - 165,
generate_evidence.cjs currently treats the presence of the static-inspection
directory as proof of DET-001 success; update it to open and parse
static-inspection.json (the artifact produced by the job that writes
{"claim_id":"DET-001","status":"PASSED"/"FAILED"}) and derive DET-001’s status
from the parsed "status" field (handle missing file, invalid JSON, or missing
field as a FAILED/unverified condition and surface an error), then use that
concrete status when creating the evidence claim instead of only checking
directory existence; ensure the code references the static-inspection.json
filename and DET-001 claim id so reviewers can locate the logic to change.
```rust
let mut encoder = Encoder::new(&mut buf);
encoder.array(2).unwrap();
encoder.u8(2).unwrap(); // Invalid tag
let err = decode_label_anchor(&mut Decoder::new(&buf)).unwrap_err();
assert!(
```
reject_invalid_enum_tags builds a malformed LabelAnchor CBOR array.
Line 1059 declares array(2) but only encodes the tag. That’s a truncated CBOR array, so the test can pass for the wrong reason if decode order changes. Encode a dummy second element to keep the buffer well‑formed and ensure the failure is truly “invalid LabelAnchor tag.”
🛠️ Suggested fix
```diff
  // LabelAnchor tag: allowed 0..=1
  buf.clear();
  let mut encoder = Encoder::new(&mut buf);
  encoder.array(2).unwrap();
  encoder.u8(2).unwrap(); // Invalid tag
+ encoder.bytes(&make_test_hash(1)).unwrap(); // dummy payload
  let err = decode_label_anchor(&mut Decoder::new(&buf)).unwrap_err();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/echo-scene-codec/src/cbor.rs` around lines 1058 - 1062, The test
reject_invalid_enum_tags builds a malformed CBOR array by calling
Encoder::array(2) but only writing the tag (encoder.u8(2)), so fix it by
appending a valid dummy second element after the invalid tag (e.g.,
encoder.u8(0) or encoder.null()) to produce a well-formed 2-element array; keep
the call to decode_label_anchor(&mut Decoder::new(&buf)).unwrap_err() and the
existing assertion so the failure is caused by the invalid LabelAnchor tag
rather than a truncated CBOR array.
- Base: `efae3e8`
- Head: `e201c9b`

## Preconditions

- Release owner approval logged.
- Current branch state saved/tagged.
- Incident ticket created.

## Scenario A — Full TTD Rollback

### Objective (Scenario A)

Return repository to pre-TTD integration state.

### Ordered actions

1. Create rollback branch:
   - `rollback/ttd-full-<date>`
2. Revert commits in reverse order from head to base+1:
   - `e201c9b`
   - `fd98b91`
   - `ce98d80`
   - `a02ea86`
   - `3187e6a`
   - `6e34a77`
   - `f138b8a`
Use full‑length SHAs (or require expansion) in rollback steps.
Lines 12–38 list 7‑char SHAs; collisions are possible in a growing repo, which is unacceptable during rollback. Use full 40‑char SHAs or add a step to expand with git rev-parse before reverting.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/ROLLBACK_TTD.md` around lines 12 - 38, The rollback instructions use
7‑char SHAs (e.g., e201c9b, fd98b91, ce98d80, a02ea86, 3187e6a, 6e34a77,
f138b8a) which risks collisions; update the ordered actions under "Scenario A —
Full TTD Rollback" to use full 40‑character commit SHAs for each revert, or add
a prior step that expands/validates each short SHA (for example by running git
rev-parse <short‑sha>) and records the full SHA before creating the rollback
branch `rollback/ttd-full-<date>` and reverting commits, ensuring all revert
commands reference the full hashes.
```js
if (!crateInfo.paths || !Array.isArray(crateInfo.paths) || crateInfo.paths.length === 0) {
  console.error(`Error: Crate ${crateName} missing or invalid paths`);
  return false;
}
```
🧹 Nitpick | 🔵 Trivial
Optional: Per-element path validation is missing — future-proofing only.
crateInfo.paths is validated as a non-empty array, but individual elements aren't checked as strings. A malformed entry like paths: [123, null, {}] would pass validation and cause downstream matches() to receive non-string input. Not a current problem if policy is authored carefully, but for defense-in-depth:
🛡️ Optional hardening
```diff
  if (!crateInfo.paths || !Array.isArray(crateInfo.paths) || crateInfo.paths.length === 0) {
    console.error(`Error: Crate ${crateName} missing or invalid paths`);
    return false;
  }
+ for (const p of crateInfo.paths) {
+   if (typeof p !== 'string') {
+     console.error(`Error: Crate ${crateName} has non-string path: ${JSON.stringify(p)}`);
+     return false;
+   }
+ }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@scripts/validate_det_policy.cjs` around lines 57 - 60, Validate each entry of
crateInfo.paths to ensure it's a non-empty string before returning true; iterate
over crateInfo.paths (inside the same block that checks Array.isArray and
length) and if any element is not a string or is an empty string, call
console.error with crateName and the offending index/value and return false so
matches() only ever receives valid strings. Ensure the error message clearly
references crateName and the path index/value to aid debugging.
This PR delivers the TTD Hardening Sprint S1 goals: