From e8bf5891b5a01ca53ff685d51a87a67bbf12c7a8 Mon Sep 17 00:00:00 2001 From: Andre Manoel Date: Mon, 4 May 2026 16:32:52 +0000 Subject: [PATCH 01/12] feat(agentic-ci): decision-ready triage and daily PR fixes Reorganize the weekly issue-triage report around recommended actions (close as resolved, close as duplicate, needs maintainer decision, ready for assignment, stuck PR, duplicate PRs, stale) so each flagged item carries action + evidence + rationale and can be resolved without opening it. Multi-comment split with i/N markers and orphan reconciliation when the report grows or shrinks. Flip the four daily audit suites with mechanical fix categories from read-only reports to opening one PR per run: - docs-and-references: broken-link, docstring-drift, arch-ref-rename - structure: missing-future, lazy-import - dependencies: transitive-gap, unused - code-quality: bare-except (draft until landing rate proven) test-health stays report-only (all candidates require inferring intent). The shared procedure - fix_backlog selection, finding-hash spec for stable cross-run identification, attempted_fixes lifecycle with two-strike escalation, allowlists, ranking, branch/PR conventions - lives in .agents/recipes/_fix-policy.md. Each suite recipe declares only its eligible categories, branch types, and test requirements. Workflow runs claude twice per suite (audit, then conditionally fix), each capped at the existing --max-turns 50. Fix call is gated on non-empty fix_backlog and skipped entirely for test-health. 
---
 .agents/recipes/_fix-policy.md                | 204 ++++++++++++++++
 .agents/recipes/_phase-audit.md               |  17 ++
 .agents/recipes/_phase-fix.md                 |  23 ++
 .agents/recipes/_runner.md                    |  14 +-
 .agents/recipes/code-quality/recipe.md        |  53 +++-
 .agents/recipes/dependencies/recipe.md        |  38 ++-
 .agents/recipes/docs-and-references/recipe.md |  28 ++-
 .agents/recipes/issue-triage/recipe.md        | 231 ++++++++++--------
 .agents/recipes/structure/recipe.md           |  28 ++-
 .agents/recipes/test-health/recipe.md         |  22 +-
 .github/workflows/agentic-ci-daily.yml        |  67 ++++-
 11 files changed, 607 insertions(+), 118 deletions(-)
 create mode 100644 .agents/recipes/_fix-policy.md
 create mode 100644 .agents/recipes/_phase-audit.md
 create mode 100644 .agents/recipes/_phase-fix.md

diff --git a/.agents/recipes/_fix-policy.md b/.agents/recipes/_fix-policy.md
new file mode 100644
index 000000000..963824cab
--- /dev/null
+++ b/.agents/recipes/_fix-policy.md
@@ -0,0 +1,204 @@
+# Agentic CI Fix Policy
+
+Prepended to every daily-suite recipe alongside `_runner.md`. Defines what
+"open a PR" means for these recipes and the rules that apply across all of
+them. Each suite recipe declares only its eligible finding categories, its
+branch types, and any risk-specific notes — everything else is here.
+
+When in doubt, fall back to report-only.
+
+## Localized fix bar
+
+A finding may be converted to a fix only if all hold:
+
+- **Bounded scope**: ≤3 files, ≤50 LOC net.
+- **Reversible**: no public API changes, no `__all__` deletions, no version
+  bumps (Dependabot owns those), no schema changes, no migrations.
+- **Self-evident**: the audit established both the problem *and* the unique
+  correct fix. Mechanical, not interpretive.
+- **Test-safe**: when the recipe declares `test_required`, run
+  `make test-<package>` for the affected package and abort on failure.
+- **Single concern**: one finding per PR.
+- **Allowlisted paths**: matches the suite's path allowlist.
+
+If the top-ranked candidate fails the bar, try the next.
If none of the top
+5 qualify, skip the fix step and emit report-only.
+
+## Allowlists
+
+### Per-suite path allowlist
+
+| Suite | Paths the recipe MAY modify |
+|-------|-----------------------------|
+| docs-and-references | `architecture/**`, `docs/**`, `README.md`, `CONTRIBUTING.md`, `DEVELOPMENT.md`, `STYLEGUIDE.md`, `packages/*/src/**/*.py` (docstring-only edits) |
+| dependencies | `packages/*/pyproject.toml` |
+| structure | `packages/*/src/**/*.py` |
+| code-quality | `packages/*/src/**/*.py` |
+| test-health | (no fix phase) |
+
+### Shared forbidden paths (all suites)
+
+- `.github/workflows/**`, `.agents/**`, repo-root `pyproject.toml`,
+  `.git/**`, anything in `.gitignore`.
+
+### Shared forbidden commands
+
+- `git push --force` (any variant), `git rebase`, `git reset --hard`,
+  `git branch -D`/`-d`/`--delete`.
+- `gh pr merge`, `gh pr close`, `gh pr review`.
+- `pip install`, `uv pip install` (use `make install-dev` only).
+
+## Runner-state schema
+
+Each daily recipe maintains two arrays in
+`{{memory_path}}/runner-state.json` beyond the existing `known_issues` /
+`baselines`:
+
+```json
+{
+  "fix_backlog": [
+    { "id": "<finding-hash>", "category": "...", "first_seen": "YYYY-MM-DD",
+      "last_seen": "YYYY-MM-DD", "data": { /* category fields */ } }
+  ],
+  "attempted_fixes": [
+    { "id": "<finding-hash>", "attempts": [
+      { "pr_number": 612, "outcome": "merged", "at": "YYYY-MM-DD",
+        "branch": "agentic-ci/..." }
+    ] }
+  ]
+}
+```
+
+Also: `draft_until_proven` (boolean, per-suite, default `true` for
+code-quality and unset elsewhere) controls draft-PR mode.
+
+### `fix_backlog` rules (audit phase populates this)
+
+- Append every detected finding in an eligible category. Update `last_seen`
+  if `id` already present.
+- Drop entries with `last_seen` older than 30 days.
+- Cap at 200 entries (drop oldest by `first_seen`).
+- Populated **before** the `known_issues` filter so fixable findings persist
+  even when their report row is suppressed for being unchanged.
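The `fix_backlog` rules above reduce to a small state update. A sketch under stated assumptions — entries are plain dicts matching the schema, and `update_fix_backlog` is a hypothetical helper name (the real recipe edits `{{memory_path}}/runner-state.json` in place):

```python
from datetime import date, timedelta

def update_fix_backlog(backlog, findings, today):
    """Apply the fix_backlog rules: refresh last_seen on re-detected
    findings, append new ones, drop entries unseen for 30 days, and
    cap the array at 200 entries (oldest by first_seen dropped)."""
    by_id = {entry["id"]: entry for entry in backlog}
    for finding in findings:  # each carries id, category, data
        entry = by_id.get(finding["id"])
        if entry is not None:
            entry["last_seen"] = today.isoformat()  # keep first_seen
        else:
            by_id[finding["id"]] = {**finding,
                                    "first_seen": today.isoformat(),
                                    "last_seen": today.isoformat()}
    # ISO dates compare correctly as strings.
    cutoff = (today - timedelta(days=30)).isoformat()
    kept = [e for e in by_id.values() if e["last_seen"] >= cutoff]
    kept.sort(key=lambda e: e["first_seen"])
    return kept[-200:]  # cap at 200, dropping the oldest by first_seen
```

Because the update is keyed on the finding hash, re-running it with the same findings is idempotent.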
+
+### `attempted_fixes` rules
+
+`outcome` ∈ `{open, merged, closed, abandoned}`.
+
+- `abandoned` means the recipe could not produce a PR (tests failed,
+  conflict, lint failed, allowlist rejected, etc.).
+- Reconcile against open PRs (`gh pr list`) at the start of each fix run
+  to recover from crashes that left state un-updated.
+- Prune: drop `merged` >90d, drop single `closed`/`abandoned` >30d.
+- Two-strike entries (≥2 `closed`/`abandoned`) are NOT pruned; they
+  surface in the report under `Repeatedly-failed fix attempts`.
+
+## Finding hash
+
+`finding_id = sha1(suite + ":" + canonical_key)[:12]`, where
+`canonical_key` uses durable identifiers only — never line numbers or free
+text:
+
+| Suite (category) | canonical_key |
+|------------------|---------------|
+| docs (broken-link) | `<file>:<link-target>` |
+| docs (docstring-drift) | `<file>:<symbol>:<section>:<param>` |
+| docs (arch-ref-rename) | `<file>:<old-symbol>` |
+| dependencies (transitive-gap) | `<package>:<module>:transitive` |
+| dependencies (unused) | `<package>:<dep>:unused` |
+| structure (missing-future) | `<file>:missing-future` |
+| structure (lazy-import) | `<file>:lazy-import:<module>` |
+| code-quality (bare-except) | `<file>:<symbol>:bare-except` |
+
+Symbols use fully-qualified Python names.
+
+## Ranking
+
+Earlier criteria override later ones:
+
+1. **Fix confidence** (per-category):
+
+   | Category | Confidence |
+   |----------|-----------|
+   | structure / missing-future | 1.0 |
+   | structure / lazy-import | 0.9 |
+   | docs / broken-link | 0.9 |
+   | dependencies / transitive-gap | 0.85 |
+   | docs / arch-ref-rename | 0.8 |
+   | dependencies / unused | 0.75 |
+   | docs / docstring-drift | 0.75 |
+   | code-quality / bare-except | 0.6 |
+
+2. **Defect severity**:
+
+   | Severity | Examples |
+   |----------|----------|
+   | high | missing transitive dep, heavy import bypassing lazy system |
+   | medium | broken doc link visible on docs site, bare-except hiding errors, docstring drift on public API |
+   | low | broken link in dev-notes, missing `from __future__ import annotations`, unused dep |
+
+3.
**User-facing impact** — visible to docs-site readers or plugin
+   consumers vs internal-only.
+
+4. **Recency** — newer findings rank above long-standing ones.
+
+Record the chosen finding's id, scores, and rationale at the top of
+`/tmp/audit-{{suite}}.md`.
+
+## Standard fix procedure
+
+The fix phase of every eligible recipe follows these steps. Suite recipes
+declare only the parts that vary (eligible categories, branch type,
+`test_required`, suite-specific quirks).
+
+1. Reconcile `attempted_fixes` against open PRs (`gh pr list`) to recover
+   any state lost to a prior crash.
+2. Filter `fix_backlog`: drop entries whose latest attempt is `open` or
+   `merged`; surface two-strike entries in the report's
+   `Repeatedly-failed fix attempts` section and drop them from selection.
+3. Rank the remainder per the Ranking section.
+4. For each candidate (top 5 max):
+   1. Re-verify the finding still applies (re-grep / re-read). If not,
+      remove from `fix_backlog` and continue.
+   2. Apply the fix. If the diff exceeds the localized-fix bar or touches
+      a non-allowlisted path, abandon and continue.
+   3. If the category sets `test_required: true`, run
+      `make test-<package>` for the package containing the change. On
+      failure: abandon and continue.
+   4. Branch: `agentic-ci/<type>/<suite>-YYYYMMDD-<short-slug>`. Commit:
+      `<type>(agentic-ci): <subject>`. Push.
+   5. Write the PR body to `/tmp/pr-body-{{suite}}.md`, including the
+      hidden metadata block:
+      ``
+   6. `gh pr create --body-file /tmp/pr-body-{{suite}}.md` with `--draft`
+      iff `draft_until_proven` is true for the suite.
+   7. `gh pr edit --add-label agentic-ci --add-label agentic-ci/<suite>`.
+   8. Record `attempted_fixes` entry with `outcome: "open"` and exit.
+5. If all 5 candidates were abandoned, append a one-line note to the
+   report and exit cleanly. The state already reflects the abandonments.
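The finding-hash spec and the filter/rank steps above can be sketched together. A minimal illustration, not the recipe's implementation: `finding_id` and `select_candidates` are hypothetical names, and ranking is reduced to the confidence-then-recency criteria (severity and impact omitted):

```python
import hashlib

def finding_id(suite, canonical_key):
    """finding_id = sha1(suite + ":" + canonical_key)[:12]."""
    return hashlib.sha1(f"{suite}:{canonical_key}".encode()).hexdigest()[:12]

def select_candidates(fix_backlog, attempted_fixes, confidence, limit=5):
    """Drop open/merged and two-strike findings, then rank by
    per-category fix confidence with recency as the tie-breaker."""
    attempts = {a["id"]: a["attempts"] for a in attempted_fixes}

    def latest_outcome(fid):
        prior = attempts.get(fid)
        return prior[-1]["outcome"] if prior else None

    def strikes(fid):
        return sum(1 for a in attempts.get(fid, ())
                   if a["outcome"] in ("closed", "abandoned"))

    eligible = [f for f in fix_backlog
                if latest_outcome(f["id"]) not in ("open", "merged")
                and strikes(f["id"]) < 2]  # two-strikers go to the report
    eligible.sort(key=lambda f: (confidence.get(f["category"], 0.0),
                                 f["last_seen"]),
                  reverse=True)
    return eligible[:limit]
```

Because the hash is derived from durable identifiers only, the same finding gets the same id on every run, which is what lets `attempted_fixes` accumulate history across days.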
+
+On any failure mid-flow: record `outcome: "abandoned"` for the chosen
+finding (with `pr_number: null`), leave any pushed branch in place
+(`pr-stale.yml` will reap it; branch deletion is forbidden), and continue
+to the next candidate.
+
+## PR conventions
+
+- **Use `gh pr create --body-file`**, not `/create-pr`. The skill is
+  interactive-only and shells the body inline; CI needs determinism.
+- **Title**: conventional, `<type>(agentic-ci): <subject>`.
+- **Labels**: `agentic-ci`, `agentic-ci/<suite>`.
+- **Draft PRs**: `code-quality` opens draft until a maintainer flips
+  `draft_until_proven` to `false` in runner-state, after at least two
+  non-draft PRs from that suite have landed clean.
+
+## Atomicity
+
+Each fix-phase invocation produces exactly one of:
+
+- **Report-only** — runner-state updated; no branch, commit, or PR.
+- **Report + PR** — same, plus a pushed branch, a commit, and a PR. The
+  `attempted_fixes` entry is recorded *before* the recipe exits.
+
+No half-states. The runner state is the source of truth for what the
+recipe has tried; never silently drop a failed attempt.
diff --git a/.agents/recipes/_phase-audit.md b/.agents/recipes/_phase-audit.md
new file mode 100644
index 000000000..5caefa0c8
--- /dev/null
+++ b/.agents/recipes/_phase-audit.md
@@ -0,0 +1,17 @@
+## Phase directive
+
+This invocation runs the **AUDIT** phase only.
+
+- Execute the audit steps from the recipe and write the report to
+  `/tmp/audit-{{suite}}.md`.
+- Update `{{memory_path}}/runner-state.json` with detected findings,
+  including `fix_backlog` entries per `_fix-policy.md` (populated BEFORE
+  applying the `known_issues` filter to the report, so fixable findings
+  persist across runs even when their report row is suppressed).
+- Do NOT attempt any fix. Do NOT create any branches, commits, or PRs.
+- Do NOT modify any files outside `{{memory_path}}/`.
+- A separate invocation will run the FIX phase if `fix_backlog` has
+  eligible candidates and the suite has a fix phase.
+- Read the recipe in full for context; the "Fix phase" section informs + which finding categories should populate `fix_backlog`, but you must + not act on them in this invocation. diff --git a/.agents/recipes/_phase-fix.md b/.agents/recipes/_phase-fix.md new file mode 100644 index 000000000..48bf77a09 --- /dev/null +++ b/.agents/recipes/_phase-fix.md @@ -0,0 +1,23 @@ +## Phase directive + +This invocation runs the **FIX** phase only. + +- The audit phase has already completed in a previous invocation. Its + report is at `/tmp/audit-{{suite}}.md` and + `{{memory_path}}/runner-state.json` has the populated `fix_backlog`. +- Execute only the recipe's "Fix phase" section per `_fix-policy.md`. + Do NOT redo audit work; do NOT re-scan the codebase to rebuild + findings. +- Pick the highest-ranked eligible candidate from `fix_backlog`, apply + the fix, run the package's tests if applicable, commit, push, and open + the PR using `gh pr create --body-file`. +- Record the attempt in `attempted_fixes` (whether successful, abandoned, + or failed through the top-5 fallback) before exiting. +- If no candidate qualifies after trying up to 5 of them, exit cleanly, + append a short note to `/tmp/audit-{{suite}}.md` describing what was + tried, and update `attempted_fixes` accordingly. Do NOT open a PR. +- Do NOT delete branches, even on failure (per `_runner.md` and + `_fix-policy.md`). Leave them for the existing `pr-stale.yml` workflow + to reap over time. +- Read the recipe in full for context, but treat the audit phase as + already done. diff --git a/.agents/recipes/_runner.md b/.agents/recipes/_runner.md index 98081279b..9705633d5 100644 --- a/.agents/recipes/_runner.md +++ b/.agents/recipes/_runner.md @@ -76,6 +76,14 @@ Write all output to a temp file (e.g., `/tmp/recipe-output.md`). The workflow will handle posting it. Do not post directly to GitHub - the workflow controls output routing. 
-If your recipe produces code changes, commit them on a new branch and use
-`/create-pr` to open a pull request. The branch name should follow the
-pattern `agentic-ci/chore/{suite}-YYYYMMDD`.
+If your recipe produces code changes, commit them on a new branch following
+the pattern `agentic-ci/{type}/{suite}-YYYYMMDD-{short-slug}` where `{type}`
+matches the change kind (`chore`/`docs`/`fix`/`refactor`).
+
+For PR creation in CI, use `gh pr create --body-file /tmp/pr-body-<suite>.md`
+directly rather than the `/create-pr` skill. The skill assumes an interactive
+session (it can prompt about uncommitted changes, base branch, etc.) and
+shells the body inline, which breaks on backticks and special characters.
+Daily-suite recipes that open PRs are governed by `_fix-policy.md` — read it
+for the full PR contract (allowlists, draft mode, hidden metadata, branch
+naming, atomicity).
diff --git a/.agents/recipes/code-quality/recipe.md b/.agents/recipes/code-quality/recipe.md
index ab8fcb793..302a8f5ad 100644
--- a/.agents/recipes/code-quality/recipe.md
+++ b/.agents/recipes/code-quality/recipe.md
@@ -35,6 +35,15 @@ Read `{{memory_path}}/runner-state.json` for baselines from previous runs
 re-reporting known issues. Flag metrics that are trending in the wrong
 direction compared to the previous baseline.
 
+This recipe also maintains `fix_backlog` and `attempted_fixes` per
+`_fix-policy.md`. Update `fix_backlog` for every detected bare-except
+finding *before* the `known_issues` filter applies. (Other categories
+remain report-only and do not enter `fix_backlog`.)
+
+The `draft_until_proven` flag in runner-state controls whether this
+suite's PRs are opened as draft. Default `true` until a maintainer flips
+it to `false`.
+
 ## Instructions
 
 ### 1. Complexity hotspots
@@ -238,9 +247,51 @@ Write the report to `/tmp/audit-{{suite}}.md`:
 
 If no findings in any category, write `NO_FINDINGS` on the first line
 instead.
 
+## Fix phase
+
+Follow the standard fix procedure in `_fix-policy.md`.
Suite-specific bits: + +### Eligible categories + +| Category | Branch type | test_required | Eligibility note | +|----------|-------------|---------------|------------------| +| bare-except | `refactor` | yes | Replace `except:` / `except BaseException:` with the specific exception type. Eligible only when grep across the try-block confirms **exactly one** exception type is plausibly raised, verified by inspecting the called functions or imported library docs. Multiple plausible types → ineligible. Test files are excluded (different exception-handling standards). | + +`fix_backlog.data` should record the proposed replacement exception type +and the grep evidence used to determine it. Within bare-except findings, +prefer ones in user-facing modules (`packages/data-designer/src/`) over +internal helpers (the ranking impact criterion handles this once +`data.user_facing` is set). + +The PR body should include the before/after of the try-block plus the +grep evidence that justified the chosen exception type, and a note that +the PR is draft until landing rate is proven (ask reviewers to mark +ready-for-review if the change is correct). + +**Draft mode**: this suite opens PRs as draft until a maintainer flips +`draft_until_proven` to `false` in runner-state, after at least two +non-draft PRs have landed clean. Bare-except narrowing is the most +inference-heavy fix in any suite (confidence 0.6); recipe judgement has +to be earned before promotion. Two-strike findings here are an +especially important signal — they suggest the detector is producing +false positives in an already-cautious category. + +**Not eligible** — stays report-only: + +- Complexity refactors, type annotation additions, exception hierarchy + normalization (judgement-heavy). +- **TODO line deletion** — the audit's "looks done" judgement is not + mechanical enough to delete code on. Deletion is forbidden. + ## Constraints -- Do not modify any files. This is a read-only audit. 
+- Outside the fix phase, this recipe is read-only — do not modify files. +- Within the fix phase, only modify paths in the suite's path allowlist + (`packages/*/src/**/*.py`). Test files are excluded. +- **TODO line deletion is forbidden.** The audit phase still inventories + TODOs, but the fix phase does not act on them. +- Bare-except narrowing is only eligible when the exception type is + unambiguous. When in doubt, skip. - Do not flag test files for type coverage or exception hygiene. Tests have different standards. - Do not duplicate ruff checks (W, F, I, ICN, PIE, TID, UP*). Those are diff --git a/.agents/recipes/dependencies/recipe.md b/.agents/recipes/dependencies/recipe.md index 074e23857..deedf742b 100644 --- a/.agents/recipes/dependencies/recipe.md +++ b/.agents/recipes/dependencies/recipe.md @@ -26,6 +26,10 @@ dependency versions. After the audit, update `known_issues` and `baselines.dependency_versions` with the current state. Skip reporting issues that already appear in `known_issues`. +This recipe also maintains `fix_backlog` and `attempted_fixes` per +`_fix-policy.md`. Update `fix_backlog` for every detected finding *before* +the `known_issues` filter applies. + ## Instructions ### 1. Inventory current dependencies @@ -154,12 +158,40 @@ Write the report to `/tmp/audit-{{suite}}.md`: If no findings in any category, write `NO_FINDINGS` on the first line instead. +## Fix phase + +Follow the standard fix procedure in `_fix-policy.md`. Suite-specific bits: + +### Eligible categories + +| Category | Branch type | test_required | Eligibility note | +|----------|-------------|---------------|------------------| +| transitive-gap | `chore` | yes | Add the imported module to `[project.dependencies]` of the package that imports it. Use the version specifier from a package that already declares it; otherwise the latest stable specifier. Insert in alphabetical order; match existing quote/specifier style. | +| unused | `chore` | yes | Remove the declaration. 
Eligible only when grep across the package's `src/`, lazy-import system, plugin entry points, and tests turns up zero references. |
+
+`fix_backlog.data` should record: for transitive-gap, the importing source
+files and proposed version specifier; for unused, which other packages
+also declare the dep.
+
+Before running `make test-<package>`, run `make install-dev` to confirm
+the lockfile resolves cleanly. `make install-dev` is the only sanctioned
+install command (no direct `pip install` or `uv pip install`).
+
+**Not eligible** — stays report-only:
+
+- Cross-package version reconciliation, version pinning concerns
+  (judgement-heavy).
+- CVE response (Dependabot's job).
+
 ## Constraints
 
-- Do not modify any files. This is a read-only audit.
-- Do not install packages or run `pip install`. Only inspect `pyproject.toml`
-  and source files.
+- Outside the fix phase, this recipe is read-only — do not modify files.
+- Within the fix phase, only modify `packages/*/pyproject.toml`. The
+  repo-root `pyproject.toml` is forbidden.
+- `make install-dev` is the only sanctioned install command. Do not
+  invoke `pip install` or `uv pip install` directly.
 - Do not run `pip audit` (may not be available on the runner). Focus on
   structural dependency analysis, not CVE scanning (Dependabot handles that).
 - Do not recommend changes to dependencies you haven't verified are actually
   problematic. False positives erode trust in the audit.
+- Version pinning changes are explicitly out of scope for the fix phase.
diff --git a/.agents/recipes/docs-and-references/recipe.md b/.agents/recipes/docs-and-references/recipe.md
index f8a7073dc..45a54b324 100644
--- a/.agents/recipes/docs-and-references/recipe.md
+++ b/.agents/recipes/docs-and-references/recipe.md
@@ -26,6 +26,11 @@ After completing the audit, update the file with any new
 findings (add to `known_issues` array with a short hash of the finding).
 Skip reporting issues that already appear in `known_issues`.
+This recipe also maintains `fix_backlog` and `attempted_fixes` per +`_fix-policy.md`. Update `fix_backlog` for every detected finding *before* +the `known_issues` filter applies, so fixable findings persist across runs +even when their report row is suppressed for being unchanged. + ## Instructions ### 1. Docstring vs signature drift @@ -147,9 +152,30 @@ Write the report to `/tmp/audit-{{suite}}.md`: If no findings in any category, write `NO_FINDINGS` on the first line instead. +## Fix phase + +Follow the standard fix procedure in `_fix-policy.md`. Suite-specific bits: + +### Eligible categories + +| Category | Branch type | test_required | Eligibility note | +|----------|-------------|---------------|------------------| +| broken-link | `docs` | no | Only when the corrected target is unambiguous (exact-match file at a different path, or a single similar anchor). Multiple candidates → ineligible. | +| docstring-drift | `docs` | yes | Purely signature-driven `Args:`/`Returns:`/`Raises:` updates. Rename a param to its current name, drop entries for removed params, add placeholder entries for added params (note the signature; do not invent semantic descriptions). | +| arch-ref-rename | `docs` | no | Only when grep confirms the old symbol is gone and exactly one similarly-named new symbol exists at the same role. | + +`fix_backlog.data` should carry whatever the fix step needs without +re-scanning: the proposed target for broken-link, the signature-vs-Args +delta for docstring-drift, the new symbol name for arch-ref-rename. + +All other audit categories (docs-site rewrites, dev-note edits, external +URL breakage) stay report-only. + ## Constraints -- Do not modify any files. This is a read-only audit. +- Outside the fix phase, this recipe is read-only — do not modify files. +- Within the fix phase, only modify paths in the suite's path allowlist. + See `_fix-policy.md` for the shared command/path baseline. 
- Do not read file contents unless needed to verify a specific reference. Use `grep` and `head` for targeted checks rather than reading entire files. - Skip vendored or generated files. diff --git a/.agents/recipes/issue-triage/recipe.md b/.agents/recipes/issue-triage/recipe.md index 396b0667a..3edc061ae 100644 --- a/.agents/recipes/issue-triage/recipe.md +++ b/.agents/recipes/issue-triage/recipe.md @@ -1,6 +1,6 @@ --- name: issue-triage -description: Weekly triage of open issues and PRs - classify, verify, detect staleness, duplicates, and cross-reference +description: Weekly triage of open issues and PRs - decision-ready report organized by recommended action trigger: schedule tool: claude-code timeout_minutes: 15 @@ -14,7 +14,9 @@ permissions: # Repository Triage Triage all open issues and pull requests in this repository, then post a -combined report to the tracking issue. +decision-ready report to the tracking issue. The report is organized by +**recommended action** so a maintainer can resolve flagged items without +opening each one. ## Instructions @@ -23,164 +25,185 @@ combined report to the tracking issue. 
Collect all open issues, open PRs, and recent merge activity: ```bash -# All open issues with metadata gh issue list --state open --limit 200 \ --json number,title,state,createdAt,updatedAt,labels,assignees,author,body -# All open PRs with metadata gh pr list --state open --limit 200 \ --json number,title,state,createdAt,updatedAt,labels,author,headRefName,body -# Recently merged PRs (last 60 days) to cross-reference gh pr list --state merged --limit 100 \ --json number,title,headRefName,body,mergedAt -# PR check status for open PRs +# Failing-check counts for open PRs for pr in $(gh pr list --state open --json number --jq '.[].number'); do - echo "=== PR #${pr} ===" - gh pr checks "$pr" --json name,state --jq '[.[] | select(.state == "FAILURE" or .state == "ERROR")] | length' + FAILING=$(gh pr checks "$pr" --json name,state \ + --jq '[.[] | select(.state == "FAILURE" or .state == "ERROR")] | length') + echo "${pr} ${FAILING}" done ``` -### 2. Triage issues +### 2. Decide an action for every flagged item -For each open issue, determine: +For each open issue and PR, decide whether it needs maintainer action and, if +so, which **action bucket** it belongs in. Buckets are exclusive — every +flagged item appears under exactly one heading. -**Classification** (pick one): -- `bug` - something is broken -- `feature` - new capability or enhancement -- `chore` - maintenance, CI, docs, refactoring -- `discussion` - needs design input or decision before work starts +Buckets and the criteria for each: -**Staleness** (based on last update, today's date, and activity): -- `active` - updated within the last 14 days -- `aging` - updated 14-30 days ago -- `stale` - no update for 30+ days +| Bucket | Apply when | +|--------|-----------| +| `Close as resolved` | A merged PR closes the issue via `Fixes/Closes/Resolves #N`, OR a merged PR's title/branch/body strongly indicates it addressed the issue. The issue is still open. 
| +| `Close as duplicate` | An older open issue covers the same scope. Pick the older issue as the canonical one. | +| `Needs maintainer decision` | Issue labeled `discussion`, design-input items with no clear scope, or items labeled `needs-attention` (flagged by the stale-PR workflow because their linked PR was auto-closed). | +| `Ready for assignment` | Well-scoped issue, no assignee, no linked open PR, not stale (updated within 30 days). Brief enough that someone could pick it up today. | +| `Stuck PR` | Open PR with one or more failing checks, OR no author activity (push/comment) for 14+ days. | +| `Duplicate PRs` | Two or more open PRs reference the same issue (`Fixes/Closes/Resolves #N`). | +| `Stale, consider closing` | 60+ days since last activity, no assignee, no linked open PR. Older than `Ready for assignment` and without traction. | -**Verification** - check if the issue has been addressed: -- Search merged PRs for closing keywords (`Fixes #N`, `Closes #N`, `Resolves #N`) - referencing this issue -- Search merged PR titles and branches for keywords matching the issue -- If a merged PR appears to fix the issue, flag it as `potentially resolved` -- If there is an open PR linked to the issue, note the PR number +Items that don't fit any of the above are **healthy** — count them but do +not list them in the action sections. -**Labels as signals** - issues with `needs-attention` were flagged by the stale -PR workflow because their linked PR was auto-closed. Always include these in the -"Action needed" section. +Also check `attempted_fixes` in any daily-suite runner-state files (under +`.agentic-ci-state/` if accessible) — findings with two `closed` or +`abandoned` attempts are surfaced in their own section so the maintainer +sees them alongside other action items. (This section may be empty if +those state files are not available in the triage run; that's fine.) -**Duplicates / related** - flag issues that overlap in scope or description. +### 3. 
Build the report -### 3. Triage PRs +Write the report to `/tmp/issue-triage-report.md`. Each part of a +multi-comment report is a separate file: `/tmp/issue-triage-report-1.md`, +`/tmp/issue-triage-report-2.md`, etc. -For each open PR, determine: +Format: -**Health flags** (check all that apply): -- `no-issue` - PR body has no `Fixes/Closes/Resolves #N` reference (external - contributors only - collaborators are exempt) -- `issue-closed` - PR links to an issue that is already closed (by another PR - or manually) -- `checks-failing` - PR has failing CI checks -- `stale` - no author activity (push or comment) for 14+ days with failing - checks -- `duplicate-fix` - another open PR references the same issue +````markdown + +## Repository Triage Report — YYYY-MM-DD -**Cross-reference** - for each PR that references an issue: -- Verify the linked issue exists and is open -- Check if another open or merged PR also references the same issue -- If two open PRs fix the same issue, flag both as `duplicate-fix` +**Open issues:** N | **Open PRs:** N | **Healthy (no action needed):** N -### 4. Build the report +--- -Write the combined report to `/tmp/issue-triage-report.md` using this format: +### Close as resolved (M) -```markdown - -## Repository Triage Report +| # | Title | Action | Evidence | Rationale | +|---|-------|--------|----------|-----------| +| #123 | ... | Close | Merged in #456 | PR title says "fix ..." matching issue scope | -**Run date:** YYYY-MM-DD -**Open issues:** N | **Open PRs:** N +### Close as duplicate (M) ---- +| # | Title | Action | Evidence | Rationale | +|---|-------|--------|----------|-----------| +| #234 | ... | Close, point to #200 | Overlaps #200 (older) | Both describe the same crash on empty config | -### Issues: action needed +### Needs maintainer decision (M) -Issues that need maintainer attention (potentially resolved, stale with no -assignee, possible duplicates, needs-attention label). 
+| # | Title | Action | Evidence | Rationale | +|---|-------|--------|----------|-----------| +| #345 | ... | Decide direction | `discussion` label, no consensus | Two competing approaches in comments | -| # | Title | Category | Staleness | Flag | Notes | -|---|-------|----------|-----------|------|-------| +### Ready for assignment (M) -### Issues: active work +| # | Title | Action | Evidence | Rationale | +|---|-------|--------|----------|-----------| +| #456 | ... | Assign | Scope clear, no assignee | One-line repro, fix likely <50 LOC | -Issues with assignees or linked open PRs. +### Stuck PR (M) -| # | Title | Category | Assignee | PR | Last updated | -|---|-------|----------|----------|-----|-------------| +| # | Title | Action | Evidence | Rationale | +|---|-------|--------|----------|-----------| +| #567 | ... | Nudge author or close | 3 failing checks, 21d since push | DCO + lint failing, author hasn't responded | -### Issues: backlog +### Duplicate PRs (M) -Remaining open issues, ordered by staleness (most stale first). +| # | Title | Action | Evidence | Rationale | +|---|-------|--------|----------|-----------| +| #678 / #679 | both fix #500 | Pick one, close other | Both reference #500 | #678 has tests, #679 is simpler | -| # | Title | Category | Staleness | Last updated | -|---|-------|----------|-----------|-------------| +### Stale, consider closing (M) ---- +| # | Title | Action | Evidence | Rationale | +|---|-------|--------|----------|-----------| +| #789 | ... | Close with note | 87d no activity, no assignee | No traction; linked design discussion went silent | -### PRs: action needed +### Repeatedly-failed fix attempts (M) -PRs with health flags that need maintainer attention. +(Only emit this section if any items qualify. See `_fix-policy.md` — +two-strike escalation.) -| # | Title | Author | Flags | Notes | -|---|-------|--------|-------|-------| +| Finding | Suite | Attempts | Notes | +|---------|-------|----------|-------| +| ... 
| docs-and-references | 2 closed | Detector may be flagging a false positive | -### PRs: healthy +--- -Open PRs with no flags. +
+<details> +<summary>Healthy items (M issues, M PRs)</summary> -| # | Title | Author | Linked issue | Last updated | -|---|-------|--------|-------------|-------------| +(One-line summary of each: `#N — <author> — <last update>`. No +action needed; this block is for completeness.) ---- +</details> ### Summary -**Issues:** -- N triaged, N flagged for action, N active, N backlog -- Flags: X potentially resolved, Y stale, Z duplicates +- N items flagged for action across 7 buckets +- M PRs flagged (X stuck, Y duplicate) +- K healthy items (collapsed above) ```` -**PRs:** -- N triaged, N flagged -- Flags: X no linked issue, Y checks failing, Z stale, W duplicate fixes -``` +The marker on the first line (`<!-- agentic-ci-issue-triage:1/N -->`) is +required. If the report fits in one comment, set N = 1. + +### 4. Multi-comment split + +GitHub issue comments cap at 65,536 characters. Use a 60,000-char per-part +budget to leave room for body manipulations. + +Build the parts: + +1. Render the full report. If `len(body) <= 60000`, you have one part. Use + marker `<!-- agentic-ci-issue-triage:1/1 -->`. +2. Otherwise, split on **action-bucket boundaries** (never split a table + mid-row). Each part starts with its own marker + `<!-- agentic-ci-issue-triage:i/N -->` and a heading + `### Triage Report — Part i of N`. +3. Place the summary and `Healthy items` `<details>` block at the end of + the last part. ### 5. Post the report -Find the tracking issue number from the `ISSUE_TRIAGE_TRACKING_ISSUE` -environment variable. Find the last comment by `github-actions[bot]` that -contains `<!-- agentic-ci-issue-triage -->` and note its ID. +Tracking issue number is in `ISSUE_TRIAGE_TRACKING_ISSUE`. List all +existing bot comments containing `agentic-ci-issue-triage:` (in id +order) and reconcile against the new parts: -- If a previous comment exists, **edit it in place** using - `gh api -X PATCH repos/{owner}/{repo}/issues/comments/{id}`.
-- If no previous comment exists, post a new comment using `gh issue comment`. +- For `i in 0..min(len(existing), len(parts))`: PATCH `existing[i]` + with `parts[i]` (`gh api -X PATCH .../comments/<id> -f body=...`). +- Surplus parts: post via `gh issue comment --body-file`. +- Surplus existing comments: delete via + `gh api -X DELETE .../comments/<id>`. -```bash -# Edit existing comment -gh api -X PATCH "repos/${GITHUB_REPOSITORY}/issues/comments/${COMMENT_ID}" \ - -f body="$(cat /tmp/issue-triage-report.md)" +This keeps the report a coherent set across runs whether it grows, +shrinks, or stays stable. -# Or post new comment -gh issue comment "$TRACKING_ISSUE" --body-file /tmp/issue-triage-report.md -``` +### 6. Fallback + +If you cannot find the tracking issue or the API calls fail repeatedly, +write the report parts to `/tmp/issue-triage-report-*.md` and stop. The +workflow's fallback step will post `/tmp/issue-triage-report.md` (which, +for a single-part report, is identical to part 1; the workflow falls back +only on single-part output). The workflow does not handle multi-part +fallback — make sure the report fits in one part if you cannot post it +yourself. ## Constraints -- **Read-only triage.** Do not close, label, or modify any issues or PRs. The - report is for maintainers to act on. -- **Do not post the report yourself if you cannot find the tracking issue.** - Write the report to `/tmp/issue-triage-report.md` and stop. The workflow - will handle fallback posting. -- **Stay concise.** Notes columns should be one sentence max. Link to the - relevant PR, issue, or duplicate - don't explain the fix. +- **Read-only triage.** Do not close, label, or modify any issues or PRs. + The report is for maintainers to act on. +- **Stay concise.** Rationale columns should be one sentence max. +- **No fix authority.** This recipe never opens PRs or commits code. It + reads, classifies, and posts a report. 
- **Cost awareness.** Do not read full issue/PR bodies unless needed to - determine duplicates or verify cross-references. The metadata from - `gh issue list` and `gh pr list` is enough for most checks. + determine duplicates, verify cross-references, or decide an action. The + metadata from `gh issue list` / `gh pr list` is enough for most checks. diff --git a/.agents/recipes/structure/recipe.md b/.agents/recipes/structure/recipe.md index 2dc793d54..3df885ecd 100644 --- a/.agents/recipes/structure/recipe.md +++ b/.agents/recipes/structure/recipe.md @@ -51,6 +51,10 @@ regardless of test coverage. Read `{{memory_path}}/runner-state.json` for known issues from previous runs. Update after the audit. Skip re-reporting known issues. +This recipe also maintains `fix_backlog` and `attempted_fixes` per +`_fix-policy.md`. Update `fix_backlog` for every detected finding *before* +the `known_issues` filter applies. + ## Instructions ### 1. Import boundary violations @@ -208,15 +212,35 @@ Write the report to `/tmp/audit-{{suite}}.md`: If no findings in any category, write `NO_FINDINGS` on the first line instead. +## Fix phase + +Follow the standard fix procedure in `_fix-policy.md`. Suite-specific bits: + +### Eligible categories + +| Category | Branch type | test_required | Eligibility note | +|----------|-------------|---------------|------------------| +| missing-future | `chore` | yes | Insert `from __future__ import annotations` after the SPDX header block, before other imports. Fully deterministic. Tests required because `__future__` annotations can affect introspection-heavy code paths. | +| lazy-import | `refactor` | yes | Move a top-level heavy import (pandas/numpy/polars/torch/duckdb/sqlfluff/faker) to the `data_designer.lazy_heavy_imports` accessor pattern. Eligible only when (a) file is under `packages/*/src/`, (b) the module is already wired in the lazy system, (c) the heavy module is used only inside function bodies. 
| + +**Not eligible** — stays report-only: + +- Import boundary violations (architectural judgement). +- Dead exports (audit labels them "potentially dead"; external plugin + consumers may use them). + ## Constraints -- Do not modify any files. This is a read-only audit. +- Outside the fix phase, this recipe is read-only — do not modify files. +- Within the fix phase, only modify paths in the suite's path allowlist. + See `_fix-policy.md` for the shared command/path baseline. - Imports inside `if TYPE_CHECKING:` blocks are allowed and should not be flagged for any check. - Lazy imports in `__init__.py` (via `__getattr__`) are deferred and should not be treated as violations. - Dead export detection has false positives. Mark uncertain cases as - "potentially dead" rather than definitively dead. + "potentially dead" rather than definitively dead. **Do not auto-remove + them** — they are not in the fix-eligible list. - Always cite which rule or doc is violated so maintainers can verify. - Import boundaries are currently clean. No findings in that section is normal and expected. diff --git a/.agents/recipes/test-health/recipe.md b/.agents/recipes/test-health/recipe.md index 1fe82c4a8..2224684ff 100644 --- a/.agents/recipes/test-health/recipe.md +++ b/.agents/recipes/test-health/recipe.md @@ -317,9 +317,29 @@ Write the report to `/tmp/audit-{{suite}}.md`: If no findings in any category, write `NO_FINDINGS` on the first line instead. +## Fix phase + +**This suite has no fix phase.** All categories — coverage gaps, hollow +tests, import perf regressions, smoke check failures, test isolation +violations — stay report-only. + +Rationale: the categories that look mechanical (rewriting hollow +`assert ... is not None` checks, adding missing test files, fixing +test-isolation violations) all require inferring intent or authoring new +code. 
The audit phase only commits to *flagging* these conservatively; +turning that into authored test changes is beyond the "self-evident" +bar in `_fix-policy.md`. + +A future revision may add **test-isolation violations** (config tests +importing engine, engine tests importing interface) as an eligible fix +category — those are mechanical (replace the cross-boundary import with +a test-local equivalent or fixture). Add only after `code-quality` (the +other inference-heavy suite) has its draft-PR landing rate proven and +`draft_until_proven` flipped to `false`. + ## Constraints -- Do not modify any test files. This is a read-only audit. +- This recipe is fully read-only — do not modify any files at all. - Do not run the full test suite or coverage tool. Analysis is based on file structure and static inspection, not execution. - Be conservative with hollow test detection. Only flag tests you've read diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index d04a5d839..481ac937d 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -127,6 +127,11 @@ jobs: python -m venv /tmp/graphify-venv /tmp/graphify-venv/bin/python -m pip install graphifyy==0.4.23 --quiet 2>&1 | tail -3 + - name: Configure git identity + run: | + git config user.email "agentic-ci@datadesigner" + git config user.name "Agentic CI" + - name: Pre-flight checks env: ANTHROPIC_BASE_URL: ${{ secrets.AGENTIC_CI_API_BASE_URL }} @@ -173,11 +178,13 @@ jobs: exit 1 fi - # Build prompt: _runner.md + recipe body (strip YAML frontmatter) + # Build prompt: phase directive + _runner.md + _fix-policy.md + recipe body (strip YAML frontmatter) + PHASE_DIRECTIVE=$(cat .agents/recipes/_phase-audit.md) RUNNER_CTX=$(cat .agents/recipes/_runner.md) + FIX_POLICY=$(cat .agents/recipes/_fix-policy.md) RECIPE_BODY=$(sed '1,/^---$/{ /^---$/,/^---$/d }' "${RECIPE_DIR}/recipe.md") - PROMPT=$(printf '%s\n\n%s\n' "${RUNNER_CTX}" "${RECIPE_BODY}" \ + 
PROMPT=$(printf '%s\n\n%s\n\n%s\n\n%s\n' "${PHASE_DIRECTIVE}" "${RUNNER_CTX}" "${FIX_POLICY}" "${RECIPE_BODY}" \ | sed "s|{{suite}}|${SUITE}|g" \ | sed "s|{{date}}|$(date -u +%Y-%m-%d)|g" \ | sed "s|{{memory_path}}|.agentic-ci-state|g") @@ -190,6 +197,49 @@ jobs: --verbose \ 2>&1 | tee /tmp/claude-audit-log.txt + - name: Check fix backlog + id: backlog + if: success() && matrix.suite != 'test-health' + run: | + BACKLOG_SIZE=$(jq '.fix_backlog // [] | length' .agentic-ci-state/runner-state.json 2>/dev/null || echo 0) + echo "size=${BACKLOG_SIZE}" >> "$GITHUB_OUTPUT" + echo "fix_backlog has ${BACKLOG_SIZE} entries" + + - name: Run fix recipe + id: fix + if: success() && matrix.suite != 'test-health' && fromJSON(steps.backlog.outputs.size || '0') > 0 + env: + ANTHROPIC_BASE_URL: ${{ secrets.AGENTIC_CI_API_BASE_URL }} + ANTHROPIC_API_KEY: ${{ secrets.AGENTIC_CI_API_KEY }} + AGENTIC_CI_MODEL: ${{ vars.AGENTIC_CI_MODEL }} + DISABLE_PROMPT_CACHING: "1" + GH_TOKEN: ${{ github.token }} + GITHUB_REPOSITORY: ${{ github.repository }} + SUITE: ${{ matrix.suite }} + run: | + set -o pipefail + + RECIPE_DIR=".agents/recipes/${SUITE}" + + # Build prompt: phase directive + _runner.md + _fix-policy.md + recipe body (strip YAML frontmatter) + PHASE_DIRECTIVE=$(cat .agents/recipes/_phase-fix.md) + RUNNER_CTX=$(cat .agents/recipes/_runner.md) + FIX_POLICY=$(cat .agents/recipes/_fix-policy.md) + RECIPE_BODY=$(sed '1,/^---$/{ /^---$/,/^---$/d }' "${RECIPE_DIR}/recipe.md") + + PROMPT=$(printf '%s\n\n%s\n\n%s\n\n%s\n' "${PHASE_DIRECTIVE}" "${RUNNER_CTX}" "${FIX_POLICY}" "${RECIPE_BODY}" \ + | sed "s|{{suite}}|${SUITE}|g" \ + | sed "s|{{date}}|$(date -u +%Y-%m-%d)|g" \ + | sed "s|{{memory_path}}|.agentic-ci-state|g") + + stdbuf -oL -eL claude \ + --model "$AGENTIC_CI_MODEL" \ + -p "$PROMPT" \ + --max-turns 50 \ + --output-format stream-json \ + --verbose \ + 2>&1 | tee /tmp/claude-fix-log.txt + - name: Update runner memory if: always() env: @@ -205,7 +255,9 @@ jobs: except 
(json.JSONDecodeError, FileNotFoundError) as e: print(f'::warning::runner-state.json is invalid ({e}), resetting') state = {'suite': os.environ['SUITE'], 'known_issues': [], 'baselines': {}} - # Only stamp last_run if the audit actually succeeded + # Only stamp last_run if the audit actually succeeded. + # Fix phase manages its own state via attempted_fixes; its outcome + # does not gate last_run. if os.environ.get('AUDIT_OUTCOME') == 'success': state['last_run'] = datetime.datetime.now(datetime.timezone.utc).isoformat() state['suite'] = os.environ['SUITE'] @@ -220,7 +272,9 @@ jobs: name: claude-audit-log-${{ matrix.suite }}-${{ github.run_id }}-${{ github.run_attempt }} path: | /tmp/claude-audit-log.txt + /tmp/claude-fix-log.txt /tmp/audit-${{ matrix.suite }}.md + /tmp/pr-body-${{ matrix.suite }}.md retention-days: 14 if-no-files-found: ignore @@ -228,9 +282,16 @@ jobs: if: always() env: SUITE: ${{ matrix.suite }} + AUDIT_OUTCOME: ${{ steps.audit.outcome }} + FIX_OUTCOME: ${{ steps.fix.outcome }} + BACKLOG_SIZE: ${{ steps.backlog.outputs.size }} run: | echo "## Daily Audit: ${SUITE}" >> "$GITHUB_STEP_SUMMARY" echo "" >> "$GITHUB_STEP_SUMMARY" + echo "- Audit: \`${AUDIT_OUTCOME:-unknown}\`" >> "$GITHUB_STEP_SUMMARY" + echo "- Fix backlog size: \`${BACKLOG_SIZE:-n/a}\`" >> "$GITHUB_STEP_SUMMARY" + echo "- Fix: \`${FIX_OUTCOME:-skipped}\`" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" if [ -s "/tmp/audit-${SUITE}.md" ]; then cat "/tmp/audit-${SUITE}.md" >> "$GITHUB_STEP_SUMMARY" From 91c0015d2ed2bc4c8676b8015277386f5f6390d4 Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Mon, 4 May 2026 17:13:55 +0000 Subject: [PATCH 02/12] fix(agentic-ci): address review findings before merge MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Map per-package test targets explicitly in _fix-policy.md (Makefile exposes test-config/test-engine/test-interface, not test-<package>). 
- Use github-actions[bot] noreply identity for commits the recipes produce. - Refresh fix_backlog.data when an id already exists so the fix phase cannot drive a PR from stale data after the underlying file changed. - Stop time-pruning closed/abandoned attempted_fixes entries — pruning before the two-strike threshold erases the history needed to escalate. Single-strike entries now age out only via the 200-entry cap. - Disambiguate bare-except findings within the same function by including a try-body hash in the finding id. - Audit grep for code-quality now matches both `except:` and `except BaseException:`, in parity with the fix eligibility. - Restrict transitive-gap fix eligibility to cases where a sibling package already declares the dep (avoids inventing version specifiers from scratch). - Issue-triage workflow handles multi-part reports in both the fallback post step and the job summary; recipe always writes numbered parts. --- .agents/recipes/_fix-policy.md | 35 ++++++++--- .agents/recipes/code-quality/recipe.md | 6 +- .agents/recipes/dependencies/recipe.md | 8 ++- .agents/recipes/issue-triage/recipe.md | 15 +++-- .github/workflows/agentic-ci-daily.yml | 4 +- .github/workflows/agentic-ci-issue-triage.yml | 62 +++++++++++-------- 6 files changed, 81 insertions(+), 49 deletions(-) diff --git a/.agents/recipes/_fix-policy.md b/.agents/recipes/_fix-policy.md index 963824cab..065842704 100644 --- a/.agents/recipes/_fix-policy.md +++ b/.agents/recipes/_fix-policy.md @@ -16,8 +16,15 @@ A finding may be converted to a fix only if all hold: bumps (Dependabot owns those), no schema changes, no migrations. - **Self-evident**: the audit established both the problem *and* the unique correct fix. Mechanical, not interpretive. -- **Test-safe**: when the recipe declares `test_required`, run - `make test-<package>` for the affected package and abort on failure. 
+- **Test-safe**: when the recipe declares `test_required`, run the + per-package test target for the affected package and abort on failure. + Mapping (the Makefile does not expose `test-<package>` directly): + + | Package directory | Test target | + |-------------------|-------------| + | `packages/data-designer-config` | `make test-config` | + | `packages/data-designer-engine` | `make test-engine` | + | `packages/data-designer` | `make test-interface` | - **Single concern**: one finding per PR. - **Allowlisted paths**: matches the suite's path allowlist. @@ -74,8 +81,11 @@ code-quality and unset elsewhere) controls draft-PR mode. ### `fix_backlog` rules (audit phase populates this) -- Append every detected finding in an eligible category. Update `last_seen` - if `id` already present. +- Append every detected finding in an eligible category. If `id` is already + present, **refresh both `last_seen` and `data`** with the current scan's + values. The `data` field is used by the fix phase to apply the change + without re-scanning, so stale `data` would let an old plan drive a new + PR after the underlying file moved or changed. - Drop entries with `last_seen` older than 30 days. - Cap at 200 entries (drop oldest by `first_seen`). - Populated **before** the `known_issues` filter so fixable findings persist @@ -89,9 +99,14 @@ code-quality and unset elsewhere) controls draft-PR mode. conflict, lint failed, allowlist rejected, etc.). - Reconcile against open PRs (`gh pr list`) at the start of each fix run to recover from crashes that left state un-updated. -- Prune: drop `merged` >90d, drop single `closed`/`abandoned` >30d. -- Two-strike entries (≥2 `closed`/`abandoned`) are NOT pruned; they - surface in the report under `Repeatedly-failed fix attempts`. +- Prune: drop `merged` entries older than 90 days. Do **not** prune + `closed` or `abandoned` entries by age — pruning a single-strike entry + would erase the history needed to ever reach the two-strike threshold. 
+ The 200-entry cap (with oldest-first eviction by `first_seen`) handles + long-tail cleanup. +- Two-strike entries (≥2 `closed`/`abandoned`) surface in the report + under `Repeatedly-failed fix attempts` and are filtered from selection + permanently. ## Finding hash @@ -108,9 +123,13 @@ text: | dependencies (unused) | `<package>:<dep>:unused` | | structure (missing-future) | `<source-file>:missing-future` | | structure (lazy-import) | `<source-file>:lazy-import:<imported-module>` | -| code-quality (bare-except) | `<source-file>:<enclosing-symbol>:bare-except` | +| code-quality (bare-except) | `<source-file>:<enclosing-symbol>:<try-body-hash>:bare-except` | Symbols use fully-qualified Python names. +`try-body-hash` is `sha1(<try-block body, leading/trailing whitespace +stripped, internal lines preserved>)[:8]` — needed because a function +can contain multiple `try:` blocks with bare excepts that would +otherwise collide on the same finding id. ## Ranking diff --git a/.agents/recipes/code-quality/recipe.md b/.agents/recipes/code-quality/recipe.md index 302a8f5ad..268e0adff 100644 --- a/.agents/recipes/code-quality/recipe.md +++ b/.agents/recipes/code-quality/recipe.md @@ -73,8 +73,10 @@ Check for patterns that violate the project's "errors normalize at boundaries" principle (AGENTS.md): ```bash -# Bare except clauses (should use specific exception types) -grep -rn "except:" packages/*/src/ --include='*.py' | grep -v "# noqa" +# Bare except clauses (should use specific exception types). +# Catches both `except:` and `except BaseException:` — both swallow +# everything including KeyboardInterrupt and SystemExit. 
+grep -rnE "except\s*:|except\s+BaseException\b" packages/*/src/ --include='*.py' | grep -v "# noqa" # Swallowed exceptions (except + pass/continue with no logging) grep -rn -A1 "except" packages/*/src/ --include='*.py' | grep -B1 "pass$\|continue$" diff --git a/.agents/recipes/dependencies/recipe.md index deedf742b..13580393b 100644 --- a/.agents/recipes/dependencies/recipe.md +++ b/.agents/recipes/dependencies/recipe.md @@ -166,12 +166,14 @@ Follow the standard fix procedure in `_fix-policy.md`. Suite-specific bits: | Category | Branch type | test_required | Eligibility note | |----------|-------------|---------------|------------------| -| transitive-gap | `chore` | yes | Add the imported module to `[project.dependencies]` of the package that imports it. Use the version specifier from a package that already declares it; otherwise the latest stable specifier. Insert in alphabetical order; match existing quote/specifier style. | +| transitive-gap | `chore` | yes | Add the imported module to `[project.dependencies]` of the package that imports it, copying the version specifier from a sibling package that already declares it. Insert in alphabetical order; match existing quote/specifier style. **Ineligible** when no sibling package declares the dep — choosing a specifier from scratch is interpretive, not mechanical. Those findings stay report-only and surface for maintainer judgement. | | unused | `chore` | yes | Remove the declaration. Eligible only when grep across the package's `src/`, lazy-import system, plugin entry points, and tests turns up zero references. | `fix_backlog.data` should record: for transitive-gap, the importing source -files and proposed version specifier; for unused, which other packages -also declare the dep. +files and the sibling package whose specifier was copied (the recipe +must record this *during the audit*; the fix phase rejects entries with +no sibling source).
For unused, record which other packages also declare the +dep. Before running `make test-<package>`, run `make install-dev` to confirm the lockfile resolves cleanly. `make install-dev` is the only sanctioned install command (no direct `pip install` or `uv pip install`). diff --git a/.agents/recipes/issue-triage/recipe.md index 3edc061ae..3536c4b82 100644 --- a/.agents/recipes/issue-triage/recipe.md +++ b/.agents/recipes/issue-triage/recipe.md @@ -71,9 +71,10 @@ those state files are not available in the triage run; that's fine.) ### 3. Build the report -Write the report to `/tmp/issue-triage-report.md`. Each part of a -multi-comment report is a separate file: `/tmp/issue-triage-report-1.md`, -`/tmp/issue-triage-report-2.md`, etc. +Write each part to a numbered file: `/tmp/issue-triage-report-1.md`, +`/tmp/issue-triage-report-2.md`, etc. Single-part reports use +`/tmp/issue-triage-report-1.md`. The workflow's fallback step looks for +numbered files first. Format: @@ -191,11 +192,9 @@ shrinks, or stays stable. If you cannot find the tracking issue or the API calls fail repeatedly, write the report parts to `/tmp/issue-triage-report-*.md` and stop. The -workflow's fallback step will post `/tmp/issue-triage-report.md` (which, -for a single-part report, is identical to part 1; the workflow falls back -only on single-part output). The workflow does not handle multi-part -fallback — make sure the report fits in one part if you cannot post it -yourself. +workflow's fallback step posts every numbered part in order if no +agent-authored comments containing today's date already exist on the +tracking issue.
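The numbered-part contract above (split on action-bucket boundaries under a 60,000-char budget, stamp each part with an `i/N` marker, write `/tmp/issue-triage-report-<i>.md`) can be sketched as follows. This is a minimal illustration, not the recipe's literal implementation; the `buckets` input (one pre-rendered markdown section per action bucket) is an assumption:

```python
BUDGET = 60_000  # per-part budget; GitHub's hard cap on comments is 65,536 chars

def split_report(buckets, date):
    """Pack pre-rendered bucket sections into marker-stamped parts."""
    parts = [[]]
    size = 0
    for bucket in buckets:
        # Only break between buckets, so a table is never cut mid-row.
        if parts[-1] and size + len(bucket) > BUDGET:
            parts.append([])
            size = 0
        parts[-1].append(bucket)
        size += len(bucket)
    n = len(parts)
    rendered = []
    for i, chunk in enumerate(parts, start=1):
        marker = f"<!-- agentic-ci-issue-triage:{i}/{n} -->"
        head = (f"## Repository Triage Report — {date}" if n == 1
                else f"### Triage Report — Part {i} of {n}")
        rendered.append("\n\n".join([marker, head, *chunk]))
    return rendered

def write_parts(rendered):
    # Numbered files are what the workflow's fallback step globs for.
    for i, body in enumerate(rendered, start=1):
        with open(f"/tmp/issue-triage-report-{i}.md", "w") as f:
            f.write(body)
```

Marker and heading overhead can push a part slightly past 60,000 characters; the headroom below GitHub's 65,536-char cap absorbs it.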
## Constraints diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index 481ac937d..40c57ae5c 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -129,8 +129,8 @@ jobs: - name: Configure git identity run: | - git config user.email "agentic-ci@datadesigner" - git config user.name "Agentic CI" + git config user.email "41898282+github-actions[bot]@users.noreply.github.com" + git config user.name "github-actions[bot]" - name: Pre-flight checks env: diff --git a/.github/workflows/agentic-ci-issue-triage.yml b/.github/workflows/agentic-ci-issue-triage.yml index 7a2dcdbc6..796c4b669 100644 --- a/.github/workflows/agentic-ci-issue-triage.yml +++ b/.github/workflows/agentic-ci-issue-triage.yml @@ -101,42 +101,52 @@ jobs: GH_TOKEN: ${{ github.token }} TRACKING_ISSUE: ${{ vars.ISSUE_TRIAGE_TRACKING_ISSUE }} run: | - if [ ! -s "/tmp/issue-triage-report.md" ]; then + # Collect report parts. Numbered files (multi-part) take precedence; + # fall back to the unsuffixed file for single-part runs. + shopt -s nullglob + PARTS=(/tmp/issue-triage-report-*.md) + shopt -u nullglob + if [ ${#PARTS[@]} -eq 0 ] && [ -s /tmp/issue-triage-report.md ]; then + PARTS=(/tmp/issue-triage-report.md) + fi + if [ ${#PARTS[@]} -eq 0 ]; then echo "::warning::Triage report not created by agent." exit 0 fi + mapfile -t PARTS < <(printf '%s\n' "${PARTS[@]}" | sort -V) - # Check if the agent already posted/updated the comment. - MARKER="<!-- agentic-ci-issue-triage -->" - EXISTING=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \ - --jq "[.[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${MARKER}\"))] | last | .id" \ - 2>/dev/null || echo "") - - REPORT=$(cat /tmp/issue-triage-report.md) - - # Only post if the report marker is not already in a recent comment - # with today's date (agent already posted). 
+ # Skip the fallback if the agent already posted today (any number of parts). + MARKER_PREFIX="agentic-ci-issue-triage:" TODAY=$(date -u +%Y-%m-%d) - if [ -n "$EXISTING" ] && [ "$EXISTING" != "null" ]; then - EXISTING_BODY=$(gh api "repos/${{ github.repository }}/issues/comments/${EXISTING}" --jq '.body') - if echo "$EXISTING_BODY" | grep -q "$TODAY"; then - echo "Agent already posted today's report. Skipping fallback." - exit 0 - fi - # Update existing comment. - gh api -X PATCH "repos/${{ github.repository }}/issues/comments/${EXISTING}" \ - -f body="$REPORT" - echo "Updated existing triage comment." - else - gh issue comment "$TRACKING_ISSUE" --body-file /tmp/issue-triage-report.md - echo "Posted new triage comment." + ALREADY_POSTED=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \ + --paginate \ + --jq "[.[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${MARKER_PREFIX}\")) | select(.body | contains(\"${TODAY}\"))] | length" \ + 2>/dev/null || echo 0) + if [ "${ALREADY_POSTED}" != "0" ]; then + echo "Agent already posted today's report (${ALREADY_POSTED} part(s)). Skipping fallback." + exit 0 fi + for PART in "${PARTS[@]}"; do + gh issue comment "$TRACKING_ISSUE" --body-file "$PART" + echo "Posted ${PART}" + done + - name: Write job summary if: always() run: | - if [ -s "/tmp/issue-triage-report.md" ]; then - cat /tmp/issue-triage-report.md >> "$GITHUB_STEP_SUMMARY" + shopt -s nullglob + PARTS=(/tmp/issue-triage-report-*.md) + shopt -u nullglob + if [ ${#PARTS[@]} -eq 0 ] && [ -s /tmp/issue-triage-report.md ]; then + PARTS=(/tmp/issue-triage-report.md) + fi + if [ ${#PARTS[@]} -gt 0 ]; then + mapfile -t PARTS < <(printf '%s\n' "${PARTS[@]}" | sort -V) + for PART in "${PARTS[@]}"; do + cat "${PART}" >> "$GITHUB_STEP_SUMMARY" + echo "" >> "$GITHUB_STEP_SUMMARY" + done else echo "No triage report was generated." 
>> "$GITHUB_STEP_SUMMARY" fi From 141a33b260942c6df7afbefe4aa9d5bc69900adc Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Mon, 4 May 2026 17:23:29 +0000 Subject: [PATCH 03/12] fix(agentic-ci): close residuals from review pass 2 - Replace remaining `make test-<package>` references with pointers to the mapping table; only the table itself uses that placeholder now. - Fix `gh api --paginate | jq | length` returning per-page counts: slurp with `jq -s 'add // 0'` to get a single total. - Compare posted-comment count to expected part count so a partial post (agent posted part 1 but not 2/3) triggers the fallback instead of being silently treated as success. - Add `shell: bash` to triage steps using `shopt`/`mapfile` so they're not at the mercy of the runner's default shell. - Disambiguate bare-except findings whose try-body hashes collide by adding a per-function ordinal to the canonical_key. - Tie the 200-entry attempted_fixes cap eviction to `attempts[0].at` (the schema has no `first_seen` field). --- .agents/recipes/_fix-policy.md | 22 +++++++++++-------- .agents/recipes/dependencies/recipe.md | 7 +++--- .github/workflows/agentic-ci-issue-triage.yml | 20 ++++++++++++----- 3 files changed, 32 insertions(+), 17 deletions(-) diff --git a/.agents/recipes/_fix-policy.md b/.agents/recipes/_fix-policy.md index 065842704..bece0226e 100644 --- a/.agents/recipes/_fix-policy.md +++ b/.agents/recipes/_fix-policy.md @@ -102,8 +102,9 @@ code-quality and unset elsewhere) controls draft-PR mode. - Prune: drop `merged` entries older than 90 days. Do **not** prune `closed` or `abandoned` entries by age — pruning a single-strike entry would erase the history needed to ever reach the two-strike threshold. - The 200-entry cap (with oldest-first eviction by `first_seen`) handles - long-tail cleanup. + The 200-entry cap handles long-tail cleanup; eviction order is oldest + first by the `at` date of the entry's first attempt + (`attempts[0].at`). 
- Two-strike entries (≥2 `closed`/`abandoned`) surface in the report under `Repeatedly-failed fix attempts` and are filtered from selection permanently. @@ -123,13 +124,15 @@ text: | dependencies (unused) | `<package>:<dep>:unused` | | structure (missing-future) | `<source-file>:missing-future` | | structure (lazy-import) | `<source-file>:lazy-import:<imported-module>` | -| code-quality (bare-except) | `<source-file>:<enclosing-symbol>:<try-body-hash>:bare-except` | +| code-quality (bare-except) | `<source-file>:<enclosing-symbol>:<try-body-hash>:<ordinal>:bare-except` | Symbols use fully-qualified Python names. `try-body-hash` is `sha1(<try-block body, leading/trailing whitespace -stripped, internal lines preserved>)[:8]` — needed because a function -can contain multiple `try:` blocks with bare excepts that would -otherwise collide on the same finding id. +stripped, internal lines preserved>)[:8]`. +`ordinal` is the 1-based position of this bare-except among bare-excepts +in the same enclosing symbol, in source order. Both are needed: the body +hash distinguishes most cases, and the ordinal disambiguates the rare +case of two bare-except blocks with byte-identical try bodies. ## Ranking @@ -181,9 +184,10 @@ declare only the parts that vary (eligible categories, branch type, remove from `fix_backlog` and continue. 2. Apply the fix. If the diff exceeds the localized-fix bar or touches a non-allowlisted path, abandon and continue. - 3. If the category sets `test_required: true`, run - `make test-<package>` for the package containing the change. On - failure: abandon and continue. + 3. If the category sets `test_required: true`, run the per-package + test target (see the mapping table in "Localized fix bar" above) + for the package containing the change. On failure: abandon and + continue. 4. Branch: `agentic-ci/<type>/<suite>-YYYYMMDD-<short-slug>`. Commit: `<type>(agentic-ci): <one-line>`. Push. 5. 
Write the PR body to `/tmp/pr-body-{{suite}}.md`, including the diff --git a/.agents/recipes/dependencies/recipe.md b/.agents/recipes/dependencies/recipe.md index 13580393b..bc72ee805 100644 --- a/.agents/recipes/dependencies/recipe.md +++ b/.agents/recipes/dependencies/recipe.md @@ -175,9 +175,10 @@ must record this *during the audit*; the fix phase rejects entries with no sibling source). For unused, which other packages also declare the dep. -Before running `make test-<package>`, run `make install-dev` to confirm -the lockfile resolves cleanly. `make install-dev` is the only sanctioned -install command (no direct `pip install` or `uv pip install`). +Before running the per-package test target (see `_fix-policy.md` for the +mapping), run `make install-dev` to confirm the lockfile resolves cleanly. +`make install-dev` is the only sanctioned install command (no direct +`pip install` or `uv pip install`). **Not eligible** — stays report-only: diff --git a/.github/workflows/agentic-ci-issue-triage.yml b/.github/workflows/agentic-ci-issue-triage.yml index 796c4b669..841410d11 100644 --- a/.github/workflows/agentic-ci-issue-triage.yml +++ b/.github/workflows/agentic-ci-issue-triage.yml @@ -97,6 +97,7 @@ jobs: continue-on-error: true - name: Fallback post if agent did not post + shell: bash env: GH_TOKEN: ${{ github.token }} TRACKING_ISSUE: ${{ vars.ISSUE_TRIAGE_TRACKING_ISSUE }} @@ -114,19 +115,27 @@ jobs: exit 0 fi mapfile -t PARTS < <(printf '%s\n' "${PARTS[@]}" | sort -V) + EXPECTED_PARTS=${#PARTS[@]} - # Skip the fallback if the agent already posted today (any number of parts). + # Count how many bot comments today already match the marker. + # `gh api --paginate` emits one JSON document per page, so we slurp + # them with `jq -s 'add'` to get a single count. 
MARKER_PREFIX="agentic-ci-issue-triage:" TODAY=$(date -u +%Y-%m-%d) - ALREADY_POSTED=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \ + POSTED_COUNT=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \ --paginate \ --jq "[.[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${MARKER_PREFIX}\")) | select(.body | contains(\"${TODAY}\"))] | length" \ - 2>/dev/null || echo 0) - if [ "${ALREADY_POSTED}" != "0" ]; then - echo "Agent already posted today's report (${ALREADY_POSTED} part(s)). Skipping fallback." + 2>/dev/null | jq -s 'add // 0') + + if [ "${POSTED_COUNT}" -ge "${EXPECTED_PARTS}" ]; then + echo "Agent already posted today's report (${POSTED_COUNT}/${EXPECTED_PARTS} part(s)). Skipping fallback." exit 0 fi + if [ "${POSTED_COUNT}" -gt 0 ]; then + echo "::warning::Agent posted ${POSTED_COUNT} of ${EXPECTED_PARTS} expected parts. Posting all parts now to ensure coverage; this may produce duplicate comments." + fi + for PART in "${PARTS[@]}"; do gh issue comment "$TRACKING_ISSUE" --body-file "$PART" echo "Posted ${PART}" @@ -134,6 +143,7 @@ jobs: - name: Write job summary if: always() + shell: bash run: | shopt -s nullglob PARTS=(/tmp/issue-triage-report-*.md) From 15cefc2707a685af5b96bb4eeba342ff6209562a Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Mon, 4 May 2026 17:28:18 +0000 Subject: [PATCH 04/12] fix(agentic-ci): identity-based partial-post detection in triage fallback Replace the count-only POSTED_COUNT >= EXPECTED_PARTS check with an identity-based check that extracts every i/N marker seen in today-dated bot comments and verifies each expected i is present. A duplicate post of one part can no longer mask a missing other. 
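The identity-based coverage check can be sketched as follows — a minimal Python model of the jq/grep pipeline, not the workflow's actual code; `missing_parts` and its arguments are hypothetical names for illustration:

```python
import re

def missing_parts(comment_bodies, expected_parts, today):
    # Collect every i seen in an "agentic-ci-issue-triage:i/N" marker
    # inside a today-dated bot comment, then report the gaps. A duplicate
    # post of one part cannot mask a missing other, because coverage is
    # checked per identity, not by count.
    pattern = re.compile(r"agentic-ci-issue-triage:(\d+)/%d\b" % expected_parts)
    seen = set()
    for body in comment_bodies:
        if today in body:
            seen.update(int(m) for m in pattern.findall(body))
    return [i for i in range(1, expected_parts + 1) if i not in seen]
```

With two copies of part 1 and no part 2, the count-based check would have skipped the fallback; the identity-based check still reports part 2 as missing.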
--- .github/workflows/agentic-ci-issue-triage.yml | 30 ++++++++++++------- 1 file changed, 19 insertions(+), 11 deletions(-) diff --git a/.github/workflows/agentic-ci-issue-triage.yml b/.github/workflows/agentic-ci-issue-triage.yml index 841410d11..9a83787a9 100644 --- a/.github/workflows/agentic-ci-issue-triage.yml +++ b/.github/workflows/agentic-ci-issue-triage.yml @@ -117,23 +117,31 @@ jobs: mapfile -t PARTS < <(printf '%s\n' "${PARTS[@]}" | sort -V) EXPECTED_PARTS=${#PARTS[@]} - # Count how many bot comments today already match the marker. - # `gh api --paginate` emits one JSON document per page, so we slurp - # them with `jq -s 'add'` to get a single count. - MARKER_PREFIX="agentic-ci-issue-triage:" + # Skip fallback only if every expected part identity (i/N) appears + # in a today-dated bot comment. Identity-based, not count-based: + # a duplicate post of one part should not mask a missing other. TODAY=$(date -u +%Y-%m-%d) - POSTED_COUNT=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \ + SEEN_PARTS=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \ --paginate \ - --jq "[.[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${MARKER_PREFIX}\")) | select(.body | contains(\"${TODAY}\"))] | length" \ - 2>/dev/null | jq -s 'add // 0') + --jq ".[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${TODAY}\")) | .body | capture(\"agentic-ci-issue-triage:(?<i>[0-9]+)/${EXPECTED_PARTS}\") | .i" \ + 2>/dev/null | sort -u) + + ALL_COVERED=1 + MISSING=() + for i in $(seq 1 "${EXPECTED_PARTS}"); do + if ! echo "${SEEN_PARTS}" | grep -qx "${i}"; then + ALL_COVERED=0 + MISSING+=("${i}") + fi + done - if [ "${POSTED_COUNT}" -ge "${EXPECTED_PARTS}" ]; then - echo "Agent already posted today's report (${POSTED_COUNT}/${EXPECTED_PARTS} part(s)). Skipping fallback." 
+ if [ "${ALL_COVERED}" -eq 1 ]; then + echo "All ${EXPECTED_PARTS} parts already posted today (identities: $(echo "${SEEN_PARTS}" | tr '\n' ',' | sed 's/,$//')). Skipping fallback." exit 0 fi - if [ "${POSTED_COUNT}" -gt 0 ]; then - echo "::warning::Agent posted ${POSTED_COUNT} of ${EXPECTED_PARTS} expected parts. Posting all parts now to ensure coverage; this may produce duplicate comments." + if [ -n "${SEEN_PARTS}" ]; then + echo "::warning::Missing parts ${MISSING[*]} of ${EXPECTED_PARTS}; posting all parts. Duplicates may result for parts that were already posted." fi for PART in "${PARTS[@]}"; do From 8786b8de22884b5e5e098b59a978bcd632949225 Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Mon, 4 May 2026 17:30:41 +0000 Subject: [PATCH 05/12] fix(agentic-ci): close remaining bot-review findings - Exempt two-strike attempted_fixes entries from the 200-entry cap eviction. Cap now evicts non-two-strike oldest-first by attempts[0].at; two-strike entries are silently-forgotten only in the pathological all-200-are-two-strike case (itself a signal). - Specify the attempted_fixes PR-marker reconciliation algorithm: scan open PR bodies for the `<!-- agentic-ci finding=<id> -->` marker and back-fill missing entries. - Tighten the daily workflow conditionals to gate on explicit step outcomes (steps.audit.outcome == 'success' rather than success()) so a future pre-audit gate cannot accidentally trip the fix step. --- .agents/recipes/_fix-policy.md | 23 ++++++++++++++++------- .github/workflows/agentic-ci-daily.yml | 4 ++-- 2 files changed, 18 insertions(+), 9 deletions(-) diff --git a/.agents/recipes/_fix-policy.md b/.agents/recipes/_fix-policy.md index bece0226e..7dbaf7dee 100644 --- a/.agents/recipes/_fix-policy.md +++ b/.agents/recipes/_fix-policy.md @@ -97,16 +97,25 @@ code-quality and unset elsewhere) controls draft-PR mode. - `abandoned` means the recipe could not produce a PR (tests failed, conflict, lint failed, allowlist rejected, etc.). 
-- Reconcile against open PRs (`gh pr list`) at the start of each fix run - to recover from crashes that left state un-updated. +- Reconcile against open PRs (`gh pr list`) at the start of each fix + run to recover from crashes that left state un-updated. The + reconciliation algorithm: list open PRs whose bodies contain the + `<!-- agentic-ci finding=<id> suite=<suite> -->` marker, parse out + each `<id>`, and back-fill any missing `attempted_fixes` entries with + `outcome: "open"` and the parsed `pr_number` and `branch`. - Prune: drop `merged` entries older than 90 days. Do **not** prune `closed` or `abandoned` entries by age — pruning a single-strike entry would erase the history needed to ever reach the two-strike threshold. - The 200-entry cap handles long-tail cleanup; eviction order is oldest - first by the `at` date of the entry's first attempt - (`attempts[0].at`). -- Two-strike entries (≥2 `closed`/`abandoned`) surface in the report - under `Repeatedly-failed fix attempts` and are filtered from selection +- The 200-entry cap handles long-tail cleanup. Eviction order: + non-two-strike entries first, oldest-first by `attempts[0].at`. + Two-strike entries (≥2 `closed`/`abandoned`) are exempt from cap + eviction unless every other entry has already been evicted — they + represent maintainer-action signals and must not be silently + forgotten. If two-strike entries alone exceed 200, that's itself a + signal worth surfacing; in that pathological case, evict oldest-first + by `attempts[0].at`. +- Two-strike entries surface in the report under + `Repeatedly-failed fix attempts` and are filtered from selection permanently. 
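The eviction order can be sketched as a sort key — a hedged model of the policy above, not the runner's actual implementation; the entry shape (`attempts`, `at`, `outcome`) mirrors the schema described here, and `cap_evictions` is a hypothetical helper:

```python
def cap_evictions(entries, cap=200):
    # Entries to evict when the attempted_fixes list exceeds the cap.
    def strikes(entry):
        return sum(1 for a in entry["attempts"]
                   if a["outcome"] in ("closed", "abandoned"))
    # Two-strike entries sort last, so they are reached only in the
    # pathological all-two-strike overflow case; within each tier,
    # oldest first by the first attempt's date.
    order = sorted(entries,
                   key=lambda e: (strikes(e) >= 2, e["attempts"][0]["at"]))
    return order[: max(0, len(entries) - cap)]
```

With a cap of 2 and one two-strike entry, the oldest non-two-strike entry is evicted first even though the two-strike entry is older than some survivors.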
## Finding hash diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index 40c57ae5c..e5ca1e067 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -199,7 +199,7 @@ jobs: - name: Check fix backlog id: backlog - if: success() && matrix.suite != 'test-health' + if: steps.audit.outcome == 'success' && matrix.suite != 'test-health' run: | BACKLOG_SIZE=$(jq '.fix_backlog // [] | length' .agentic-ci-state/runner-state.json 2>/dev/null || echo 0) echo "size=${BACKLOG_SIZE}" >> "$GITHUB_OUTPUT" @@ -207,7 +207,7 @@ jobs: - name: Run fix recipe id: fix - if: success() && matrix.suite != 'test-health' && fromJSON(steps.backlog.outputs.size || '0') > 0 + if: steps.audit.outcome == 'success' && steps.backlog.outcome == 'success' && matrix.suite != 'test-health' && fromJSON(steps.backlog.outputs.size || '0') > 0 env: ANTHROPIC_BASE_URL: ${{ secrets.AGENTIC_CI_API_BASE_URL }} ANTHROPIC_API_KEY: ${{ secrets.AGENTIC_CI_API_KEY }} From e27792b78872b1875ea025fba11f9929ee632548 Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Mon, 4 May 2026 19:36:35 +0000 Subject: [PATCH 06/12] fix(agentic-ci): close Greptile pass-2 findings (timeout, re-verify wording) - Bump daily-suite job timeout from 20 to 40 minutes. The split into two sequential `claude --max-turns 50` invocations can saturate a 20-minute budget; a mid-fix SIGTERM would leave an orphaned branch and inconsistent runner-state. - Disambiguate the `_phase-fix.md` "do NOT re-scan" rule. It forbids rebuilding fix_backlog from scratch but does NOT override the per-candidate re-verification step required by _fix-policy.md step 4.1 (re-grep / re-read the specific file the candidate points at). Single-candidate re-verification is required; whole-codebase re-scanning is forbidden. 
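The distinction drawn above — re-verify the one candidate, never re-scan — can be sketched as a check scoped to a single file; this is an illustrative stand-in, and the `file`/`pattern` fields are assumed per-category lookup data, not part of the real runner-state schema:

```python
import re
from pathlib import Path

def still_applies(candidate, repo_root="."):
    # Re-read only the file this candidate points at; never walk the
    # tree or rebuild fix_backlog. If the finding is gone, the caller
    # abandons this candidate and moves to the next.
    path = Path(repo_root) / candidate["file"]
    if not path.is_file():
        return False
    return re.search(candidate["pattern"], path.read_text()) is not None
```

A bare-except candidate, for example, passes while the `except:` is still present and fails once an intervening commit has narrowed the handler.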
--- .agents/recipes/_phase-fix.md | 10 ++++++++-- .github/workflows/agentic-ci-daily.yml | 2 +- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/.agents/recipes/_phase-fix.md b/.agents/recipes/_phase-fix.md index 48bf77a09..44449dd8e 100644 --- a/.agents/recipes/_phase-fix.md +++ b/.agents/recipes/_phase-fix.md @@ -6,8 +6,14 @@ This invocation runs the **FIX** phase only. report is at `/tmp/audit-{{suite}}.md` and `{{memory_path}}/runner-state.json` has the populated `fix_backlog`. - Execute only the recipe's "Fix phase" section per `_fix-policy.md`. - Do NOT redo audit work; do NOT re-scan the codebase to rebuild - findings. + Do NOT redo audit work — that is, do NOT re-scan whole packages or + rebuild `fix_backlog` from scratch. The "no re-scan" rule does NOT + override the per-candidate re-verification step required by + `_fix-policy.md` §"Standard fix procedure" step 4.1: when you pick a + candidate, you MUST re-grep / re-read the specific file or symbol it + points at to confirm the finding still applies before editing. + Re-verification of a single candidate is required; re-scanning the + codebase to discover new findings is forbidden. - Pick the highest-ranked eligible candidate from `fix_backlog`, apply the fix, run the package's tests if applicable, commit, push, and open the PR using `gh pr create --body-file`. 
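The "abandon and continue" candidate walk reduces to a short loop — a sketch under stated assumptions: lower `rank` means higher priority, and `try_fix` is a hypothetical callable standing in for the verify → edit → test → push → PR sequence:

```python
def run_fix_phase(fix_backlog, try_fix):
    # Highest-ranked candidate first; a failed attempt is abandoned and
    # control falls through to the next candidate, so the run produces
    # at most one PR.
    for candidate in sorted(fix_backlog, key=lambda c: c["rank"]):
        if try_fix(candidate):
            return candidate  # exactly one fix landed this run
    return None  # nothing eligible survived verification; no-op run
```

This keeps the single-concern invariant (one finding per PR) structural: success short-circuits the loop rather than relying on the agent to stop.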
diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index e5ca1e067..7bf3fe5e5 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -62,7 +62,7 @@ jobs: needs: determine-suite if: needs.determine-suite.outputs.suites != '[]' runs-on: [self-hosted, agentic-ci] - timeout-minutes: 20 + timeout-minutes: 40 strategy: fail-fast: false matrix: From b3e07f3dca9d1e97ac1c2dc93bbe988e17b18957 Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Mon, 4 May 2026 20:00:29 +0000 Subject: [PATCH 07/12] fix(agentic-ci): close Greptile pass-3 P1s in triage fallback - Guard `jq capture()` with a `test()` select. `capture()` errors on non-match instead of returning empty, which would truncate SEEN_PARTS if any unrelated today-dated bot comment lacks the triage marker (e.g. from a sibling workflow). Adding the test() guard ensures capture() only runs on bodies that already match. - Iterate the MISSING[] array when posting fallback parts, not the full PARTS[] array. Posting all parts when only some were missing was creating duplicate comments for the parts the agent already successfully posted. --- .github/workflows/agentic-ci-issue-triage.yml | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/.github/workflows/agentic-ci-issue-triage.yml b/.github/workflows/agentic-ci-issue-triage.yml index 9a83787a9..080ace5b7 100644 --- a/.github/workflows/agentic-ci-issue-triage.yml +++ b/.github/workflows/agentic-ci-issue-triage.yml @@ -120,33 +120,36 @@ jobs: # Skip fallback only if every expected part identity (i/N) appears # in a today-dated bot comment. Identity-based, not count-based: # a duplicate post of one part should not mask a missing other. 
+ # The pre-`capture()` `test()` guard is required: jq `capture()` + # raises an error on non-matching input, which would truncate the + # stream if any unrelated today-dated bot comment lacks the + # triage marker (e.g. from another automated workflow). TODAY=$(date -u +%Y-%m-%d) SEEN_PARTS=$(gh api "repos/${{ github.repository }}/issues/${TRACKING_ISSUE}/comments" \ --paginate \ - --jq ".[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${TODAY}\")) | .body | capture(\"agentic-ci-issue-triage:(?<i>[0-9]+)/${EXPECTED_PARTS}\") | .i" \ + --jq ".[] | select(.user.login == \"github-actions[bot]\") | select(.body | contains(\"${TODAY}\")) | select(.body | test(\"agentic-ci-issue-triage:[0-9]+/${EXPECTED_PARTS}\")) | .body | capture(\"agentic-ci-issue-triage:(?<i>[0-9]+)/${EXPECTED_PARTS}\") | .i" \ 2>/dev/null | sort -u) - ALL_COVERED=1 MISSING=() for i in $(seq 1 "${EXPECTED_PARTS}"); do if ! echo "${SEEN_PARTS}" | grep -qx "${i}"; then - ALL_COVERED=0 MISSING+=("${i}") fi done - if [ "${ALL_COVERED}" -eq 1 ]; then + if [ ${#MISSING[@]} -eq 0 ]; then echo "All ${EXPECTED_PARTS} parts already posted today (identities: $(echo "${SEEN_PARTS}" | tr '\n' ',' | sed 's/,$//')). Skipping fallback." exit 0 fi if [ -n "${SEEN_PARTS}" ]; then - echo "::warning::Missing parts ${MISSING[*]} of ${EXPECTED_PARTS}; posting all parts. Duplicates may result for parts that were already posted." + echo "::warning::Posting only missing parts ${MISSING[*]} of ${EXPECTED_PARTS}; the agent already posted parts $(echo "${SEEN_PARTS}" | tr '\n' ',' | sed 's/,$//')." 
fi - for PART in "${PARTS[@]}"; do + for i in "${MISSING[@]}"; do + PART="${PARTS[$((i-1))]}" gh issue comment "$TRACKING_ISSUE" --body-file "$PART" - echo "Posted ${PART}" + echo "Posted part ${i}: ${PART}" done - name: Write job summary From 23829fb0412b4d40e090e16a271772a9712a025c Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Thu, 7 May 2026 23:40:00 +0000 Subject: [PATCH 08/12] fix(agentic-ci): close johnnygreco review-pass warnings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Address the five Warnings from the 2026-05-07 review focused on the trust boundary for autonomous PR generation. Five workflow/policy adjustments shrink the surface where agent compliance is load-bearing: - Workflow-level scope gate. After the fix step, re-derive the diff against `origin/main` and validate against the per-suite path allowlist (regex mirrored from `_fix-policy.md`), the 50-LOC cap, and the 3-file cap. On violation, close the PR with `--delete-branch` and flip the `attempted_fixes` entry from `open` to `abandoned` so two-strike logic still sees the failure. The recipe alone could not bind the agent's path choices; the workflow now does. - Dependencies install-dev verification. For the dependencies suite only, re-run `make install-dev` after the scope gate so the agent's pyproject edit is exercised against the lockfile resolver. Closes the PR if `install-dev` fails — catches the failure mode where the per-package test target passed against the old cached lockfile. - Flip matrix-job `cancel-in-progress` from true to false. A cancellation between the agent's git push and `gh pr create` would leave an orphaned branch with no `attempted_fixes` record; reconciliation only covers PRs that were opened. Queueing a duplicate run is the lesser evil. `_fix-policy.md` Atomicity section now documents the trade-off. 
- Allow `/tmp/audit-{{suite}}.md` in `_phase-audit.md`'s "do not modify outside `{{memory_path}}/`" directive. A literal-minded agent could refuse to write the report file, which would break the job summary, artifact upload, and the fix phase's audit context. - Always upload the agent log artifact (was `if: failure()` only) and include `runner-state.json`. For autonomous mode, the most interesting failure is "the workflow succeeded but the PR was wrong"; the stream-json log is the only way to look back days later. Also takes johnnygreco's Suggestion 2: spell out in the policy doc that the `draft_until_proven` flip is the sole human-gated promotion step in the fix policy and must not be automated. Greptile and the github-actions auto-reviewer's findings were already closed in the prior pass-2/pass-3 commits; no action needed on those. --- .agents/recipes/_fix-policy.md | 23 +++- .agents/recipes/_phase-audit.md | 3 +- .github/workflows/agentic-ci-daily.yml | 173 ++++++++++++++++++++++++- 3 files changed, 195 insertions(+), 4 deletions(-) diff --git a/.agents/recipes/_fix-policy.md b/.agents/recipes/_fix-policy.md index 7dbaf7dee..210fd5435 100644 --- a/.agents/recipes/_fix-policy.md +++ b/.agents/recipes/_fix-policy.md @@ -222,7 +222,9 @@ to the next candidate. - **Labels**: `agentic-ci`, `agentic-ci/<suite>`. - **Draft PRs**: `code-quality` opens draft until a maintainer flips `draft_until_proven` to `false` in runner-state, after at least two - non-draft PRs from that suite have landed clean. + non-draft PRs from that suite have landed clean. This flip is + intentionally manual — it is the sole human-gated promotion step in + the fix policy and must not be automated. ## Atomicity @@ -234,3 +236,22 @@ Each fix-phase invocation produces exactly one of: No half-states. The runner state is the source of truth for what the recipe has tried; never silently drop a failed attempt. 
+ +The matrix-level concurrency for the daily workflow uses +`cancel-in-progress: false` so a fix in flight cannot be cancelled +between push and PR open. The trade-off is a queued duplicate run if a +manual dispatch arrives while cron is still going; that's preferable to +orphaned branches with no `attempted_fixes` record. + +## Workflow-level scope gate + +The agent's compliance with the path allowlists and the localized-fix +bar is load-bearing for autonomous PR generation, but the recipe alone +cannot enforce them. The daily workflow runs a post-fix scope gate that +re-derives the per-suite allowlist (mirrored from the table above) and +the diff stats from the pushed branch, then closes the PR and deletes +the remote branch on violation. The gate also flips the +`attempted_fixes` entry from `open` to `abandoned` so two-strike logic +sees the failure. Keep the workflow's allowlist regexes in sync with the +table above; the workflow is the enforcement, the table is the +specification. diff --git a/.agents/recipes/_phase-audit.md b/.agents/recipes/_phase-audit.md index 5caefa0c8..c2cef775c 100644 --- a/.agents/recipes/_phase-audit.md +++ b/.agents/recipes/_phase-audit.md @@ -9,7 +9,8 @@ This invocation runs the **AUDIT** phase only. applying the `known_issues` filter to the report, so fixable findings persist across runs even when their report row is suppressed). - Do NOT attempt any fix. Do NOT create any branches, commits, or PRs. -- Do NOT modify any files outside `{{memory_path}}/`. +- Do NOT modify any files outside `{{memory_path}}/` and the report file + `/tmp/audit-{{suite}}.md` itself. - A separate invocation will run the FIX phase if `fix_backlog` has eligible candidates and the suite has a fix phase. 
- Read the recipe in full for context; the "Fix phase" section informs diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index 7bf3fe5e5..2a76f4c7e 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -69,8 +69,12 @@ jobs: suite: ${{ fromJSON(needs.determine-suite.outputs.suites) }} concurrency: + # cancel-in-progress is intentionally false: a cancellation between + # the agent's git push and gh pr create would leave an orphaned + # branch with no attempted_fixes record. Queueing a duplicate run is + # the lesser evil. See _fix-policy.md "Atomicity". group: agentic-ci-daily-${{ matrix.suite }} - cancel-in-progress: true + cancel-in-progress: false steps: - name: Check required config @@ -240,6 +244,166 @@ jobs: --verbose \ 2>&1 | tee /tmp/claude-fix-log.txt + - name: Validate fix scope (allowlist + LOC + file cap) + # Workflow-level enforcement of the localized-fix bar from + # _fix-policy.md. Recipe instructions alone cannot bind the agent; + # this gate re-derives the diff and closes the PR if the agent + # escaped the allowlist or the LOC/file caps. + # Note: the docstring-only constraint on docs-and-references .py + # paths remains policy-level; AST-based enforcement is a follow-up. + id: scope_gate + if: steps.fix.outcome == 'success' + env: + SUITE: ${{ matrix.suite }} + GH_TOKEN: ${{ github.token }} + run: | + set -o pipefail + + OPEN=$(jq -c ' + .attempted_fixes // [] + | map(select((.attempts | last | .outcome) == "open")) + | last // empty + ' .agentic-ci-state/runner-state.json) + if [ -z "$OPEN" ] || [ "$OPEN" = "null" ]; then + echo "No open attempted_fix recorded; nothing to validate." 
+ exit 0 + fi + + BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') + PR_NUMBER=$(echo "$OPEN" | jq -r '.attempts | last | .pr_number // empty') + FINDING_ID=$(echo "$OPEN" | jq -r '.id') + if [ -z "$BRANCH" ]; then + echo "::warning::Open attempt has no branch; skipping gate." + exit 0 + fi + + case "$SUITE" in + docs-and-references) + ALLOW='^(architecture/|docs/|README\.md$|CONTRIBUTING\.md$|DEVELOPMENT\.md$|STYLEGUIDE\.md$|packages/[^/]+/src/.*\.py$)' + ;; + dependencies) + ALLOW='^packages/[^/]+/pyproject\.toml$' + ;; + structure|code-quality) + ALLOW='^packages/[^/]+/src/.*\.py$' + ;; + *) + echo "::error::No allowlist defined for suite: $SUITE" + exit 1 + ;; + esac + + FILES=$(git diff --name-only origin/main...HEAD) + BAD=$(printf '%s\n' "$FILES" | grep -vE "$ALLOW" | grep -v '^$' || true) + FILE_COUNT=$(printf '%s\n' "$FILES" | grep -c '.' || echo 0) + LOC_DELTA=$(git diff --shortstat origin/main...HEAD \ + | awk '{ a=0; d=0; for (i=1;i<=NF;i++) { if ($i ~ /insertion/) a=$(i-1); if ($i ~ /deletion/) d=$(i-1) } print a+d }') + : "${LOC_DELTA:=0}" + + REASONS="" + if [ -n "$BAD" ]; then + REASONS="${REASONS}- files outside allowlist:\n$(echo "$BAD" | sed 's/^/ - /')\n" + fi + if [ "$FILE_COUNT" -gt 3 ]; then + REASONS="${REASONS}- file count ($FILE_COUNT) exceeds 3-file cap\n" + fi + if [ "$LOC_DELTA" -gt 50 ]; then + REASONS="${REASONS}- LOC delta ($LOC_DELTA) exceeds 50-line cap\n" + fi + + if [ -z "$REASONS" ]; then + echo "Scope gate passed: ${FILE_COUNT} file(s), ${LOC_DELTA} LOC, all within allowlist." + exit 0 + fi + + echo "::error::Scope gate violation" + printf '%b' "$REASONS" + + if [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + MSG=$(printf 'Closed by workflow scope gate. The pushed diff violated the localized-fix bar (see `.agents/recipes/_fix-policy.md`):\n\n%b\nThe `attempted_fixes` entry has been flipped to `abandoned`.' 
"$REASONS") + gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ + echo "::warning::gh pr close failed; branch may need manual cleanup" + else + git push origin --delete "$BRANCH" || \ + echo "::warning::Could not delete remote branch $BRANCH" + fi + + FINDING_ID="$FINDING_ID" python3 -c " + import json, os + finding_id = os.environ['FINDING_ID'] + with open('.agentic-ci-state/runner-state.json') as f: + state = json.load(f) + for entry in state.get('attempted_fixes', []): + if entry.get('id') != finding_id: + continue + attempts = entry.get('attempts') or [] + if attempts and attempts[-1].get('outcome') == 'open': + attempts[-1]['outcome'] = 'abandoned' + attempts[-1]['gate_violation'] = True + with open('.agentic-ci-state/runner-state.json', 'w') as f: + json.dump(state, f, indent=2) + " + + exit 1 + + - name: Verify dependencies lockfile + # Dependencies suite only: re-run make install-dev against the + # agent's pyproject.toml changes. This catches the failure mode + # where the per-package test target passed against the *old* + # lockfile but the proposed dep does not actually resolve. + if: matrix.suite == 'dependencies' && steps.fix.outcome == 'success' && steps.scope_gate.outcome == 'success' + env: + GH_TOKEN: ${{ github.token }} + run: | + set -o pipefail + + OPEN=$(jq -c ' + .attempted_fixes // [] + | map(select((.attempts | last | .outcome) == "open")) + | last // empty + ' .agentic-ci-state/runner-state.json) + if [ -z "$OPEN" ] || [ "$OPEN" = "null" ]; then + echo "No open attempted_fix; skipping lockfile verification." + exit 0 + fi + + PR_NUMBER=$(echo "$OPEN" | jq -r '.attempts | last | .pr_number // empty') + BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') + FINDING_ID=$(echo "$OPEN" | jq -r '.id') + + if make install-dev 2>&1 | tee /tmp/install-dev-verify.log; then + echo "Lockfile resolves cleanly against the agent's pyproject changes." 
+ exit 0 + fi + + echo "::error::make install-dev failed against the agent's pyproject changes" + + if [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + MSG="Closed by workflow lockfile verification. \`make install-dev\` failed against the agent's \`pyproject.toml\` changes — the dependency edit does not resolve cleanly. See \`/tmp/install-dev-verify.log\` in the workflow artifact." + gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ + echo "::warning::gh pr close failed" + else + git push origin --delete "$BRANCH" || true + fi + + FINDING_ID="$FINDING_ID" python3 -c " + import json, os + finding_id = os.environ['FINDING_ID'] + with open('.agentic-ci-state/runner-state.json') as f: + state = json.load(f) + for entry in state.get('attempted_fixes', []): + if entry.get('id') != finding_id: + continue + attempts = entry.get('attempts') or [] + if attempts and attempts[-1].get('outcome') == 'open': + attempts[-1]['outcome'] = 'abandoned' + attempts[-1]['lockfile_verification_failed'] = True + with open('.agentic-ci-state/runner-state.json', 'w') as f: + json.dump(state, f, indent=2) + " + + exit 1 + - name: Update runner memory if: always() env: @@ -266,7 +430,10 @@ jobs: " - name: Upload agent log - if: failure() + # Always upload: for autonomous PR generation, the most interesting + # failure mode is "the workflow succeeded but the PR was wrong". + # The full event stream is the only way to look back days later. 
+ if: always() uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1 with: name: claude-audit-log-${{ matrix.suite }}-${{ github.run_id }}-${{ github.run_attempt }} @@ -275,6 +442,8 @@ jobs: /tmp/claude-fix-log.txt /tmp/audit-${{ matrix.suite }}.md /tmp/pr-body-${{ matrix.suite }}.md + /tmp/install-dev-verify.log + .agentic-ci-state/runner-state.json retention-days: 14 if-no-files-found: ignore From 872d56170d0e7443c91709c40f74ebf6f183943d Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Thu, 7 May 2026 23:51:41 +0000 Subject: [PATCH 09/12] fix(agentic-ci): close Codex review-pass-2 findings on workflow gates Codex flagged five issues in the prior commit's scope/lockfile gates. This commit closes all five: - HIGH: Wrong-PR targeting. Both gates selected the last globally-open attempted_fixes entry, which could match a stale orphan from a prior crashed run rather than the PR opened by *this* run. Adds a pre-fix snapshot step that captures `(id, attempts-length)` pairs before the fix runs, and changes the post-fix selectors to require that the entry's attempts count grew during this run. - HIGH: Docstring-only enforcement gap on the docs-and-references suite. The .py path allowlist was at workflow level but the docstring-only caveat was still policy-only. Adds an AST-based check: for each .py file changed, parse the post-change tree, collect docstring line ranges (module/class/function), then verify every added line in the diff is either inside a docstring, a comment, or whitespace. Verified locally with both pass and fail fixtures. - MEDIUM: Diff-ref mismatch. Gates diffed `origin/main...HEAD` rather than `origin/main...origin/$BRANCH`, so a misbehaving agent that left HEAD pointing elsewhere would have validated the wrong tree. Now fetches `origin/$BRANCH` first and prefers that ref. Falls back to HEAD only if fetch fails (with a warning). - MEDIUM: FILE_COUNT bug. `grep -c '.' 
|| echo 0` produced "0\n0" on empty diff, breaking the downstream integer comparison. Replaces with `mapfile -t FILE_ARR` + `${#FILE_ARR[@]}`, which is correct for any input including empty. - LOW: Non-atomic JSON writes. The runner-state mutations could leave the file half-written if the workflow was cancelled mid-write. Switches both gates to the temp-file + os.replace pattern. Also: dependencies-lockfile gate now does an explicit `git checkout --detach origin/$BRANCH` before re-running install-dev, so verification runs against what was actually pushed rather than relying on local working-tree state. --- .github/workflows/agentic-ci-daily.yml | 165 ++++++++++++++++++++++--- 1 file changed, 146 insertions(+), 19 deletions(-) diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index 2a76f4c7e..5ed05768e 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -209,6 +209,18 @@ jobs: echo "size=${BACKLOG_SIZE}" >> "$GITHUB_OUTPUT" echo "fix_backlog has ${BACKLOG_SIZE} entries" + - name: Snapshot pre-fix attempted_fixes + # Captures (id, attempts-length) pairs before the fix step runs so + # the post-fix gates can identify which entry grew during *this* + # run, instead of grabbing the last globally-open entry (which + # might be a stale orphan from a prior crashed run). 
+ id: snapshot + if: steps.audit.outcome == 'success' && steps.backlog.outcome == 'success' && matrix.suite != 'test-health' && fromJSON(steps.backlog.outputs.size || '0') > 0 + run: | + jq -c '.attempted_fixes // [] | map({id, n: (.attempts | length)})' \ + .agentic-ci-state/runner-state.json > /tmp/prior-attempted-fixes.json + echo "Snapshot: $(cat /tmp/prior-attempted-fixes.json)" + - name: Run fix recipe id: fix if: steps.audit.outcome == 'success' && steps.backlog.outcome == 'success' && matrix.suite != 'test-health' && fromJSON(steps.backlog.outputs.size || '0') > 0 @@ -248,9 +260,9 @@ jobs: # Workflow-level enforcement of the localized-fix bar from # _fix-policy.md. Recipe instructions alone cannot bind the agent; # this gate re-derives the diff and closes the PR if the agent - # escaped the allowlist or the LOC/file caps. - # Note: the docstring-only constraint on docs-and-references .py - # paths remains policy-level; AST-based enforcement is a follow-up. + # escaped the allowlist or the LOC/file caps. The docs-and-references + # suite additionally gets AST-based docstring-only enforcement on + # .py edits (no non-docstring/non-comment lines may change). id: scope_gate if: steps.fix.outcome == 'success' env: @@ -259,13 +271,19 @@ jobs: run: | set -o pipefail - OPEN=$(jq -c ' - .attempted_fixes // [] - | map(select((.attempts | last | .outcome) == "open")) + # Identify the attempted_fixes entry that grew during *this* run + # (vs the pre-fix snapshot), not the last globally-open entry. + OPEN=$(jq -c --slurpfile prior /tmp/prior-attempted-fixes.json ' + (($prior[0] // []) | map({key: .id, value: .n}) | from_entries) as $p + | .attempted_fixes // [] + | map(select( + ((.attempts | last | .outcome) == "open") + and ((.attempts | length) > ($p[.id] // 0)) + )) | last // empty ' .agentic-ci-state/runner-state.json) if [ -z "$OPEN" ] || [ "$OPEN" = "null" ]; then - echo "No open attempted_fix recorded; nothing to validate." 
+ echo "No new open attempted_fix recorded by this run; nothing to validate." exit 0 fi @@ -277,6 +295,17 @@ jobs: exit 0 fi + # Diff against the actual pushed branch (origin/$BRANCH), not + # local HEAD — HEAD may not match what was pushed if the agent + # left the working tree in an unexpected state. + git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true + if git rev-parse --verify "refs/remotes/origin/$BRANCH" >/dev/null 2>&1; then + DIFF_REF="origin/$BRANCH" + else + echo "::warning::origin/$BRANCH not fetchable; falling back to HEAD" + DIFF_REF="HEAD" + fi + case "$SUITE" in docs-and-references) ALLOW='^(architecture/|docs/|README\.md$|CONTRIBUTING\.md$|DEVELOPMENT\.md$|STYLEGUIDE\.md$|packages/[^/]+/src/.*\.py$)' @@ -293,10 +322,14 @@ jobs: ;; esac - FILES=$(git diff --name-only origin/main...HEAD) - BAD=$(printf '%s\n' "$FILES" | grep -vE "$ALLOW" | grep -v '^$' || true) - FILE_COUNT=$(printf '%s\n' "$FILES" | grep -c '.' || echo 0) - LOC_DELTA=$(git diff --shortstat origin/main...HEAD \ + mapfile -t FILE_ARR < <(git diff --name-only "origin/main...$DIFF_REF") + FILE_COUNT=${#FILE_ARR[@]} + if [ "$FILE_COUNT" -gt 0 ]; then + BAD=$(printf '%s\n' "${FILE_ARR[@]}" | grep -vE "$ALLOW" | grep -v '^$' || true) + else + BAD="" + fi + LOC_DELTA=$(git diff --shortstat "origin/main...$DIFF_REF" \ | awk '{ a=0; d=0; for (i=1;i<=NF;i++) { if ($i ~ /insertion/) a=$(i-1); if ($i ~ /deletion/) d=$(i-1) } print a+d }') : "${LOC_DELTA:=0}" @@ -311,6 +344,81 @@ jobs: REASONS="${REASONS}- LOC delta ($LOC_DELTA) exceeds 50-line cap\n" fi + # Docs suite: AST-enforce the docstring-only caveat on .py edits. 
+ if [ "$SUITE" = "docs-and-references" ] && [ "$FILE_COUNT" -gt 0 ]; then + PY_FILES=$(printf '%s\n' "${FILE_ARR[@]}" | grep -E '^packages/[^/]+/src/.*\.py$' || true) + if [ -n "$PY_FILES" ]; then + NON_DOCSTRING=$(PY_FILES="$PY_FILES" DIFF_REF="$DIFF_REF" python3 - <<'PY' + import ast, os, subprocess, sys + + files = [p for p in os.environ['PY_FILES'].splitlines() if p] + diff_ref = os.environ['DIFF_REF'] + violations = [] + + for path in files: + try: + content = subprocess.check_output( + ['git', 'show', f'{diff_ref}:{path}'], text=True, errors='replace' + ) + except subprocess.CalledProcessError: + violations.append(f'{path}: file deleted (deletion not allowed in docs suite)') + continue + try: + tree = ast.parse(content) + except SyntaxError as e: + violations.append(f'{path}: parse error ({e})') + continue + docstring_lines = set() + for node in ast.walk(tree): + if isinstance(node, (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)): + body = getattr(node, 'body', None) or [] + if (body and isinstance(body[0], ast.Expr) + and isinstance(body[0].value, ast.Constant) + and isinstance(body[0].value.value, str)): + start, end = body[0].lineno, body[0].end_lineno + if start and end: + docstring_lines.update(range(start, end + 1)) + try: + hunks = subprocess.check_output( + ['git', 'diff', '-U0', f'origin/main...{diff_ref}', '--', path], + text=True, errors='replace', + ) + except subprocess.CalledProcessError: + violations.append(f'{path}: could not compute diff') + continue + cur = None + for line in hunks.splitlines(): + if line.startswith('@@'): + try: + plus = line.split('+', 1)[1].split(' ', 1)[0] + cur = int(plus.split(',')[0]) if ',' in plus else int(plus) + except (ValueError, IndexError): + cur = None + continue + if cur is None: + continue + if line.startswith('+') and not line.startswith('+++'): + stripped = line[1:].strip() + ln = cur + cur += 1 + if not stripped or stripped.startswith('#'): + continue + if ln not in docstring_lines: + 
violations.append(f'{path}:{ln}') + elif line.startswith(' '): + cur += 1 + + if violations: + print('\n'.join(violations)) + sys.exit(1) + PY + ) || true + if [ -n "$NON_DOCSTRING" ]; then + REASONS="${REASONS}- non-docstring/non-comment .py edits in docs suite:\n$(echo "$NON_DOCSTRING" | sed 's/^/ - /')\n" + fi + fi + fi + if [ -z "$REASONS" ]; then echo "Scope gate passed: ${FILE_COUNT} file(s), ${LOC_DELTA} LOC, all within allowlist." exit 0 @@ -331,7 +439,8 @@ jobs: FINDING_ID="$FINDING_ID" python3 -c " import json, os finding_id = os.environ['FINDING_ID'] - with open('.agentic-ci-state/runner-state.json') as f: + path = '.agentic-ci-state/runner-state.json' + with open(path) as f: state = json.load(f) for entry in state.get('attempted_fixes', []): if entry.get('id') != finding_id: @@ -340,8 +449,10 @@ jobs: if attempts and attempts[-1].get('outcome') == 'open': attempts[-1]['outcome'] = 'abandoned' attempts[-1]['gate_violation'] = True - with open('.agentic-ci-state/runner-state.json', 'w') as f: + tmp = path + '.tmp' + with open(tmp, 'w') as f: json.dump(state, f, indent=2) + os.replace(tmp, path) " exit 1 @@ -357,13 +468,19 @@ jobs: run: | set -o pipefail - OPEN=$(jq -c ' - .attempted_fixes // [] - | map(select((.attempts | last | .outcome) == "open")) + # Same snapshot-based selector as the scope gate: target only + # the entry whose attempts grew during this run. + OPEN=$(jq -c --slurpfile prior /tmp/prior-attempted-fixes.json ' + (($prior[0] // []) | map({key: .id, value: .n}) | from_entries) as $p + | .attempted_fixes // [] + | map(select( + ((.attempts | last | .outcome) == "open") + and ((.attempts | length) > ($p[.id] // 0)) + )) | last // empty ' .agentic-ci-state/runner-state.json) if [ -z "$OPEN" ] || [ "$OPEN" = "null" ]; then - echo "No open attempted_fix; skipping lockfile verification." + echo "No new open attempted_fix; skipping lockfile verification." 
exit 0 fi @@ -371,6 +488,13 @@ jobs: BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') FINDING_ID=$(echo "$OPEN" | jq -r '.id') + # Verify against the actually-pushed branch, not local HEAD. + if [ -n "$BRANCH" ]; then + git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true + git checkout --detach "refs/remotes/origin/$BRANCH" 2>/dev/null \ + || echo "::warning::Could not checkout origin/$BRANCH; verifying current tree" + fi + if make install-dev 2>&1 | tee /tmp/install-dev-verify.log; then echo "Lockfile resolves cleanly against the agent's pyproject changes." exit 0 @@ -389,7 +513,8 @@ jobs: FINDING_ID="$FINDING_ID" python3 -c " import json, os finding_id = os.environ['FINDING_ID'] - with open('.agentic-ci-state/runner-state.json') as f: + path = '.agentic-ci-state/runner-state.json' + with open(path) as f: state = json.load(f) for entry in state.get('attempted_fixes', []): if entry.get('id') != finding_id: @@ -398,8 +523,10 @@ jobs: if attempts and attempts[-1].get('outcome') == 'open': attempts[-1]['outcome'] = 'abandoned' attempts[-1]['lockfile_verification_failed'] = True - with open('.agentic-ci-state/runner-state.json', 'w') as f: + tmp = path + '.tmp' + with open(tmp, 'w') as f: json.dump(state, f, indent=2) + os.replace(tmp, path) " exit 1 From 68827c156f82a6bd457e748b81f838d08e3140d0 Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Fri, 8 May 2026 00:07:08 +0000 Subject: [PATCH 10/12] fix(agentic-ci): gate fix + scope_gate steps on snapshot.outcome MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Greptile review on 872d5617 flagged that the fix step's custom `if:` expression bypasses GitHub Actions' implicit success() check. Without explicitly referencing steps.snapshot.outcome, a snapshot failure (corrupt runner-state, disk error) would let the fix step run anyway. 
The scope gate's `jq --slurpfile prior /tmp/prior-attempted-fixes.json` would then exit non-zero on the missing file, leave OPEN empty, and hit the "nothing to validate" early-exit — silently approving whatever the agent pushed. Adds steps.snapshot.outcome == 'success' to both the fix step's condition (the actual fix) and the scope_gate step's condition (belt-and-suspenders against future refactors). --- .github/workflows/agentic-ci-daily.yml | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index 5ed05768e..93832c0e8 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -223,7 +223,12 @@ jobs: - name: Run fix recipe id: fix - if: steps.audit.outcome == 'success' && steps.backlog.outcome == 'success' && matrix.suite != 'test-health' && fromJSON(steps.backlog.outputs.size || '0') > 0 + # Custom if: bypasses implicit success(), so snapshot.outcome must + # be checked explicitly. Without it, a snapshot failure (corrupt + # runner-state, disk error) would leave /tmp/prior-attempted-fixes.json + # missing, the scope gate's jq --slurpfile would short-circuit, and + # the gate would exit 0 — silently approving the agent's PR. + if: steps.audit.outcome == 'success' && steps.backlog.outcome == 'success' && steps.snapshot.outcome == 'success' && matrix.suite != 'test-health' && fromJSON(steps.backlog.outputs.size || '0') > 0 env: ANTHROPIC_BASE_URL: ${{ secrets.AGENTIC_CI_API_BASE_URL }} ANTHROPIC_API_KEY: ${{ secrets.AGENTIC_CI_API_KEY }} @@ -264,7 +269,10 @@ jobs: # suite additionally gets AST-based docstring-only enforcement on # .py edits (no non-docstring/non-comment lines may change). id: scope_gate - if: steps.fix.outcome == 'success' + # Belt-and-suspenders: also check snapshot succeeded. 
The fix step + # already gates on snapshot, so if we got here the snapshot ran, + # but be explicit so a future condition refactor cannot regress. + if: steps.fix.outcome == 'success' && steps.snapshot.outcome == 'success' env: SUITE: ${{ matrix.suite }} GH_TOKEN: ${{ github.token }} From 3a0bb95c583324cb92f00b69f0d2018575780ff1 Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Fri, 8 May 2026 00:38:44 +0000 Subject: [PATCH 11/12] fix(agentic-ci): harden daily fix gates Signed-off-by: Andre Manoel <amanoel@nvidia.com> --- .github/workflows/agentic-ci-daily.yml | 175 +++++++++++++++++++------ 1 file changed, 132 insertions(+), 43 deletions(-) diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index 93832c0e8..7db78d141 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -298,20 +298,29 @@ jobs: BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') PR_NUMBER=$(echo "$OPEN" | jq -r '.attempts | last | .pr_number // empty') FINDING_ID=$(echo "$OPEN" | jq -r '.id') + + REASONS="" + if [ -z "$BRANCH" ] && [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + BRANCH=$(gh pr view "$PR_NUMBER" --json headRefName -q .headRefName 2>/dev/null || true) + if [ -n "$BRANCH" ]; then + echo "::warning::Open attempt had no branch; recovered $BRANCH from PR #$PR_NUMBER." + fi + fi if [ -z "$BRANCH" ]; then - echo "::warning::Open attempt has no branch; skipping gate." - exit 0 + REASONS="${REASONS}- open attempt has no branch and could not be recovered from PR ${PR_NUMBER:-unknown}\n" fi # Diff against the actual pushed branch (origin/$BRANCH), not # local HEAD — HEAD may not match what was pushed if the agent # left the working tree in an unexpected state. 
- git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true - if git rev-parse --verify "refs/remotes/origin/$BRANCH" >/dev/null 2>&1; then - DIFF_REF="origin/$BRANCH" - else - echo "::warning::origin/$BRANCH not fetchable; falling back to HEAD" - DIFF_REF="HEAD" + if [ -n "$BRANCH" ]; then + git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true + if git rev-parse --verify "refs/remotes/origin/$BRANCH" >/dev/null 2>&1; then + DIFF_REF="origin/$BRANCH" + else + echo "::warning::origin/$BRANCH not fetchable; falling back to HEAD" + DIFF_REF="HEAD" + fi fi case "$SUITE" in @@ -330,18 +339,24 @@ jobs: ;; esac - mapfile -t FILE_ARR < <(git diff --name-only "origin/main...$DIFF_REF") - FILE_COUNT=${#FILE_ARR[@]} - if [ "$FILE_COUNT" -gt 0 ]; then - BAD=$(printf '%s\n' "${FILE_ARR[@]}" | grep -vE "$ALLOW" | grep -v '^$' || true) + if [ -n "$BRANCH" ]; then + mapfile -t FILE_ARR < <(git diff --name-only "origin/main...$DIFF_REF") + FILE_COUNT=${#FILE_ARR[@]} + if [ "$FILE_COUNT" -gt 0 ]; then + BAD=$(printf '%s\n' "${FILE_ARR[@]}" | grep -vE "$ALLOW" | grep -v '^$' || true) + else + BAD="" + fi + LOC_DELTA=$(git diff --shortstat "origin/main...$DIFF_REF" \ + | awk '{ a=0; d=0; for (i=1;i<=NF;i++) { if ($i ~ /insertion/) a=$(i-1); if ($i ~ /deletion/) d=$(i-1) } print a+d }') + : "${LOC_DELTA:=0}" else + FILE_ARR=() + FILE_COUNT=0 BAD="" + LOC_DELTA=0 fi - LOC_DELTA=$(git diff --shortstat "origin/main...$DIFF_REF" \ - | awk '{ a=0; d=0; for (i=1;i<=NF;i++) { if ($i ~ /insertion/) a=$(i-1); if ($i ~ /deletion/) d=$(i-1) } print a+d }') - : "${LOC_DELTA:=0}" - REASONS="" if [ -n "$BAD" ]; then REASONS="${REASONS}- files outside allowlist:\n$(echo "$BAD" | sed 's/^/ - /')\n" fi @@ -357,26 +372,42 @@ jobs: PY_FILES=$(printf '%s\n' "${FILE_ARR[@]}" | grep -E '^packages/[^/]+/src/.*\.py$' || true) if [ -n "$PY_FILES" ]; then NON_DOCSTRING=$(PY_FILES="$PY_FILES" DIFF_REF="$DIFF_REF" python3 - <<'PY' - import ast, os, subprocess, sys + import ast + import os + import re + 
import subprocess + import sys files = [p for p in os.environ['PY_FILES'].splitlines() if p] diff_ref = os.environ['DIFF_REF'] violations = [] + hunk_re = re.compile(r'@@ -(?P<old>\d+)(?:,\d+)? \+(?P<new>\d+)(?:,\d+)? @@') - for path in files: + try: + base_ref = subprocess.check_output( + ['git', 'merge-base', 'origin/main', diff_ref], text=True + ).strip() + except subprocess.CalledProcessError: + violations.append(f'could not compute merge base for origin/main...{diff_ref}') + base_ref = 'origin/main' + + def collect_docstring_lines(ref, path, *, missing_ok=False): try: content = subprocess.check_output( - ['git', 'show', f'{diff_ref}:{path}'], text=True, errors='replace' + ['git', 'show', f'{ref}:{path}'], text=True, errors='replace' ) except subprocess.CalledProcessError: - violations.append(f'{path}: file deleted (deletion not allowed in docs suite)') - continue + if missing_ok: + return set() + violations.append(f'{path}: file missing at {ref}') + return None try: tree = ast.parse(content) except SyntaxError as e: - violations.append(f'{path}: parse error ({e})') - continue - docstring_lines = set() + violations.append(f'{path}: parse error at {ref} ({e})') + return None + + lines = set() for node in ast.walk(tree): if isinstance(node, (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)): body = getattr(node, 'body', None) or [] @@ -385,7 +416,17 @@ jobs: and isinstance(body[0].value.value, str)): start, end = body[0].lineno, body[0].end_lineno if start and end: - docstring_lines.update(range(start, end + 1)) + lines.update(range(start, end + 1)) + return lines + + for path in files: + old_docstring_lines = collect_docstring_lines(base_ref, path, missing_ok=True) + new_docstring_lines = collect_docstring_lines(diff_ref, path) + if new_docstring_lines is None: + violations.append(f'{path}: file deleted or unreadable at {diff_ref}') + continue + if old_docstring_lines is None: + continue try: hunks = subprocess.check_output( ['git', 'diff', 
'-U0', f'origin/main...{diff_ref}', '--', path], @@ -394,27 +435,39 @@ jobs: except subprocess.CalledProcessError: violations.append(f'{path}: could not compute diff') continue - cur = None + old_cur = None + new_cur = None for line in hunks.splitlines(): if line.startswith('@@'): - try: - plus = line.split('+', 1)[1].split(' ', 1)[0] - cur = int(plus.split(',')[0]) if ',' in plus else int(plus) - except (ValueError, IndexError): - cur = None + match = hunk_re.match(line) + if match: + old_cur = int(match.group('old')) + new_cur = int(match.group('new')) + else: + old_cur = None + new_cur = None continue - if cur is None: + if old_cur is None or new_cur is None: continue if line.startswith('+') and not line.startswith('+++'): stripped = line[1:].strip() - ln = cur - cur += 1 + ln = new_cur + new_cur += 1 if not stripped or stripped.startswith('#'): continue - if ln not in docstring_lines: - violations.append(f'{path}:{ln}') + if ln not in new_docstring_lines: + violations.append(f'{path}:{ln} added outside docstring') + elif line.startswith('-') and not line.startswith('---'): + stripped = line[1:].strip() + ln = old_cur + old_cur += 1 + if not stripped or stripped.startswith('#'): + continue + if ln not in old_docstring_lines: + violations.append(f'{path}:{ln} removed outside docstring') elif line.startswith(' '): - cur += 1 + old_cur += 1 + new_cur += 1 if violations: print('\n'.join(violations)) @@ -439,9 +492,11 @@ jobs: MSG=$(printf 'Closed by workflow scope gate. The pushed diff violated the localized-fix bar (see `.agents/recipes/_fix-policy.md`):\n\n%b\nThe `attempted_fixes` entry has been flipped to `abandoned`.' 
"$REASONS") gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ echo "::warning::gh pr close failed; branch may need manual cleanup" - else + elif [ -n "$BRANCH" ]; then git push origin --delete "$BRANCH" || \ echo "::warning::Could not delete remote branch $BRANCH" + else + echo "::warning::No PR number or branch available for cleanup" fi FINDING_ID="$FINDING_ID" python3 -c " @@ -463,14 +518,16 @@ jobs: os.replace(tmp, path) " - exit 1 + echo "rejected=true" >> "$GITHUB_OUTPUT" + exit 0 - name: Verify dependencies lockfile # Dependencies suite only: re-run make install-dev against the # agent's pyproject.toml changes. This catches the failure mode # where the per-package test target passed against the *old* # lockfile but the proposed dep does not actually resolve. - if: matrix.suite == 'dependencies' && steps.fix.outcome == 'success' && steps.scope_gate.outcome == 'success' + id: lockfile_gate + if: matrix.suite == 'dependencies' && steps.fix.outcome == 'success' && steps.scope_gate.outcome == 'success' && steps.scope_gate.outputs.rejected != 'true' env: GH_TOKEN: ${{ github.token }} run: | @@ -497,6 +554,12 @@ jobs: FINDING_ID=$(echo "$OPEN" | jq -r '.id') # Verify against the actually-pushed branch, not local HEAD. + if [ -z "$BRANCH" ] && [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + BRANCH=$(gh pr view "$PR_NUMBER" --json headRefName -q .headRefName 2>/dev/null || true) + if [ -n "$BRANCH" ]; then + echo "::warning::Open attempt had no branch; recovered $BRANCH from PR #$PR_NUMBER." + fi + fi if [ -n "$BRANCH" ]; then git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true git checkout --detach "refs/remotes/origin/$BRANCH" 2>/dev/null \ @@ -514,8 +577,10 @@ jobs: MSG="Closed by workflow lockfile verification. \`make install-dev\` failed against the agent's \`pyproject.toml\` changes — the dependency edit does not resolve cleanly. See \`/tmp/install-dev-verify.log\` in the workflow artifact." 
gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ echo "::warning::gh pr close failed" - else + elif [ -n "$BRANCH" ]; then git push origin --delete "$BRANCH" || true + else + echo "::warning::No PR number or branch available for cleanup" fi FINDING_ID="$FINDING_ID" python3 -c " @@ -537,7 +602,8 @@ jobs: os.replace(tmp, path) " - exit 1 + echo "rejected=true" >> "$GITHUB_OUTPUT" + exit 0 - name: Update runner memory if: always() @@ -545,7 +611,7 @@ jobs: SUITE: ${{ matrix.suite }} AUDIT_OUTCOME: ${{ steps.audit.outcome }} run: | - # Always validate state (cache saves regardless of outcome) + # Always validate state before cache save/post-job handling. python3 -c " import json, datetime, os try: @@ -589,12 +655,20 @@ jobs: AUDIT_OUTCOME: ${{ steps.audit.outcome }} FIX_OUTCOME: ${{ steps.fix.outcome }} BACKLOG_SIZE: ${{ steps.backlog.outputs.size }} + SCOPE_REJECTED: ${{ steps.scope_gate.outputs.rejected }} + LOCKFILE_REJECTED: ${{ steps.lockfile_gate.outputs.rejected }} run: | echo "## Daily Audit: ${SUITE}" >> "$GITHUB_STEP_SUMMARY" echo "" >> "$GITHUB_STEP_SUMMARY" echo "- Audit: \`${AUDIT_OUTCOME:-unknown}\`" >> "$GITHUB_STEP_SUMMARY" echo "- Fix backlog size: \`${BACKLOG_SIZE:-n/a}\`" >> "$GITHUB_STEP_SUMMARY" echo "- Fix: \`${FIX_OUTCOME:-skipped}\`" >> "$GITHUB_STEP_SUMMARY" + if [ "${SCOPE_REJECTED}" = "true" ]; then + echo "- Scope gate: \`rejected and closed PR\`" >> "$GITHUB_STEP_SUMMARY" + fi + if [ "${LOCKFILE_REJECTED}" = "true" ]; then + echo "- Lockfile gate: \`rejected and closed PR\`" >> "$GITHUB_STEP_SUMMARY" + fi echo "" >> "$GITHUB_STEP_SUMMARY" if [ -s "/tmp/audit-${SUITE}.md" ]; then @@ -602,3 +676,18 @@ jobs: else echo "No report generated. See the \`claude-audit-log-*\` artifact on failures for the full event stream." 
>> "$GITHUB_STEP_SUMMARY" fi + + - name: Save rejected gate state + if: always() && (steps.scope_gate.outputs.rejected == 'true' || steps.lockfile_gate.outputs.rejected == 'true') + uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5 + with: + path: | + .agentic-ci-state + graphify-out + key: agentic-ci-state-${{ matrix.suite }}-${{ github.run_id }}-${{ github.run_attempt }}-rejected + + - name: Fail rejected fix gates + if: always() && (steps.scope_gate.outputs.rejected == 'true' || steps.lockfile_gate.outputs.rejected == 'true') + run: | + echo "::error::A post-fix gate rejected and closed the agent PR. Runner memory was saved before failing." + exit 1 From c1d55708c36963fc986aa70f0685bb812001bfea Mon Sep 17 00:00:00 2001 From: Andre Manoel <amanoel@nvidia.com> Date: Fri, 8 May 2026 20:23:52 +0000 Subject: [PATCH 12/12] fix(agentic-ci): validate all grown fix attempts --- .github/workflows/agentic-ci-daily.yml | 295 ++++++++++++++----------- 1 file changed, 160 insertions(+), 135 deletions(-) diff --git a/.github/workflows/agentic-ci-daily.yml b/.github/workflows/agentic-ci-daily.yml index 7db78d141..247a13122 100644 --- a/.github/workflows/agentic-ci-daily.yml +++ b/.github/workflows/agentic-ci-daily.yml @@ -279,99 +279,107 @@ jobs: run: | set -o pipefail - # Identify the attempted_fixes entry that grew during *this* run - # (vs the pre-fix snapshot), not the last globally-open entry. - OPEN=$(jq -c --slurpfile prior /tmp/prior-attempted-fixes.json ' + # Identify every attempted_fixes entry that grew during *this* run + # (vs the pre-fix snapshot), not just the last globally-open entry. 
+ OPEN_ENTRIES=$(jq -c --slurpfile prior /tmp/prior-attempted-fixes.json ' (($prior[0] // []) | map({key: .id, value: .n}) | from_entries) as $p - | .attempted_fixes // [] - | map(select( + | [ + .attempted_fixes // [] + | .[] + | select( ((.attempts | last | .outcome) == "open") and ((.attempts | length) > ($p[.id] // 0)) - )) - | last // empty + ) + ] ' .agentic-ci-state/runner-state.json) - if [ -z "$OPEN" ] || [ "$OPEN" = "null" ]; then + OPEN_COUNT=$(echo "$OPEN_ENTRIES" | jq 'length') + if [ "$OPEN_COUNT" -eq 0 ]; then echo "No new open attempted_fix recorded by this run; nothing to validate." exit 0 fi - BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') - PR_NUMBER=$(echo "$OPEN" | jq -r '.attempts | last | .pr_number // empty') - FINDING_ID=$(echo "$OPEN" | jq -r '.id') + echo "Validating ${OPEN_COUNT} new open attempted_fix entries." + REJECTED=0 - REASONS="" - if [ -z "$BRANCH" ] && [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then - BRANCH=$(gh pr view "$PR_NUMBER" --json headRefName -q .headRefName 2>/dev/null || true) - if [ -n "$BRANCH" ]; then - echo "::warning::Open attempt had no branch; recovered $BRANCH from PR #$PR_NUMBER." + while IFS= read -r OPEN; do + BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') + PR_NUMBER=$(echo "$OPEN" | jq -r '.attempts | last | .pr_number // empty') + FINDING_ID=$(echo "$OPEN" | jq -r '.id') + DIFF_REF="" + REASONS="" + + if [ -z "$BRANCH" ] && [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + BRANCH=$(gh pr view "$PR_NUMBER" --json headRefName -q .headRefName 2>/dev/null || true) + if [ -n "$BRANCH" ]; then + echo "::warning::Open attempt had no branch; recovered $BRANCH from PR #$PR_NUMBER." 
+ fi + fi + if [ -z "$BRANCH" ]; then + REASONS="${REASONS}- open attempt has no branch and could not be recovered from PR ${PR_NUMBER:-unknown}\n" fi - fi - if [ -z "$BRANCH" ]; then - REASONS="${REASONS}- open attempt has no branch and could not be recovered from PR ${PR_NUMBER:-unknown}\n" - fi - # Diff against the actual pushed branch (origin/$BRANCH), not - # local HEAD — HEAD may not match what was pushed if the agent - # left the working tree in an unexpected state. - if [ -n "$BRANCH" ]; then - git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true - if git rev-parse --verify "refs/remotes/origin/$BRANCH" >/dev/null 2>&1; then - DIFF_REF="origin/$BRANCH" - else - echo "::warning::origin/$BRANCH not fetchable; falling back to HEAD" - DIFF_REF="HEAD" + # Diff against the actual pushed branch (origin/$BRANCH), not + # local HEAD — HEAD may not match what was pushed if the agent + # left the working tree in an unexpected state. + if [ -n "$BRANCH" ]; then + git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true + if git rev-parse --verify "refs/remotes/origin/$BRANCH" >/dev/null 2>&1; then + DIFF_REF="origin/$BRANCH" + else + echo "::warning::origin/$BRANCH not fetchable; falling back to HEAD" + DIFF_REF="HEAD" + fi fi - fi - case "$SUITE" in - docs-and-references) - ALLOW='^(architecture/|docs/|README\.md$|CONTRIBUTING\.md$|DEVELOPMENT\.md$|STYLEGUIDE\.md$|packages/[^/]+/src/.*\.py$)' - ;; - dependencies) - ALLOW='^packages/[^/]+/pyproject\.toml$' - ;; - structure|code-quality) - ALLOW='^packages/[^/]+/src/.*\.py$' - ;; - *) - echo "::error::No allowlist defined for suite: $SUITE" - exit 1 - ;; - esac + case "$SUITE" in + docs-and-references) + ALLOW='^(architecture/|docs/|README\.md$|CONTRIBUTING\.md$|DEVELOPMENT\.md$|STYLEGUIDE\.md$|packages/[^/]+/src/.*\.py$)' + ;; + dependencies) + ALLOW='^packages/[^/]+/pyproject\.toml$' + ;; + structure|code-quality) + ALLOW='^packages/[^/]+/src/.*\.py$' + ;; + *) + echo "::error::No allowlist defined for suite: 
$SUITE" + exit 1 + ;; + esac - if [ -n "$BRANCH" ]; then - mapfile -t FILE_ARR < <(git diff --name-only "origin/main...$DIFF_REF") - FILE_COUNT=${#FILE_ARR[@]} - if [ "$FILE_COUNT" -gt 0 ]; then - BAD=$(printf '%s\n' "${FILE_ARR[@]}" | grep -vE "$ALLOW" | grep -v '^$' || true) + if [ -n "$BRANCH" ]; then + mapfile -t FILE_ARR < <(git diff --name-only "origin/main...$DIFF_REF") + FILE_COUNT=${#FILE_ARR[@]} + if [ "$FILE_COUNT" -gt 0 ]; then + BAD=$(printf '%s\n' "${FILE_ARR[@]}" | grep -vE "$ALLOW" | grep -v '^$' || true) + else + BAD="" + fi + LOC_DELTA=$(git diff --shortstat "origin/main...$DIFF_REF" \ + | awk '{ a=0; d=0; for (i=1;i<=NF;i++) { if ($i ~ /insertion/) a=$(i-1); if ($i ~ /deletion/) d=$(i-1) } print a+d }') + : "${LOC_DELTA:=0}" else + FILE_ARR=() + FILE_COUNT=0 BAD="" + LOC_DELTA=0 fi - LOC_DELTA=$(git diff --shortstat "origin/main...$DIFF_REF" \ - | awk '{ a=0; d=0; for (i=1;i<=NF;i++) { if ($i ~ /insertion/) a=$(i-1); if ($i ~ /deletion/) d=$(i-1) } print a+d }') - : "${LOC_DELTA:=0}" - else - FILE_ARR=() - FILE_COUNT=0 - BAD="" - LOC_DELTA=0 - fi - if [ -n "$BAD" ]; then - REASONS="${REASONS}- files outside allowlist:\n$(echo "$BAD" | sed 's/^/ - /')\n" - fi - if [ "$FILE_COUNT" -gt 3 ]; then - REASONS="${REASONS}- file count ($FILE_COUNT) exceeds 3-file cap\n" - fi - if [ "$LOC_DELTA" -gt 50 ]; then - REASONS="${REASONS}- LOC delta ($LOC_DELTA) exceeds 50-line cap\n" - fi + if [ -n "$BAD" ]; then + REASONS="${REASONS}- files outside allowlist:\n$(echo "$BAD" | sed 's/^/ - /')\n" + fi + if [ "$FILE_COUNT" -gt 3 ]; then + REASONS="${REASONS}- file count ($FILE_COUNT) exceeds 3-file cap\n" + fi + if [ "$LOC_DELTA" -gt 50 ]; then + REASONS="${REASONS}- LOC delta ($LOC_DELTA) exceeds 50-line cap\n" + fi - # Docs suite: AST-enforce the docstring-only caveat on .py edits. 
- if [ "$SUITE" = "docs-and-references" ] && [ "$FILE_COUNT" -gt 0 ]; then - PY_FILES=$(printf '%s\n' "${FILE_ARR[@]}" | grep -E '^packages/[^/]+/src/.*\.py$' || true) - if [ -n "$PY_FILES" ]; then - NON_DOCSTRING=$(PY_FILES="$PY_FILES" DIFF_REF="$DIFF_REF" python3 - <<'PY' + # Docs suite: AST-enforce the docstring-only caveat on .py edits. + if [ "$SUITE" = "docs-and-references" ] && [ "$FILE_COUNT" -gt 0 ]; then + PY_FILES=$(printf '%s\n' "${FILE_ARR[@]}" | grep -E '^packages/[^/]+/src/.*\.py$' || true) + if [ -n "$PY_FILES" ]; then + NON_DOCSTRING=$(PY_FILES="$PY_FILES" DIFF_REF="$DIFF_REF" python3 - <<'PY' import ast import os import re @@ -473,33 +481,34 @@ jobs: print('\n'.join(violations)) sys.exit(1) PY - ) || true - if [ -n "$NON_DOCSTRING" ]; then - REASONS="${REASONS}- non-docstring/non-comment .py edits in docs suite:\n$(echo "$NON_DOCSTRING" | sed 's/^/ - /')\n" + ) || true + if [ -n "$NON_DOCSTRING" ]; then + REASONS="${REASONS}- non-docstring/non-comment .py edits in docs suite:\n$(echo "$NON_DOCSTRING" | sed 's/^/ - /')\n" + fi fi fi - fi - if [ -z "$REASONS" ]; then - echo "Scope gate passed: ${FILE_COUNT} file(s), ${LOC_DELTA} LOC, all within allowlist." - exit 0 - fi - - echo "::error::Scope gate violation" - printf '%b' "$REASONS" + if [ -z "$REASONS" ]; then + echo "Scope gate passed for ${FINDING_ID}: ${FILE_COUNT} file(s), ${LOC_DELTA} LOC, all within allowlist." + continue + fi - if [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then - MSG=$(printf 'Closed by workflow scope gate. The pushed diff violated the localized-fix bar (see `.agents/recipes/_fix-policy.md`):\n\n%b\nThe `attempted_fixes` entry has been flipped to `abandoned`.' 
"$REASONS") - gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ - echo "::warning::gh pr close failed; branch may need manual cleanup" - elif [ -n "$BRANCH" ]; then - git push origin --delete "$BRANCH" || \ - echo "::warning::Could not delete remote branch $BRANCH" - else - echo "::warning::No PR number or branch available for cleanup" - fi + REJECTED=1 + echo "::error::Scope gate violation for ${FINDING_ID}" + printf '%b' "$REASONS" + + if [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + MSG=$(printf 'Closed by workflow scope gate. The pushed diff violated the localized-fix bar (see `.agents/recipes/_fix-policy.md`):\n\n%b\nThe `attempted_fixes` entry has been flipped to `abandoned`.' "$REASONS") + gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ + echo "::warning::gh pr close failed; branch may need manual cleanup" + elif [ -n "$BRANCH" ]; then + git push origin --delete "$BRANCH" || \ + echo "::warning::Could not delete remote branch $BRANCH" + else + echo "::warning::No PR number or branch available for cleanup" + fi - FINDING_ID="$FINDING_ID" python3 -c " + FINDING_ID="$FINDING_ID" python3 -c " import json, os finding_id = os.environ['FINDING_ID'] path = '.agentic-ci-state/runner-state.json' @@ -518,7 +527,11 @@ jobs: os.replace(tmp, path) " - echo "rejected=true" >> "$GITHUB_OUTPUT" + done < <(echo "$OPEN_ENTRIES" | jq -c '.[]') + + if [ "$REJECTED" -eq 1 ]; then + echo "rejected=true" >> "$GITHUB_OUTPUT" + fi exit 0 - name: Verify dependencies lockfile @@ -527,63 +540,71 @@ jobs: # where the per-package test target passed against the *old* # lockfile but the proposed dep does not actually resolve. 
id: lockfile_gate - if: matrix.suite == 'dependencies' && steps.fix.outcome == 'success' && steps.scope_gate.outcome == 'success' && steps.scope_gate.outputs.rejected != 'true' + if: matrix.suite == 'dependencies' && steps.fix.outcome == 'success' && steps.scope_gate.outcome == 'success' env: GH_TOKEN: ${{ github.token }} run: | set -o pipefail - # Same snapshot-based selector as the scope gate: target only - # the entry whose attempts grew during this run. - OPEN=$(jq -c --slurpfile prior /tmp/prior-attempted-fixes.json ' + # Same snapshot-based selector as the scope gate: target every + # entry whose attempts grew during this run. + OPEN_ENTRIES=$(jq -c --slurpfile prior /tmp/prior-attempted-fixes.json ' (($prior[0] // []) | map({key: .id, value: .n}) | from_entries) as $p - | .attempted_fixes // [] - | map(select( + | [ + .attempted_fixes // [] + | .[] + | select( ((.attempts | last | .outcome) == "open") and ((.attempts | length) > ($p[.id] // 0)) - )) - | last // empty + ) + ] ' .agentic-ci-state/runner-state.json) - if [ -z "$OPEN" ] || [ "$OPEN" = "null" ]; then + OPEN_COUNT=$(echo "$OPEN_ENTRIES" | jq 'length') + if [ "$OPEN_COUNT" -eq 0 ]; then echo "No new open attempted_fix; skipping lockfile verification." exit 0 fi - PR_NUMBER=$(echo "$OPEN" | jq -r '.attempts | last | .pr_number // empty') - BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') - FINDING_ID=$(echo "$OPEN" | jq -r '.id') + echo "Verifying dependencies lockfile for ${OPEN_COUNT} new open attempted_fix entries." + REJECTED=0 - # Verify against the actually-pushed branch, not local HEAD. 
- if [ -z "$BRANCH" ] && [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then - BRANCH=$(gh pr view "$PR_NUMBER" --json headRefName -q .headRefName 2>/dev/null || true) + while IFS= read -r OPEN; do + PR_NUMBER=$(echo "$OPEN" | jq -r '.attempts | last | .pr_number // empty') + BRANCH=$(echo "$OPEN" | jq -r '.attempts | last | .branch // empty') + FINDING_ID=$(echo "$OPEN" | jq -r '.id') + + # Verify against the actually-pushed branch, not local HEAD. + if [ -z "$BRANCH" ] && [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + BRANCH=$(gh pr view "$PR_NUMBER" --json headRefName -q .headRefName 2>/dev/null || true) + if [ -n "$BRANCH" ]; then + echo "::warning::Open attempt had no branch; recovered $BRANCH from PR #$PR_NUMBER." + fi + fi if [ -n "$BRANCH" ]; then - echo "::warning::Open attempt had no branch; recovered $BRANCH from PR #$PR_NUMBER." + git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true + git checkout --force --detach "refs/remotes/origin/$BRANCH" 2>/dev/null \ + || echo "::warning::Could not checkout origin/$BRANCH; verifying current tree" fi - fi - if [ -n "$BRANCH" ]; then - git fetch --depth=50 origin "$BRANCH" 2>/dev/null || true - git checkout --detach "refs/remotes/origin/$BRANCH" 2>/dev/null \ - || echo "::warning::Could not checkout origin/$BRANCH; verifying current tree" - fi - if make install-dev 2>&1 | tee /tmp/install-dev-verify.log; then - echo "Lockfile resolves cleanly against the agent's pyproject changes." - exit 0 - fi + if make install-dev 2>&1 | tee /tmp/install-dev-verify.log; then + echo "Lockfile resolves cleanly for ${FINDING_ID}." + continue + fi - echo "::error::make install-dev failed against the agent's pyproject changes" + REJECTED=1 + echo "::error::make install-dev failed against the agent's pyproject changes for ${FINDING_ID}" - if [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then - MSG="Closed by workflow lockfile verification. 
\`make install-dev\` failed against the agent's \`pyproject.toml\` changes — the dependency edit does not resolve cleanly. See \`/tmp/install-dev-verify.log\` in the workflow artifact." - gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ - echo "::warning::gh pr close failed" - elif [ -n "$BRANCH" ]; then - git push origin --delete "$BRANCH" || true - else - echo "::warning::No PR number or branch available for cleanup" - fi + if [ -n "$PR_NUMBER" ] && [ "$PR_NUMBER" != "null" ]; then + MSG="Closed by workflow lockfile verification. \`make install-dev\` failed against the agent's \`pyproject.toml\` changes — the dependency edit does not resolve cleanly. See \`/tmp/install-dev-verify.log\` in the workflow artifact." + gh pr close "$PR_NUMBER" --comment "$MSG" --delete-branch || \ + echo "::warning::gh pr close failed" + elif [ -n "$BRANCH" ]; then + git push origin --delete "$BRANCH" || true + else + echo "::warning::No PR number or branch available for cleanup" + fi - FINDING_ID="$FINDING_ID" python3 -c " + FINDING_ID="$FINDING_ID" python3 -c " import json, os finding_id = os.environ['FINDING_ID'] path = '.agentic-ci-state/runner-state.json' @@ -602,7 +623,11 @@ jobs: os.replace(tmp, path) " - echo "rejected=true" >> "$GITHUB_OUTPUT" + done < <(echo "$OPEN_ENTRIES" | jq -c '.[]') + + if [ "$REJECTED" -eq 1 ]; then + echo "rejected=true" >> "$GITHUB_OUTPUT" + fi exit 0 - name: Update runner memory
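
For reference outside the patch context, the snapshot-based "grew during this run" selection that both the scope gate and the lockfile gate share (patch 12's jq selector) can be sketched in plain Python. This is a hedged illustration, not part of the workflow: the field names (`id`, `attempts`, `outcome`) mirror the runner-state schema visible in the diffs, and the sample data is invented.

```python
def select_new_open_fixes(prior_snapshot, attempted_fixes):
    """Mirror of the jq selector: keep entries whose last attempt is
    'open' AND whose attempt count grew versus the pre-fix snapshot,
    i.e. attempts opened by *this* run rather than a previous one."""
    # prior_snapshot is the /tmp/prior-attempted-fixes.json shape:
    # [{"id": ..., "n": <attempt count before the fix step ran>}]
    prior = {e["id"]: e["n"] for e in (prior_snapshot or [])}
    selected = []
    for entry in attempted_fixes or []:
        attempts = entry.get("attempts", [])
        if not attempts or attempts[-1].get("outcome") != "open":
            continue
        if len(attempts) <= prior.get(entry["id"], 0):
            continue  # was already open before this run; not ours to gate
        selected.append(entry)
    return selected

# Invented sample: "abc" was open before the run and did not grow;
# "def" gained an attempt during this run, so only "def" is selected.
prior = [{"id": "abc", "n": 1}]
state = [
    {"id": "abc", "attempts": [{"outcome": "open"}]},
    {"id": "def", "attempts": [{"outcome": "abandoned"}, {"outcome": "open"}]},
]
print([e["id"] for e in select_new_open_fixes(prior, state)])  # ['def']
```

This also shows why patch 12 iterates over the full selected list instead of taking `last`: a single fix run may record more than one grown open entry, and validating only the last would let the others bypass the gates.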