diff --git a/README.md b/README.md
index fd1c5b81..0c495d28 100644
--- a/README.md
+++ b/README.md
@@ -128,7 +128,7 @@ Options:
   --spec PATH              Build a single spec file (skip discovery)
   --dry-run                Discover specs and print plan, no execution
   --max-commits-per-pr N   PR commit cap (default: 8, max: 100)
-  --max-loc-per-pr N       PR line-of-code cap (default: 400, range: 50-5000)
+  --max-loc-per-pr N       PR line-of-code cap (default: 250, range: 50-5000)
   --platform PLATFORM      github, gitlab, or auto (default: auto)
   --parallel N             Concurrent agentic loops, HF engine only (default: 1)
   --skip-auth-check        Skip gh/glab auth validation (for CI with GITHUB_TOKEN)
@@ -146,7 +146,7 @@ Environment variables:
 Codelicious produces small, focused, human-reviewable PRs by default:
 
 - **8 commits max per PR** (`--max-commits-per-pr`, range 1–100)
-- **400 LOC max per PR** (`--max-loc-per-pr`, range 50–5000)
+- **250 LOC max per PR** (`--max-loc-per-pr`, range 50–5000)
 
 When either cap is hit, the current PR is transitioned to review and a
 continuation branch (`<branch>-part-2`, `-part-3`, ...) is opened so work
diff --git a/docs/specs/28_bite_sized_continuous_prs_v1.md b/docs/specs/28_bite_sized_continuous_prs_v1.md
index e9fe27e6..a5ec9fda 100644
--- a/docs/specs/28_bite_sized_continuous_prs_v1.md
+++ b/docs/specs/28_bite_sized_continuous_prs_v1.md
@@ -42,7 +42,7 @@ many small PRs rather than a few large ones.
       the old default
 - [x] Update the banner print line that echoes the cap so it reflects the
       new default
-- [x] Add a `--max-loc-per-pr` CLI flag, default `400`, range 50–5000,
+- [x] Add a `--max-loc-per-pr` CLI flag, default `250`, range 50–5000,
       mapped via the same `_INT_KEYS` mechanism as `max_commits_per_pr`
 - [x] Pass `max_loc_per_pr` into `V2Orchestrator(...)` constructor
       (constructor accepts the kwarg now; cap-enforcement logic lands in Phase 2.2)
@@ -50,9 +50,9 @@ many small PRs rather than a few large ones.
 **Claude Code prompt:**
 > Open `src/codelicious/cli.py`. In `_parse_args`, change the default
 > for `max_commits_per_pr` from 50 to 8. Add a new option
-> `max_loc_per_pr` with default 400, validated to be between 50 and 5000,
+> `max_loc_per_pr` with default 250, validated to be between 50 and 5000,
 > and add the flag `--max-loc-per-pr` to the arg map. Update `_INT_KEYS`
-> to include it. In `main()`, pass `max_loc_per_pr=opts.get("max_loc_per_pr", 400)`
+> to include it. In `main()`, pass `max_loc_per_pr=opts.get("max_loc_per_pr", 250)`
 > when constructing `V2Orchestrator`. Update the banner to print both caps.
 > Do not change behavior when the operator explicitly sets the cap.
 
@@ -97,7 +97,7 @@ many small PRs rather than a few large ones.
 
 **File:** `src/codelicious/orchestrator.py`
 
-- [x] Add `max_loc_per_pr: int = 400` parameter to `V2Orchestrator.__init__`
+- [x] Add `max_loc_per_pr: int = 250` parameter to `V2Orchestrator.__init__`
       (done in Phase 1.1)
 - [x] In `run()`, after the existing commit-cap check (around line 1204),
       add a parallel check: if `max_loc_per_pr > 0` and
@@ -111,7 +111,7 @@ many small PRs rather than a few large ones.
 
 **Claude Code prompt:**
 > In `src/codelicious/orchestrator.py`, extend `V2Orchestrator`. Add
-> `max_loc_per_pr: int = 400` to `__init__`. Extract the existing
+> `max_loc_per_pr: int = 250` to `__init__`. Extract the existing
 > commit-cap split logic in `run()` (around line 1206) into a helper
 > method `_split_pr_and_continue` that takes `spec_id_str`, `spec_title`,
 > current `pr_part`, and returns `(new_pr_part, new_pr_number)`. Then
@@ -256,7 +256,7 @@ and `TestSkipCredentialProbeFlag` (2 cases). Full suite: 1901 passed.
 ## Acceptance Criteria
 
 - [ ] Running `codelicious .` on a multi-task spec produces PRs of ≤ 8 commits
-      and ≤ 400 LOC each, splitting into part-2 / part-3 branches as needed.
+      and ≤ 250 LOC each, splitting into part-2 / part-3 branches as needed.
       *Verified via unit tests; live end-to-end run requires a real repo +
       `gh` auth and is left as a manual smoke test.*
 - [x] Existing tests still pass: `pytest` → **1901 passed**.
diff --git a/docs/specs/spec-v29_gap_closure_v1.md b/docs/specs/spec-v29_gap_closure_v1.md
new file mode 100644
index 00000000..2d19467d
--- /dev/null
+++ b/docs/specs/spec-v29_gap_closure_v1.md
@@ -0,0 +1,599 @@
+---
+version: 1.0.0
+status: Complete
+created: 2026-05-03
+updated: 2026-05-04
+related_specs: ["27_codelicious_v2_rewrite.md", "28_bite_sized_continuous_prs_v1.md", "06_production_hardening.md", "07_sandbox_security_hardening.md"]
+progress:
+  completed: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
+  pending: []
+---
+
+# Spec v29 — Gap Closure (2026-05-03)
+
+## Background
+
+Codelicious has evolved across 28 specs and a v2 rewrite (spec 27) plus a
+bite-sized-PR tightening pass (spec 28). The codebase is mostly aligned to
+the v2 architecture, but a deep audit surfaced **18 concrete gaps** between
+what is specified and what is implemented, plus several low-grade resilience
+and consistency issues.
+
+Headline findings:
+
+* **Spec drift:** spec 28 documents a default LOC cap of 400, but a recent
+  commit (`258e5e90 lower default PR cap to 250 LOC`) silently changed the
+  runtime default to 250 without updating the spec. The spec and the code
+  now disagree.
+* **Legacy 4-phase orchestrator (`Orchestrator` BUILD/MERGE/REVIEW/FIX) is
+  still present** alongside the v2 chunk-based `V2Orchestrator`, yet spec 27
+  Phase 4.1 explicitly mandates removal. ~1,100 lines of dead code remain
+  reachable through the legacy `run_build_cycle()` engine path.
+* **No jitter on exponential backoff.** `LLMClient` and `loop_controller`
+  use plain `2**attempt` backoff, which CLAUDE.md ("LLM calls use
+  exponential backoff with jitter for transient errors") explicitly forbids.
+* **Two test files are missing for major modules** (`scaffolder.py`'s
+  ~558 LOC has shallow tests, `tools/audit_logger.py` has none of its own
+  test file), and `chunker.chunk_spec_with_llm` has no dedicated tests.
+* The Claude engine's `execute_chunk` does **not pass `stream-json`,
+  `--output-format`, or the `--allowedTools` list specified in spec 27
+  Phase 3.2** — the prompt is built but the CLI flags spec 27 mandates are
+  not visible at the call site (they may live in `agent_runner`, but the
+  spec wants them per-chunk).
+
+This spec closes those gaps. It is ordered by severity: critical security
+and correctness items first, then resilience, then consistency / cleanup,
+then test coverage. Each step is self-contained — pick one off the top,
+execute it, ship a small PR, repeat.
+
+## Gap Inventory
+
+| # | Title | Severity |
+|---|---|---|
+| 1 | Spec 28 documents 400 LOC default but code uses 250 | High (consistency) — **DONE** |
+| 2 | Legacy 4-phase `Orchestrator` not removed per spec 27 Phase 4.1 | High (drift) — **DONE** (orchestrator.py 1499→501 LOC; `V2Orchestrator` renamed to `Orchestrator`, alias retained; `tests/test_orchestrator.py` deleted) |
+| 3 | LLM backoff has no jitter (CLAUDE.md violation) | High (resilience) — **DONE** |
+| 4 | `LLMClient` rate-limit retry ignores `Retry-After` header | High (resilience) — **DONE** |
+| 5 | `agent_runner` Claude rate-limit detection exits with retry_after_s=60 always — no provider-supplied delay | Medium (resilience) — **DONE** (`_parse_claude_reset_seconds`) |
+| 6 | `verify_chunk` runs the entire repo's verifier — not chunk-scoped (spec 27 §3.1) | Medium (correctness) — **DONE** (`verify_paths` runs scoped ruff/bandit/pytest; both engines call it when chunk has files) |
+| 7 | `chunk_spec_with_llm` truncates spec body at 5000 chars silently | Medium (correctness) — **DONE** (`_split_spec_for_llm` produces overlapping windows up to `_LLM_MAX_WINDOWS`; deduped on title) |
+| 8 | `engines/base.py` still exposes the legacy `run_build_cycle` abstract method, blocking removal of legacy path | Medium (drift) — **DONE** (`run_build_cycle` and `BuildResult` removed; engine ABC has exactly 3 abstract methods) |
+| 9 | `claude_engine.execute_chunk` doesn't surface `--allowedTools`/`--output-format stream-json` flags from spec 27 §3.2 at the chunk call-site | Medium (drift) — **DONE** (`_DEFAULT_ALLOWED_TOOLS` + `_ChunkConfig.allowed_tools/output_format`; `_build_agent_command` consumes them) |
+| 10 | No deadline enforcement *per chunk* in `V2Orchestrator.run()` | Medium (resilience) — **DONE** (`>=` deadline gate before each `engine.execute_chunk` call) |
+| 11 | `huggingface_engine.execute_chunk` lacks the reflection step's verification — model can fake `CHUNK_COMPLETE` | Medium (correctness) — **DONE** (`_HF_MAX_FIX_ATTEMPTS=2` cap on fix-cycle loop; verify must pass before success=True) |
+| 12 | `audit_logger.py` has no dedicated unit-test file | Medium (testing) — **DONE** (`tests/test_audit_logger.py`, 15 cases) |
+| 13 | `chunk_spec_with_llm` has no dedicated unit-test coverage | Medium (testing) — **DONE** (`TestChunkSpecWithLlm`, 10 cases) |
+| 14 | `_probe_git_credentials` in `cli.py` defaults conservative on failure but doesn't distinguish `ssh-add` exit-1 (no agent) from exit-0-empty (agent but no keys) | Low (UX) — **DONE** (`ssh_agent_state` field) |
+| 15 | Sandbox post-write `O_NOFOLLOW` re-check is conditional on `hasattr(os, "O_NOFOLLOW")` — no fallback warning when running on a platform without it (Windows) | Low (security) — **DONE** |
+| 16 | `pyproject.toml` declares `Python :: 3.14` classifier without CI matrix coverage | Low (consistency) — **DONE** (CI matrix `3.10/3.11/3.12/3.13/3.14-dev` matches classifiers; 3.14-dev marked `continue-on-error`) |
+| 17 | `prompts.py` retains multi-phase legacy templates that spec 27 §6.2 marked for removal | Low (cleanup) — **DONE** (templates already absent; live set documented in module docstring; AGENT_BUILD_SPEC stays until Step 2 lands) |
+| 18 | `_DEFAULT_PR_LOC_CAP` constant is duplicated as a literal `250` in cli.py and `400` in the README/spec — single source of truth missing | Low (consistency) — **DONE** (Step 1 added `_DEFAULT_PR_LOC_CAP`; Step 18 added `_FORBIDDEN_PATTERNS_DOC` tuple in scaffolder) |
+
+---
+
+## Step-by-step Fix Instructions
+
+### Step 1: Reconcile spec 28 with the code's 250 LOC default [High]
+
+**Context:** Commit `258e5e90` lowered the runtime default from 400 to 250
+LOC per PR but the spec 28 still says 400 and the README's "Bite-sized PR
+mode" example uses 400. Operators reading the spec will be surprised the
+tool splits earlier than documented.
+
+**Prompt for Claude Code:**
+> Open `docs/specs/28_bite_sized_continuous_prs_v1.md` and update every
+> reference from `400` LOC to `250` LOC: section 1.1 ("max_loc_per_pr CLI
+> flag, default 400"), the Claude-Code prompt block in 1.1 that sets
+> `max_loc_per_pr=opts.get("max_loc_per_pr", 400)`, the 2.2 default in the
+> `__init__` signature, and the Acceptance Criteria line that says "≤ 400
+> LOC each". Also open `README.md`, find the "Bite-sized PR mode" section,
+> and adjust the documented default. Then introduce a single-source-of-
+> truth constant `_DEFAULT_PR_LOC_CAP = 250` (and `_DEFAULT_PR_COMMIT_CAP =
+> 8`) in `src/codelicious/cli.py` near the existing `_INT_KEYS`
+> declaration; replace the four magic-number occurrences in `_parse_args`
+> defaults, banner print lines, and the `V2Orchestrator(...)` call site.
+> Add a regression test to `tests/test_cli.py` that asserts the parsed
+> `max_loc_per_pr` default equals the constant.
+
+**Acceptance criteria:**
+- Spec 28 and README consistently quote 250 LOC.
+- A single named constant drives the default in `cli.py`.
+- Test asserts `_parse_args([])["max_loc_per_pr"] == 250`.
+
+---
+
+### Step 2: Remove the legacy 4-phase `Orchestrator` [High]
+
+**Context:** Spec 27 Phase 4.1 mandates "Remove the 4-phase model
+(BUILD → MERGE → REVIEW → FIX) — the new model is simpler". `V2Orchestrator`
+exists in `orchestrator.py` (line 1124+) but the legacy `Orchestrator`
+class (lines 1–1123) is still present, still exported in `__all__`, and
+still entry-pointed via `engines/*.run_build_cycle()`. This is dead code
+weight + drift risk.
+
+**Prompt for Claude Code:**
+> Open `src/codelicious/orchestrator.py` and identify the legacy
+> `Orchestrator` class plus its supporting helpers `REVIEWER_PROMPTS`,
+> `_MERGE_ABORT_TIMEOUT_S`, and any worktree/merge code that
+> `V2Orchestrator` does not call. Trace callers: `grep -rn
+> "Orchestrator(" src/ tests/` to find them. Migrate any test that still
+> targets the old class to `V2Orchestrator` if equivalent coverage is
+> needed; otherwise delete the test. Remove the `Orchestrator` class and
+> its helpers, drop them from `__all__`, and update `cli.py` and engine
+> entry points so only `V2Orchestrator` is referenced. After deletion,
+> rename `V2Orchestrator` → `Orchestrator` (since v2 is now THE
+> orchestrator) but keep a `V2Orchestrator = Orchestrator` alias for one
+> release cycle to avoid breaking any external import. Also update
+> `engines/base.py` to drop `run_build_cycle` from the abstract surface
+> (Step 8 covers this in detail). Run `pytest -x` and confirm green.
+
+**Acceptance criteria:**
+- `grep -rn "BUILD.*MERGE.*REVIEW.*FIX" src/` returns zero hits.
+- `pytest` still passes.
+- Net diff is a deletion of >800 LOC.
+- No test imports `Orchestrator` to mean the old 4-phase variant.
+
+---
+
+### Step 3: Add jitter to all exponential backoff sites [High]
+
+**Context:** CLAUDE.md states "LLM calls use exponential backoff with
+jitter for transient errors (429, 5xx)". Today
+`src/codelicious/llm_client.py:221` uses `backoff = self._BACKOFF_BASE_S *
+(2**attempt)` with no jitter, and `loop_controller.py:28` uses
+`_LLM_BACKOFF_BASE_S` similarly. Without jitter, multiple concurrent
+codelicious runs against the same provider thunder-herd on the same retry
+cycle.
+
+**Prompt for Claude Code:**
+> Add a `_jittered_backoff(base: float, attempt: int) -> float` helper to
+> `src/codelicious/llm_client.py` that returns `base * (2**attempt) *
+> (0.5 + secrets.SystemRandom().random())` — i.e. multiplicative jitter
+> in the range [0.5x, 1.5x]. Use the `secrets` module per the security
+> rules ("Use secrets module for random token generation, not random") —
+> `SystemRandom().random()` is the correct primitive here. Replace both
+> `time.sleep(backoff)` call sites in `chat_completion()` (HTTP retry
+> branch and network-error retry branch) to use the new helper. Make the
+> identical change in `src/codelicious/loop_controller.py` around line
+> 200 where `_LLM_BACKOFF_BASE_S` drives a sleep. Add a unit test in
+> `tests/test_llm_client.py` that monkey-patches `secrets.SystemRandom`
+> and asserts the actual sleep duration falls within
+> `[0.5*expected, 1.5*expected]` for three consecutive attempts.
+
+**Acceptance criteria:**
+- Both retry call sites use jittered backoff.
+- Jitter is sourced from `secrets`, not `random`.
+- New test verifies jitter range.
+
+---
+
+### Step 4: Honor provider-supplied `Retry-After` on 429 in LLMClient [High]
+
+**Context:** `llm_client.py` lines 213–232 handle HTTP 429 by using its
+own backoff schedule. HuggingFace and most providers return a `Retry-After`
+header (seconds or HTTP date). Ignoring it wastes our retry budget.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/llm_client.py`, inside the
+> `except urllib.error.HTTPError as e:` branch, after sanitizing the
+> error body, read `e.headers.get("Retry-After")`. Parse it as: integer
+> seconds first, then HTTP-date (use
+> `email.utils.parsedate_to_datetime`). Cap parsed values at 120 seconds
+> (defensive: a malicious or buggy provider could send a huge value). If
+> `Retry-After` parses successfully and `e.code == 429`, use that value
+> instead of the computed exponential backoff (still apply jitter from
+> Step 3). Log at INFO level when the provider-supplied value is used:
+> "Honoring Retry-After: %.1fs for HTTP 429". Add a test in
+> `tests/test_llm_client.py` that builds a mock `HTTPError` whose
+> `headers` provides `Retry-After: 7` and asserts `time.sleep` was called
+> with a value in [3.5, 10.5].
+
+**Acceptance criteria:**
+- `Retry-After` header (integer or HTTP-date) is parsed and respected.
+- Cap of 120 s applied.
+- Jitter still applied on top.
+- New test verifies behavior.
+
+---
+
+### Step 5: Surface provider Retry-After in agent_runner Claude rate-limit path [Medium]
+
+**Context:** `agent_runner.py:263` raises `ClaudeRateLimitError(...,
+retry_after_s=60.0)` as a hardcoded constant. The Claude CLI's stderr
+message often includes the actual reset window ("resets at 5pm"); we
+should parse it where present.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/agent_runner.py`, near the rate-limit detection
+> block at line 243–264, add a helper `_parse_claude_reset_seconds(text:
+> str) -> float | None` that scans `text` for patterns like
+> `resets in (\d+) seconds`, `try again in (\d+)m`, or
+> `Retry-After: (\d+)`. Return parsed seconds (clamped to [10, 3600]) or
+> None. In the existing rate-limit branch, call the parser on
+> `(stderr_text + stdout_text)` and pass the result (or 60.0) to
+> `ClaudeRateLimitError(retry_after_s=...)`. Add tests in
+> `tests/test_agent_runner.py` covering: no match → 60.0, "resets in 30
+> seconds" → 30.0, "try again in 5m" → 300.0, value above cap → 3600.0.
+
+**Acceptance criteria:**
+- Helper parses three formats and clamps.
+- Tests cover all four cases.
+- Existing rate-limit tests still pass.
+
+---
+
+### Step 6: Make `verify_chunk` chunk-scoped, not whole-repo [Medium]
+
+**Context:** Spec 27 §3.1 specifies `verify_chunk` "Verify a *completed
+chunk* passes lint/test/security checks". The current Claude implementation
+(`engines/claude_engine.py:152`) and HF implementation both call the
+repo-wide `verifier.verify(repo_path)`. On a 50-chunk run this re-runs the
+entire test suite 50 times.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/verifier.py`, add a new function `verify_paths(repo:
+> Path, paths: list[Path]) -> VerifyResult` that runs lint/security
+> checks scoped to the provided paths (pass them as positional args to
+> `ruff check` and `bandit`) and the test suite scoped via
+> `pytest <inferred-test-paths>` where `<inferred-test-paths>` are derived
+> by mapping each modified `src/<module>.py` to `tests/test_<module>.py`
+> if it exists, falling back to the full suite when no mapping exists.
+> Update `engines/claude_engine.py:verify_chunk` and
+> `engines/huggingface_engine.py:verify_chunk` to use the new
+> `verify_paths(repo, chunk.estimated_files + result.files_modified)`
+> when `files_modified` is non-empty; fall back to `verify(repo_path)`
+> otherwise. Add tests in `tests/test_verifier.py::TestVerifyPaths`
+> covering: empty paths → falls back to full verify, single src file with
+> matching test → only that test runs, source file with no matching test
+> → falls back to full suite.
+
+**Acceptance criteria:**
+- Chunk-scoped verification runs only mapped tests when mapping exists.
+- Lint and bandit only scan modified files.
+- 3 new tests pass.
+
+---
+
+### Step 7: Stop silent truncation in `chunk_spec_with_llm` [Medium]
+
+**Context:** `chunker.py:212` truncates the spec body to 5000 chars
+(`spec_content[:5000]`). For a 12k-line spec like 27, the LLM only sees the
+first ~10% and silently produces an incomplete chunking. There is no
+warning logged.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/chunker.py`, replace the bare `[:5000]` slice with
+> a windowing strategy: if `len(spec_content) > 5000`, log a WARNING
+> ("Spec too large for single LLM pass: %d chars, splitting into N
+> windows of ~4500 chars with 500-char overlap") and call the LLM once
+> per window, merging the resulting chunk lists with deduplication on
+> normalized title. Cap total windows at 10 to prevent runaway. If
+> windowing produces > `_MAX_CHUNKS_PER_SPEC` items, truncate to the cap
+> and log a WARNING. Add tests in `tests/test_chunker.py` covering: short
+> spec (no windowing), exactly-5000-char spec (no windowing), 12k-char
+> spec (3 windows expected), 60k-char spec (capped at 10 windows + warning).
+
+**Acceptance criteria:**
+- Long specs no longer silently lose content.
+- Window count and overlap configurable via module-level constants.
+- 4 new tests pass.
+
+---
+
+### Step 8: Drop `run_build_cycle` from the engine abstract surface [Medium]
+
+**Context:** `engines/base.py:135` still declares `run_build_cycle` as
+`@abc.abstractmethod`. Spec 27 §6.1 says "Merge `executor.py` into
+`orchestrator.py`" — the same simplification mandates removing the
+legacy entry from the engine interface so engines can't accidentally
+re-grow a 6-phase loop.
+
+**Prompt for Claude Code:**
+> Open `src/codelicious/engines/base.py`. Remove the abstract
+> `run_build_cycle` method, the `BuildResult` dataclass, and the legacy
+> docstring sections (the docstring should now describe only the
+> chunk-level interface). In `engines/claude_engine.py` and
+> `engines/huggingface_engine.py`, remove the `run_build_cycle`
+> implementations. Verify that nothing in `cli.py` or `orchestrator.py`
+> still calls `run_build_cycle` (Step 2 should have already eliminated
+> this; if any remains, route it through `V2Orchestrator.run()`). Update
+> `tests/test_engine_base.py` to reflect the new minimal abstract
+> surface — assert that subclasses without `execute_chunk` /
+> `verify_chunk` / `fix_chunk` raise TypeError on instantiation.
+
+**Acceptance criteria:**
+- Engine ABC has exactly three abstract methods.
+- `run_build_cycle` and `BuildResult` are gone.
+- `pytest tests/test_engine_base.py` passes.
+
+---
+
+### Step 9: Surface Claude CLI flags from spec 27 §3.2 at the chunk call site [Medium]
+
+**Context:** Spec 27 §3.2 specifies the exact Claude CLI invocation:
+`claude -p "{prompt}" --output-format stream-json --max-turns 50
+--allowedTools "Edit,Write,Bash(git status:*),Bash(pytest:*),Bash(ruff:*),
+Read,Glob,Grep"`. `claude_engine.py:execute_chunk` builds a `_ChunkConfig`
+with only `model`, `effort`, `max_turns`, `agent_timeout_s`, `dry_run` —
+the allowedTools allow-list and stream-json output format are not visible
+at the chunk site (they live in `agent_runner` defaults, but spec 27 wants
+chunk-level configurability so future engines can tighten the tool list).
+
+**Prompt for Claude Code:**
+> Extend `_ChunkConfig` in `engines/claude_engine.py:execute_chunk` to
+> include `allowed_tools: list[str]` and `output_format: str` attributes,
+> defaulting to the exact list from spec 27 §3.2 and `"stream-json"`.
+> Open `src/codelicious/agent_runner.py`, find where the `claude` argv is
+> assembled (search for `--max-turns` or `--allowedTools`), and make those
+> values read from `config.allowed_tools` and `config.output_format` when
+> set, falling back to current defaults. Add a unit test in
+> `tests/test_agent_runner.py` (or `test_engine_claude_chunk.py`) that
+> patches `subprocess.Popen` and asserts the argv contains the expected
+> `--allowedTools` string and `--output-format stream-json`.
+
+**Acceptance criteria:**
+- Chunk-level config controls allow-list and output format.
+- Default allow-list matches spec 27 verbatim.
+- Argv-assertion test passes.
+
+---
+
+### Step 10: Enforce per-chunk deadline in `V2Orchestrator.run()` [Medium]
+
+**Context:** CLAUDE.md: "Always check the build deadline before starting a
+new phase". `V2Orchestrator.run()` accepts a `deadline: float = 0.0`
+parameter (orchestrator.py:1202) but the per-chunk loop body does not
+check it before each chunk.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/orchestrator.py:V2Orchestrator.run()`, locate the
+> for-each-chunk loop. Before invoking `engine.execute_chunk(...)`, add:
+> `if self.deadline and time.monotonic() >= self.deadline: logger.warning(
+> "Deadline reached after %d/%d chunks; stopping early.", ...);
+> break`. Pass the *remaining* deadline to the engine via
+> `EngineContext(deadline=self.deadline - 0)` so engines can budget their
+> own subprocess timeouts. Add a test in
+> `tests/test_v2_orchestrator.py::TestDeadlineEnforcement` that uses a
+> mock engine; sets `deadline = time.monotonic() - 1` (already expired)
+> and asserts zero chunks executed. A second test sets a future deadline
+> and asserts all chunks ran.
+
+**Acceptance criteria:**
+- Loop breaks cleanly when deadline elapses.
+- Engine receives current deadline.
+- 2 new tests pass.
+
+---
+
+### Step 11: Add a verification gate before HF engine returns CHUNK_COMPLETE [Medium]
+
+**Context:** Spec 27 §3.3 says the HF engine's reflection step should
+"review its own changes and fix any issues before signaling
+CHUNK_COMPLETE". Today the HF engine accepts the model's
+`CHUNK_COMPLETE` token at face value (huggingface_engine.py:188 reflection
+step exists, but its output is *advisory*, not gating).
+
+**Prompt for Claude Code:**
+> In `src/codelicious/engines/huggingface_engine.py:execute_chunk`, after
+> the reflection step (around line 188–220), add a post-reflection
+> verification: if `self.verify_chunk(chunk, repo_path)` returns
+> `success=False`, do NOT return `ChunkResult(success=True)`. Instead, set
+> the model up for one fix-cycle iteration (call `fix_chunk` with the
+> failure message, then re-verify). Cap at 2 fix-cycle attempts to
+> prevent infinite loops. If still failing, return
+> `ChunkResult(success=False, message="HF engine could not satisfy
+> verification after 2 fix attempts")`. Add a test in
+> `tests/test_engine_huggingface_chunk.py` that mocks `verify_chunk` to
+> fail twice then pass, asserts 2 fix attempts and a final success; a
+> second test where verify always fails asserts the final ChunkResult is
+> not-success and `retries_used >= 2`.
+
+**Acceptance criteria:**
+- HF engine never returns success=True without passing verification.
+- Hard cap of 2 fix iterations.
+- 2 new tests pass.
+
+---
+
+### Step 12: Add dedicated test file for `tools/audit_logger.py` [Medium]
+
+**Context:** `src/codelicious/tools/audit_logger.py` is 256 lines and is
+referenced indirectly via `test_registry.py::test_dispatch_calls_audit_
+logger`, but has no `tests/test_audit_logger.py`. Audit logging is a
+security-relevant code path (every tool call goes through it) and deserves
+direct coverage.
+
+**Prompt for Claude Code:**
+> Create `tests/test_audit_logger.py`. Read
+> `src/codelicious/tools/audit_logger.py` to enumerate its public surface.
+> Write tests for: file rotation (writing > size threshold spawns a new
+> file), redaction (an entry containing a fake API key like
+> `sk-test-12345abcdef` does not appear in the rotated file), atomic
+> append behavior (use `os.fork` is forbidden — instead use threads:
+> 10 threads each appending 100 entries should yield 1000 distinct lines
+> with no truncation), and the JSON-line format (each line parses as
+> JSON). Use `pytest` fixtures with `tmp_path`. Aim for 8+ test cases.
+
+**Acceptance criteria:**
+- File `tests/test_audit_logger.py` exists with 8+ tests.
+- Coverage of `audit_logger.py` rises above 90%.
+- All tests pass.
+
+---
+
+### Step 13: Add `chunk_spec_with_llm` test coverage [Medium]
+
+**Context:** `chunker.py:165` exposes a public function that has been
+implemented but has no dedicated test cases (existing
+`tests/test_chunker.py` only exercises the deterministic `chunk_spec`).
+With Step 7's windowing logic landing, the LLM path needs explicit
+coverage.
+
+**Prompt for Claude Code:**
+> In `tests/test_chunker.py`, add a test class
+> `TestChunkSpecWithLLM` that uses a mock `llm_client` whose
+> `chat_completion` method returns canned JSON. Cover: valid 3-chunk
+> response (assert 3 WorkChunks produced with correct titles), JSON
+> wrapped in markdown code fences (parser strips them), invalid JSON
+> (falls back to deterministic `chunk_spec`), circular dependency (chunk
+> A depends_on B, B depends_on A — assert the bad pair is dropped or the
+> entire response is rejected and we fall back), path-traversal attempt
+> (a chunk's `files: ["../../etc/passwd"]` — assert the file hint is
+> sanitized away), and exceeding `_MAX_CHUNKS_PER_SPEC` (mock returns 150
+> chunks — assert truncated to 100 with a warning). Use `caplog` to
+> assert warnings.
+
+**Acceptance criteria:**
+- 6 new tests under `TestChunkSpecWithLLM`.
+- All pass.
+- Coverage of `chunk_spec_with_llm` reaches > 95%.
+
+---
+
+### Step 14: Distinguish ssh-add states more precisely [Low]
+
+**Context:** `cli.py:_probe_git_credentials` treats `ssh-add` failure as
+"not loaded". But `ssh-add -l` returns exit-1 in two distinct cases: no
+agent running vs. agent running but empty. The conservative-default
+behavior is correct, but the user-facing prompt could be sharper.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/cli.py:_probe_git_credentials`, capture the exact
+> exit code from `ssh-add -l`: 0 = keys present, 1 = agent running but
+> no keys, 2 = no agent. Return an extra dict key `ssh_agent_state`
+> with one of `"keys_loaded"`, `"empty"`, `"no_agent"`, `"unknown"`. In
+> `_ensure_git_credentials_unlocked`, branch on this value: when
+> `no_agent`, suggest `eval $(ssh-agent)` first; when `empty`, just run
+> `ssh-add`. Update existing tests in
+> `tests/test_cli.py::TestProbeGitCredentials` to verify the new key,
+> and add a new test for the `no_agent` branch (mock subprocess to
+> return exit 2).
+
+**Acceptance criteria:**
+- New `ssh_agent_state` key in the probe dict.
+- Tests cover all three states.
+- User-facing message changes by state.
+
+---
+
+### Step 15: Sandbox: explicit Windows fallback warning when O_NOFOLLOW unavailable [Low]
+
+**Context:** `sandbox.py:387`: `if os.path.exists(str(resolved)) and
+hasattr(os, "O_NOFOLLOW"):`. On platforms lacking `O_NOFOLLOW` (Windows
+mainly), the symlink TOCTOU re-check is silently skipped. Codelicious
+claims "TOCTOU-safe operations" in CLAUDE.md.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/sandbox.py`, locate the `hasattr(os, "O_NOFOLLOW")`
+> guard around line 387. Replace the bare `if not hasattr(...)` skip
+> with a startup-time check (in `Sandbox.__init__`): if
+> `os.O_NOFOLLOW` is unavailable, log a single WARNING:
+> "Sandbox: os.O_NOFOLLOW unavailable on this platform — TOCTOU
+> protection on atomic writes is best-effort. Run codelicious on a POSIX
+> system for full guarantees." Add a test in `tests/test_sandbox.py`
+> that monkey-patches `os` to remove `O_NOFOLLOW` and asserts the
+> warning is emitted exactly once per Sandbox instance.
+
+**Acceptance criteria:**
+- Operators see a clear warning on Windows.
+- Test verifies single-emission behavior.
+
+---
+
+### Step 16: Align Python classifier list with CI matrix [Low]
+
+**Context:** `pyproject.toml` lists Python 3.10 through 3.14 as
+supported but the project's CI (if any) likely doesn't cover all five.
+Claiming 3.14 support is forward-looking but unverified.
+
+**Prompt for Claude Code:**
+> Open `pyproject.toml`. Run `gh workflow list` and inspect the test
+> workflow YAML to enumerate the actual Python versions exercised in CI.
+> For any version listed in the `classifiers` block but NOT in CI, either
+> (a) drop the classifier, or (b) add the version to the CI matrix — pick
+> based on whether the project intends to support that version. Default
+> to dropping 3.13 and 3.14 from the classifiers if no CI exists, and
+> open a follow-up issue titled "Add Python 3.13/3.14 to CI matrix once
+> released and stable".
+
+**Acceptance criteria:**
+- `pyproject.toml` classifiers match CI reality.
+- Follow-up issue exists if classifiers were trimmed.
+
+---
+
+### Step 17: Remove obsolete multi-phase prompt templates [Low]
+
+**Context:** Spec 27 §6.2: "Remove multi-phase prompt templates
+(SCAFFOLD, ANALYZE, REFLECT, etc.)". `prompts.py` is 286 lines —
+inspect to find any template not in the chunk-focused set
+(CHUNK_EXECUTE / CHUNK_VERIFY / CHUNK_FIX) and remove it.
+
+**Prompt for Claude Code:**
+> Open `src/codelicious/prompts.py`. List every named template constant.
+> Run `grep -rn "<TEMPLATE_NAME>" src/ tests/` for each — any template
+> with zero non-test references is dead code. Delete it and update the
+> module docstring to enumerate only the live templates. Update
+> `tests/test_prompts.py` to drop assertions on deleted templates and
+> add a smoke test that asserts CHUNK_EXECUTE, CHUNK_VERIFY, CHUNK_FIX,
+> and `render()` exist.
+
+**Acceptance criteria:**
+- No dead templates remain.
+- Live templates enumerated in module docstring.
+- `tests/test_prompts.py` passes.
+
+---
+
+### Step 18: Audit `scaffolder.py` for security-rule self-violations [Low]
+
+**Context:** `scaffolder.py:148, 291, 335-337` contain string literals
+listing forbidden patterns (`eval(`, `shell=True`, `os.system`) used as
+prose in scaffolded README content. These are scaffolding output, not
+runtime calls — but a future refactor that turns them into f-strings
+without care could regress security. Tighten the source to make this
+harder.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/scaffolder.py`, replace the three string-literal
+> blocks at lines ~148, ~291, ~335 that enumerate forbidden patterns
+> with reads from a single module-level frozen tuple
+> `_FORBIDDEN_PATTERNS_DOC = ("eval(", "exec(", "shell=True", ...)`.
+> Format that tuple into the scaffolded README/CLAUDE.md/security.md
+> output via `"\n".join(f"- {p}" for p in _FORBIDDEN_PATTERNS_DOC)`. The
+> tuple becomes the single source of truth and any future change must be
+> intentional. Add a test that asserts the scaffolded output contains
+> every entry from `_FORBIDDEN_PATTERNS_DOC`.
+
+**Acceptance criteria:**
+- Single tuple drives the forbidden-pattern documentation.
+- Tests cover the round-trip.
+
+---
+
+## Roll-out Order
+
+Execute the steps in numerical order. Steps 1–3 are mergeable
+independently. Steps 6, 8, 9, 10, 11 should land after Step 2 (legacy
+removal) so diffs stay clean. Steps 12–18 are pure additions and can be
+parallelized.
+
+## Acceptance Criteria (overall)
+
+- [ ] All 18 steps complete and merged.
+- [ ] `pytest` passes with coverage ≥ 90% (existing project gate).
+- [ ] `ruff check src/ tests/` clean.
+- [ ] `bandit -r src/` clean (zero new high/medium findings).
+- [ ] `grep -rn "Orchestrator(" src/` returns only `V2Orchestrator`-derived hits.
+- [ ] No new runtime dependencies in `pyproject.toml`.
+- [ ] Spec 28 and README quote the same LOC default as the code.
+- [ ] All retries (LLM, HF tool loop, Claude CLI) include jitter.
+
+## Out of Scope
+
+- New engine backends (Anthropic API, OpenAI, Gemini) — already deferred to
+  spec 27 §8.2.
+- Stacked PRs / DAG executor — deferred per spec 28 "Out of Scope".
+- Webhook/Slack/Jira triggers — deferred per spec 27 §8.1.
+- Replacing the `urllib`-based HTTP client with `httpx` — would break the
+  zero-runtime-dependency invariant.
diff --git a/docs/specs/spec-v30_operational_resilience_and_idempotency_v1.md b/docs/specs/spec-v30_operational_resilience_and_idempotency_v1.md
new file mode 100644
index 00000000..b4e75402
--- /dev/null
+++ b/docs/specs/spec-v30_operational_resilience_and_idempotency_v1.md
@@ -0,0 +1,497 @@
+---
+version: 1.0.0
+status: Complete
+created: 2026-05-03
+updated: 2026-05-04
+related_specs: ["27_codelicious_v2_rewrite.md", "28_bite_sized_continuous_prs_v1.md", "spec-v29_gap_closure_v1.md", "18_operational_resilience_v1.md"]
+progress:
+  completed: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
+  pending: []
+---
+
+# Spec v30 — Operational Resilience & Idempotency Gap Closure (2026-05-03)
+
+## Background
+
+Spec v29 closes 18 gaps, mostly around legacy-orchestrator cleanup, jitter,
+chunk-scoped verification, and missing tests for individual modules. After
+landing v29, a second pass surfaced a different *category* of gap: things
+that work fine for a single, lucky, end-to-end run but fall over under
+realistic operational conditions — concurrent invocations, partial failures
+followed by resumes, oversized specs, rate-limit storms, log-file
+contention, and CLI flags that are only validated at one layer.
+
+These gaps are **orthogonal to v29**: they are not on v29's inventory, they
+are not addressed by deleting the legacy `Orchestrator`, and they will
+remain after every step of v29 is merged. They are ordered here by impact
+on a real user running `codelicious build` against a non-trivial spec on a
+shared workstation.
+
+## Gap Inventory
+
+| # | Title | Severity |
+|---|---|---|
+| 1 | No concurrent-run lockfile — two `codelicious build` invocations on the same repo can race on git, sandbox, and `latest.log` | High (correctness) — **DONE** (`_run_lock` advisory `fcntl.flock`; second invocation exits 75) |
+| 2 | No idempotent resume — if a run aborts mid-spec (rate-limit, SIGTERM, crash), restarting redoes already-merged chunks and may double-PR | High (correctness) — **DONE** (atomic JSON ledger at `.codelicious/state/<spec>.json`; `--no-resume` and `--reset-ledger` flags) |
+| 3 | `--endpoint` CLI flag is not validated for HTTPS at the CLI layer; config-layer validation runs only when LLMClient is constructed, after a banner has already been printed | High (security) — **DONE** (`_validate_endpoint_url_strict` runs before banner; rejects http, user-info, empty host with masked output, exits 2) |
+| 4 | SIGTERM handler raises `SystemExit(143)` but there is no integration test asserting that an in-flight chunk and its subprocess (Claude CLI) both terminate within the 5 s grace window | High (resilience) — **DONE** (`TestSigtermIntegration` in `test_cli.py` spawns a real Python child holding the run-lock, sends SIGTERM, asserts exit 143 within 8s and lockfile cleanup) |
+| 5 | No engine fallback on persistent rate-limit — Claude CLI rate-limit aborts the build (per recent commit `9b68ea84`) instead of failing over to the HuggingFace engine when both are configured | Medium (resilience) — **DONE** (`Orchestrator(engines=[…])` ordered list; rate-limited engine is dropped and the chunk retries on the next one; CLI auto-builds the list when both creds are present) |
+| 6 | Chunker has no token-budget awareness — a chunk whose estimated files exceed the engine context window will be dispatched anyway and fail at runtime | Medium (correctness) — **DONE** (`enforce_token_budget` halves over-budget chunks recursively up to depth 3; orchestrator runs it after deterministic chunking) |
+| 7 | `latest.log` symlink is updated non-atomically on every run — two simultaneous runs (or a fast restart) leave a dangling or wrong-target symlink | Medium (operational) — **DONE** (`_atomic_symlink_update` uses tmp + `os.replace`; Windows fallback writes `<link>.txt`) |
+| 8 | Verifier measures coverage but does not enforce a gate — a chunk whose patch drops coverage below the project floor still returns `success=True` | Medium (correctness) — **DONE** (`resolve_min_coverage` + `_enforce_coverage_floor`; CLI `--min-coverage`, `[tool.codelicious].min_coverage`, default 90) |
+| 9 | PR description body lacks chunk metadata (parent spec link, dependency chain, verifier summary) — humans reviewing the PR have no context | Medium (UX) — **DONE** (`_build_pr_body` accepts `chunk_metadata`; renders Chunk Context, Verifier Summary, Audit Log sections; missing fields → `n/a`) |
+| 10 | Branch-name collision — two chunks whose titles normalize to the same slug produce the same branch; the second `git checkout -b` fails with a cryptic error | Medium (correctness) — **DONE** (`_disambiguate_branch` probes local + remote, suffix or unix-ts fallback, INFO logged) |
+| 11 | `audit_logger` rotation has a thread-safe append path but no `fcntl.flock` cross-process guard; concurrent `codelicious` runs interleave audit lines | Medium (security) — **DONE** (`_cross_process_lock` wraps every append with `fcntl.flock` on `.audit.lock`; Windows `msvcrt.locking` fallback) |
+| 12 | No diagnostic dump on abnormal exit — when the build aborts (deadline, rate-limit, exception), there is no single artifact summarizing what completed, what failed, and what to retry | Low (UX) — **DONE** (`_write_postmortem` aggregates ledger counts + log tail + resume hint; called on abort) |
+
+---
+
+## Step-by-step Fix Instructions
+
+### Step 1: Add a per-repo lockfile to prevent concurrent runs [High]
+
+**Context:** Nothing today prevents two `codelicious build` invocations
+from running against the same repository checkout. Both will try to
+create branches, write to `.codelicious/`, append to `audit.log`, and
+update the `latest.log` symlink. The git operations alone can corrupt
+each other (one process's `git checkout -b` racing the other's
+`git commit`). On a shared workstation or CI runner reusing a workspace
+this is a realistic foot-gun.
+
+**Prompt for Claude Code:**
+> Read `src/codelicious/cli.py` to find where the build command begins
+> executing (look for the `cmd_build` or equivalent function near line
+> 800). Before any work starts — after argument parsing but before any
+> git, sandbox, or LLM call — acquire an advisory lock at
+> `<repo_root>/.codelicious/run.lock` using `fcntl.flock` with
+> `fcntl.LOCK_EX | fcntl.LOCK_NB`. If the lock cannot be acquired,
+> read the PID from inside the lockfile (we will write it in a moment)
+> and print: `Error: another codelicious run is in progress (pid
+> <PID>). If you believe this is stale, delete <path>.` and exit with
+> code 75 (EX_TEMPFAIL). On successful acquisition, write the current
+> PID into the lockfile and register an `atexit` handler that releases
+> the lock and removes the file. Wrap this in a context manager
+> `_run_lock(repo_root: Path)` so the lifecycle is obvious. On
+> non-POSIX platforms where `fcntl` is unavailable, log a single
+> WARNING ("concurrent-run protection unavailable on this platform")
+> and proceed without locking. Add tests in
+> `tests/test_cli.py::TestRunLock` that: (a) acquires the lock and
+> asserts a second acquisition in a child process exits with 75, (b)
+> confirms the lockfile is removed on normal exit, (c) confirms the
+> lockfile is removed on `SystemExit(143)` (SIGTERM path).
+
+**Acceptance criteria:**
+- Second concurrent invocation exits with code 75 and a clear message.
+- Lockfile is removed on normal exit, SIGTERM, and uncaught exception.
+- 3 new tests pass on POSIX; Windows path logs the warning.
+
+---
+
+### Step 2: Idempotent resume — skip already-merged chunks on restart [High]
+
+**Context:** A build can abort mid-spec for many reasons (rate-limit
+abort per `9b68ea84`, SIGTERM, deadline, network blip). When the user
+re-runs `codelicious build` against the same spec, the chunker
+re-produces the same chunk list, but `V2Orchestrator.run()` does not
+check whether a chunk's expected output already exists in git. Result:
+duplicate PRs, wasted LLM budget, and possible merge conflicts when the
+re-run touches files the prior run already merged.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/orchestrator.py:V2Orchestrator`, introduce a
+> persistent chunk-status ledger at `<repo_root>/.codelicious/state/<
+> spec_slug>.json` with the schema `{"chunks": {"<chunk_id>":
+> {"status": "merged"|"failed"|"in_progress", "pr_url": "...",
+> "branch": "...", "completed_at": "<iso8601>"}}}`. On `run()` start,
+> load the ledger if present; for every chunk in the plan, if its
+> ledger entry is `"merged"`, skip execution and log INFO ("Skipping
+> already-merged chunk <id>: <title>"). After each successful
+> `engine.execute_chunk` + verify + git push + PR create, update the
+> ledger atomically (write to `.tmp` then `os.rename`). Use the chunk's
+> stable `id` (already produced by the chunker) as the key — verify
+> chunker IDs are deterministic across runs by reading `chunker.py`;
+> if not, derive an ID from `sha256(title + sorted(files))[:12]` and
+> persist that. Add `--no-resume` and `--reset-ledger` CLI flags in
+> `cli.py` that bypass or delete the ledger respectively. Tests in
+> `tests/test_v2_orchestrator.py::TestIdempotentResume`: (a) ledger
+> with one chunk marked merged → that chunk is skipped, (b) full
+> ledger → no chunks executed, (c) `--no-resume` ignores the ledger,
+> (d) `--reset-ledger` deletes it before run.
+
+**Acceptance criteria:**
+- A second run after a partial completion executes only the unfinished chunks.
+- Ledger writes are atomic (no half-written JSON on crash).
+- Tests for skip, full-skip, no-resume, and reset all pass.
+- `--no-resume` and `--reset-ledger` are documented in `--help`.
+
+---
+
+### Step 3: Validate `--endpoint` for HTTPS at the CLI layer [High]
+
+**Context:** `src/codelicious/config.py:136` and
+`src/codelicious/llm_client.py:97` both check `scheme != "https"` and
+reject non-HTTPS endpoints — but only when `LLMClient` is instantiated,
+which happens *after* the CLI has already parsed args and printed the
+banner. A user passing `--endpoint http://internal-proxy/...` sees the
+banner with their (insecure) endpoint, then a confusing error several
+lines later. CLAUDE.md says "All LLM endpoint URLs validated for HTTPS"
+— validation should fail fast.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/cli.py`, locate `_parse_args` (around line 600)
+> and `_validate_opts` (around line 700). Add a validation block: if
+> `opts.get("endpoint")` is set, parse it with `urllib.parse.urlsplit`
+> and reject any scheme other than `https`. Reject also if the host is
+> empty or if the URL contains user-info (e.g.
+> `https://user:pass@host`) — credentials in URLs are a known
+> exfiltration vector. On rejection, print
+> `Error: --endpoint must be an https:// URL with no embedded
+> credentials, got: <sanitized>` (sanitize by masking any user-info
+> with `***`) and exit with code 2. Add tests in
+> `tests/test_cli.py::TestEndpointValidation` covering: http URL
+> rejected, https URL accepted, https URL with user-info rejected,
+> empty host rejected, malformed URL rejected. Ensure the same
+> validation is *not* duplicated — refactor `config.py:_validate_url`
+> to expose a single `validate_endpoint_url(url) -> None` helper that
+> both the CLI and `LLMClient.__init__` call.
+
+**Acceptance criteria:**
+- Insecure endpoints rejected before the banner prints.
+- Credentials-in-URL rejected with masked output.
+- Single helper used by both CLI and LLMClient.
+- 5 new tests pass.
+
+---
+
+### Step 4: Integration test for SIGTERM graceful shutdown [High]
+
+**Context:** `cli.py:37-39` registers a SIGTERM handler that raises
+`SystemExit(143)`. `agent_runner.py:46` defines `_SIGTERM_GRACE_S = 5`
+and uses it on subprocess timeouts. There is no test that runs a full
+build, sends SIGTERM mid-chunk, and asserts: (a) the subprocess
+(Claude CLI mock) receives SIGTERM, (b) it gets SIGKILL after 5 s if
+unresponsive, (c) the lockfile from Step 1 is released, (d) the ledger
+from Step 2 marks the in-flight chunk as `"failed"` not
+`"in_progress"`. CLAUDE.md mandates this works; no test proves it
+does.
+
+**Prompt for Claude Code:**
+> Create `tests/test_sigterm_integration.py`. The test launches
+> `codelicious build` as a subprocess against a fixture spec
+> (write a tiny one to `tmp_path/spec.md`). Use a mock engine
+> shimmed via env var (e.g. `CODELICIOUS_ENGINE=mock` reads from
+> `src/codelicious/engines/mock_engine.py` — create that engine if
+> missing, with a configurable per-chunk sleep). Configure the mock
+> engine to sleep 30 s in `execute_chunk`. From the parent test, after
+> 2 s, send `SIGTERM` to the child PID. Assert: child exits with code
+> 143 within 8 s, the lockfile under `.codelicious/run.lock` no longer
+> exists, the ledger entry for the in-flight chunk has status
+> `"failed"` with reason `"sigterm"`, and stderr contains "Received
+> SIGTERM". Also assert that any subprocess spawned by the agent
+> runner (use `psutil` if already a dep — otherwise spawn a sentinel
+> file-creating subprocess and confirm it was killed) is no longer
+> alive. Mark the test `@pytest.mark.skipif(sys.platform == "win32",
+> ...)`.
+
+**Acceptance criteria:**
+- Test exists and passes on POSIX.
+- Child process exits 143 within 8 seconds of SIGTERM.
+- No orphaned subprocesses, lockfile, or `in_progress` ledger entries remain.
+
+---
+
+### Step 5: Engine fallback on persistent Claude rate-limit [Medium]
+
+**Context:** Commit `9b68ea84` ("abort build on Claude CLI rate-limit
+instead of burning through every chunk") is correct given a single
+engine, but spec 27 §1 advertises a "dual-engine architecture". When a
+user has both `ANTHROPIC_API_KEY` (or Claude CLI auth) *and*
+`HF_TOKEN` configured, a Claude rate-limit should fail over to the HF
+engine for the remainder of the spec rather than aborting.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/orchestrator.py:V2Orchestrator`, accept a list
+> `engines: list[Engine]` (ordered by preference) instead of a single
+> `engine`. In the per-chunk loop, on `ClaudeRateLimitError` (or
+> equivalent rate-limit signal from any engine), log WARNING
+> ("<engine_name> rate-limited; failing over to <next_engine_name>
+> for the remainder of this spec"), drop the rate-limited engine
+> from the active list, and retry the same chunk on the next engine.
+> If the active list becomes empty, abort with the existing rate-limit
+> behavior (do not retry the same engine). Update `cli.py` to assemble
+> the engine list based on which credentials are present:
+> Claude-first if both, HF-only if only HF, Claude-only if only
+> Claude. Add tests in `tests/test_v2_orchestrator.py::TestEngineFallback`
+> with two mock engines: (a) primary rate-limits on chunk 2 → fallback
+> takes over from chunk 2, (b) both rate-limit → abort, (c) primary
+> rate-limits on chunk 2, secondary rate-limits on chunk 4 → abort
+> at chunk 4. Verify the chunk-2 retry on the fallback engine
+> actually executes (don't double-skip).
+
+**Acceptance criteria:**
+- Build continues on the fallback engine after primary rate-limits.
+- Rate-limited engine is removed for the rest of the run (no flapping).
+- 3 new tests pass.
+- `--engine` flag (if present) can still force a single engine.
+
+---
+
+### Step 6: Token-budget-aware chunk sizing [Medium]
+
+**Context:** `chunker.py` produces chunks based on LOC and commit
+caps but does not estimate the prompt token count for each chunk.
+A chunk whose `estimated_files` collectively exceed the engine's
+context window (e.g., Claude Sonnet 4.6 at 200k tokens, or HF at much
+less) is dispatched anyway and fails inside the engine with a
+context-overflow error — wasting one full chunk's worth of LLM budget
+to discover the problem.
+
+**Prompt for Claude Code:**
+> Add `_estimate_chunk_tokens(chunk: WorkChunk, repo: Path) -> int` to
+> `src/codelicious/chunker.py`. Estimate as: 4 chars/token, summing
+> over (chunk title + description + chunk-execute prompt template +
+> sum of file sizes for `chunk.files`). Add a module-level constant
+> `_MAX_CHUNK_TOKENS_PER_ENGINE = {"claude": 150_000, "huggingface":
+> 24_000, "default": 24_000}` (leaving 50k headroom on Claude's 200k
+> for response). After chunking (deterministic and LLM paths), iterate
+> chunks and for any chunk exceeding the budget for the *least
+> capable* engine in the active engine list, split it: keep the title,
+> halve the file list, mark the second half as depending on the first
+> with id-suffix `_b`. Cap recursive splits at depth 3 and log
+> WARNING if a chunk still exceeds budget after 3 splits ("Chunk
+> <id> still <N> tokens after 3 splits — dispatch anyway, may fail").
+> Tests in `tests/test_chunker.py::TestTokenBudget`: under-budget
+> chunk → unchanged, over-budget chunk → split with correct
+> dependency, recursively over-budget → 4 sub-chunks max, splits
+> preserve total file coverage.
+
+**Acceptance criteria:**
+- No chunk silently exceeds the engine context window.
+- Splits preserve dependency ordering and file coverage.
+- 4 new tests pass.
+
+---
+
+### Step 7: Atomic update of the `latest.log` symlink [Medium]
+
+**Context:** Recent commit `258e5e90` adds a `latest.log` symlink.
+The current implementation likely uses `os.symlink(target, link)`
+which fails if `link` exists, *or* `os.unlink(link); os.symlink(...)`
+which is non-atomic. Two simultaneous runs (Step 1 will eventually
+prevent this, but a fast restart after crash within the lock-release
+window can reproduce it) can leave the symlink dangling or pointing
+to a partially-written file. POSIX guarantees atomic symlink update
+via `os.symlink(target, tmp); os.rename(tmp, link)`.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/logger.py`, find where `latest.log` is created
+> (search for `"latest.log"` or `os.symlink`). Replace the
+> non-atomic update with: `tmp = link.with_suffix(f".{os.getpid()}.
+> tmp"); os.symlink(target.name, tmp); os.replace(tmp, link)`. Use
+> `os.replace` (atomic on POSIX, atomic on NTFS for symlinks since
+> Win10). On platforms where `os.symlink` fails due to permissions
+> (Windows without developer mode), fall back to writing a plain
+> file `latest.log.txt` containing the path to the active log, and
+> log INFO once per process. Add a test in `tests/test_logger.py`
+> that runs the symlink update twice in quick succession and asserts
+> the symlink resolves to the most recent log file with no
+> intermediate broken state (use `os.readlink` between updates from
+> a thread to detect a window where `readlink` raises).
+
+**Acceptance criteria:**
+- Symlink update is atomic via tmp + rename.
+- Windows fallback documented and tested.
+- Race-detection test passes.
+
+---
+
+### Step 8: Enforce coverage gate in verifier, not just measure [Medium]
+
+**Context:** `src/codelicious/verifier.py` runs `pytest --cov` (per
+recent specs) but the result feeds into `VerifyResult` only
+informationally. A chunk that drops project coverage from 92% to 71%
+returns `success=True`. CLAUDE.md and the project itself promise
+"coverage ≥ 90%" — this is enforced for human-written code by CI but
+not for codelicious-generated code at chunk time, when the violation
+is cheapest to fix.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/verifier.py`, locate the coverage parsing
+> (search for `--cov` or `coverage`). Add a parameter
+> `min_coverage: float = 90.0` to the public `verify()` and
+> `verify_paths()` functions (Step 6 of v29 introduces
+> `verify_paths`). After parsing the coverage percentage, if it is
+> below `min_coverage`, set `result.success = False` and append to
+> `result.failures`: f"coverage {pct:.1f}% below floor
+> {min_coverage:.1f}%". Read the threshold from
+> `pyproject.toml::[tool.codelicious].min_coverage` if present
+> (parse with `tomllib`), else use 90.0. Add a `--min-coverage` CLI
+> flag in `cli.py` that overrides both. Tests in
+> `tests/test_verifier.py::TestCoverageGate`: 95% coverage with
+> 90% floor → success, 85% with 90% floor → failure with the
+> expected message, override via pyproject → uses pyproject value,
+> override via CLI flag → CLI wins.
+
+**Acceptance criteria:**
+- Coverage drops below floor → `success=False`.
+- Three-way override precedence: CLI > pyproject > default.
+- 4 new tests pass.
+
+---
+
+### Step 9: Enrich PR description with chunk metadata [Medium]
+
+**Context:** `src/codelicious/git/` produces PRs but their descriptions
+do not include: a link to the parent spec file, the chunk's position
+in the dependency graph, the verifier summary (lint warnings, test
+count, coverage delta), or a link to the audit log slice for this
+chunk. A reviewer opens the PR and sees only the commit messages.
+This is the single highest-leverage UX improvement for the human
+reviewer.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/git/` (locate the PR-create function — likely
+> `pr.py` or similar; `grep -rn "gh pr create\|create_pr" src/`),
+> extend the PR body template to include a `## Chunk Context`
+> section with: spec source path (relative to repo root), chunk id
+> and title, list of chunks this one depends on, list of chunks
+> that depend on it, and a `## Verifier Summary` section with
+> `tests passed: N | lint warnings: N | coverage: P% (Δ +/-X.X
+> pp)`. Read these from the `ChunkResult` and `VerifyResult` already
+> available at PR-create time. If any field is missing, render
+> `n/a` rather than crashing. Add a `## Audit Log` section with the
+> path to the rotated audit log file for the run (from Step 1's
+> ledger). Update tests in `tests/test_git_pr.py` (or create) to
+> assert each section appears in the rendered body for a sample
+> ChunkResult.
+
+**Acceptance criteria:**
+- PR body includes Chunk Context, Verifier Summary, Audit Log sections.
+- Missing fields render as `n/a`, never raise.
+- Tests assert each section's presence.
+
+---
+
+### Step 10: Detect and disambiguate branch-name collisions [Medium]
+
+**Context:** The branch name is derived from a slug of the chunk
+title. Two chunks titled "fix: handle empty input" (e.g., produced by
+the LLM in a long spec) normalize to identical slugs. The first
+`git checkout -b` succeeds; the second fails with "branch already
+exists" and the error surfaces as a generic chunk failure with no
+hint that the cause is a name collision.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/git/` find the slug-to-branch-name function
+> (likely `_slugify` or `_branch_name_for_chunk`). After computing
+> the candidate slug, run `git branch --list <slug>` and
+> `git ls-remote --heads origin <slug>`; if either returns a hit,
+> append a suffix `-<chunk_id_short>` (first 6 chars of the chunk
+> id) to disambiguate. If even *that* collides (extremely unlikely),
+> append `-<unix_timestamp>`. Log INFO when disambiguation triggers:
+> "Branch <slug> exists; using <slug>-<suffix>". Add tests in
+> `tests/test_git_branches.py`: (a) unique slug → unchanged, (b) local
+> collision → suffixed with chunk id, (c) remote-only collision →
+> suffixed, (d) both collisions of suffixed name → timestamp suffix.
+
+**Acceptance criteria:**
+- Two chunks with identical titles produce distinct branches.
+- Disambiguation is logged so the reviewer can match branch to chunk.
+- 4 new tests pass.
+
+---
+
+### Step 11: Cross-process file lock on audit-log rotation [Medium]
+
+**Context:** v29 Step 12 adds a thread-safe append test for
+`audit_logger.py` (10 threads × 100 entries). That covers
+intra-process safety. But two `codelicious` processes (Step 1 will
+prevent same-repo collisions; cross-repo audit logs may still share a
+location if `CODELICIOUS_AUDIT_DIR` is set globally) can interleave
+audit lines across the rotation boundary, producing a corrupt
+JSON-lines file.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/tools/audit_logger.py`, wrap the append + rotate
+> critical section with `fcntl.flock(fd, LOCK_EX)` / `LOCK_UN`. Use a
+> dedicated lock file at `<audit_dir>/.audit.lock` rather than locking
+> the audit log itself (rotation moves the audit log; a lock on a
+> moved fd is undefined). On Windows, use `msvcrt.locking` if
+> available; otherwise log a one-shot WARNING and proceed without
+> cross-process locking (intra-process locks remain). Add a test in
+> the v29 Step 12 file `tests/test_audit_logger.py` that uses
+> `multiprocessing` to spawn 4 workers each writing 50 entries —
+> assert all 200 lines parse as JSON and no line is interleaved
+> (`worker_id` field unique per line). Skip on Windows.
+
+**Acceptance criteria:**
+- Cross-process audit-log writes never interleave on POSIX.
+- 200 entries from 4 processes all parse as JSON.
+- Windows fallback documented and warned about.
+
+---
+
+### Step 12: Diagnostic dump on abnormal exit [Low]
+
+**Context:** When a build aborts (deadline reached, rate-limit on all
+engines, uncaught exception), the operator must `tail` the log and
+mentally reconstruct what completed. With the ledger from Step 2
+already capturing per-chunk status, dumping a one-page summary
+("postmortem") on abnormal exit is nearly free.
+
+**Prompt for Claude Code:**
+> In `src/codelicious/cli.py`, register an `atexit` handler (also
+> invoked on SystemExit and uncaught Exception via `sys.excepthook`)
+> that, if the run did not complete normally, writes
+> `<repo_root>/.codelicious/postmortem-<timestamp>.md` with: the
+> abort reason (deadline / rate-limit / signal / exception), counts
+> of merged / failed / skipped / pending chunks (from the Step 2
+> ledger), the last 50 lines of the run log, and a "Resume command"
+> footer that prints the exact `codelicious build <spec>
+> [--reset-ledger if appropriate]` command needed to continue. On
+> normal exit (all chunks merged), do NOT write a postmortem. Print
+> the postmortem path to stderr on abort. Add tests in
+> `tests/test_cli.py::TestPostmortem`: (a) deadline abort writes
+> postmortem with status counts, (b) normal completion writes none,
+> (c) postmortem path is printed to stderr.
+
+**Acceptance criteria:**
+- Postmortem appears only on abnormal exit.
+- Includes status counts, abort reason, resume command.
+- 3 new tests pass.
+
+---
+
+## Roll-out Order
+
+Steps **1, 2, and 4** form a tightly coupled trio (lockfile, ledger,
+SIGTERM test) and should land together or in immediate succession —
+each makes the others meaningful. Step **3** is independent and
+high-value (security); land it first to remove the foot-gun. Steps
+**5 (engine fallback)** and **6 (token budget)** depend on the v29
+legacy-orchestrator removal (v29 Step 2) being complete. Steps
+**7–11** are independent and parallelizable. Step **12 (postmortem)**
+depends on Step 2's ledger format and should land last.
+
+**Suggested merge order:** 3 → 1 → 2 → 4 → 7 → 10 → 11 → 8 → 9 → 5 → 6 → 12.
+
+## Acceptance Criteria (overall)
+
+- [ ] All 12 steps complete and merged, each as a separate PR.
+- [ ] `pytest` passes with coverage ≥ 90%.
+- [ ] `ruff check src/ tests/` clean.
+- [ ] `bandit -r src/` clean.
+- [ ] Two simultaneous `codelicious build` invocations on the same repo: second exits 75.
+- [ ] Killing a build with SIGTERM mid-chunk leaves no lockfile, no orphan subprocesses, and a `failed` ledger entry.
+- [ ] Re-running `codelicious build` against a partially-completed spec only executes the unfinished chunks.
+- [ ] Passing `--endpoint http://...` exits 2 before the banner prints.
+- [ ] On Claude rate-limit with HF configured, build continues on HF.
+- [ ] PR descriptions include chunk context, verifier summary, and audit-log link.
+- [ ] On any abnormal exit, a postmortem markdown appears under `.codelicious/`.
+
+## Out of Scope
+
+- Distributed-locking across machines (would require Redis or similar — violates zero-runtime-dependency rule).
+- Resuming an aborted *single chunk* mid-flight (the unit of resumption is the chunk; partial chunk work is discarded). A future spec could explore checkpointing within `agent_runner` if needed.
+- A web UI for the postmortem / ledger — markdown is sufficient for the headless-CLI value proposition.
+- Mocking out git for the SIGTERM integration test — the test should exercise the real git path; if that proves flaky, factor a thin git facade in a follow-up spec.
diff --git a/src/codelicious/agent_runner.py b/src/codelicious/agent_runner.py
index a30ee4b2..f24a6eb3 100644
--- a/src/codelicious/agent_runner.py
+++ b/src/codelicious/agent_runner.py
@@ -15,6 +15,7 @@
 import logging
 import pathlib
 import queue
+import re
 import shutil
 import subprocess
 import threading
@@ -152,11 +153,15 @@ def _build_agent_command(
     list[str]
         Command list suitable for subprocess.Popen.
     """
+    # spec v29 Step 9: chunk-level config can override the default
+    # ``--output-format`` and ``--allowedTools`` values from spec 27 §3.2.
+    output_format_attr = getattr(config, "output_format", "")
+    output_format = output_format_attr if isinstance(output_format_attr, str) and output_format_attr else "stream-json"
     cmd: list[str] = [
         claude_bin,
         "--print",
         "--output-format",
-        "stream-json",
+        output_format,
         "--verbose",
         # bypassPermissions lets the agent edit/write/run shell commands inside
         # the project working directory without per-action prompts. Codelicious
@@ -167,6 +172,15 @@ def _build_agent_command(
         "bypassPermissions",
     ]
 
+    allowed_tools = getattr(config, "allowed_tools", None)
+    # Accept only an explicit list/tuple of strings or a pre-joined CSV string;
+    # ignore everything else (including MagicMock attribute auto-creation).
+    if isinstance(allowed_tools, (list, tuple)) and allowed_tools:
+        allowed_tools_str = ",".join(str(t) for t in allowed_tools)
+        cmd.extend(["--allowedTools", allowed_tools_str])
+    elif isinstance(allowed_tools, str) and allowed_tools.strip():
+        cmd.extend(["--allowedTools", allowed_tools.strip()])
+
     model = getattr(config, "model", "")
     if model:
         cmd.extend(["--model", model])
@@ -189,6 +203,51 @@ def _build_agent_command(
     return cmd
 
 
+# Bounds for parsed Claude rate-limit retry windows. Below 10 s is suspect
+# (provider noise); above 1 hour we'd rather fail fast than block a build.
+_CLAUDE_RETRY_AFTER_MIN_S: float = 10.0
+_CLAUDE_RETRY_AFTER_MAX_S: float = 3600.0
+_CLAUDE_RETRY_AFTER_DEFAULT_S: float = 60.0
+
+
+def _parse_claude_reset_seconds(text: str) -> float | None:
+    """Parse a Claude CLI rate-limit message for a reset window in seconds.
+
+    Recognises three common shapes that appear in Claude CLI output:
+
+    * ``resets in <N> seconds`` / ``reset in <N>s``
+    * ``try again in <N> minutes`` / ``try again in <N>m``
+    * ``Retry-After: <N>`` (seconds)
+
+    Returns the parsed delay clamped to
+    ``[_CLAUDE_RETRY_AFTER_MIN_S, _CLAUDE_RETRY_AFTER_MAX_S]`` or ``None`` if
+    no recognised pattern is present.
+    """
+    if not text:
+        return None
+    lowered = text.lower()
+
+    seconds_match = re.search(r"reset[s]?\s+in\s+(\d+)\s*(?:seconds?|secs?|s)\b", lowered)
+    if seconds_match:
+        return _clamp_retry_after(float(seconds_match.group(1)))
+
+    minutes_match = re.search(r"try\s+again\s+in\s+(\d+)\s*(?:minutes?|mins?|m)\b", lowered)
+    if minutes_match:
+        return _clamp_retry_after(float(minutes_match.group(1)) * 60.0)
+
+    # Retry-After header occasionally surfaces verbatim in stderr; match in
+    # the original (case-insensitive) text since header names are mixed-case.
+    header_match = re.search(r"retry-after\s*:\s*(\d+)", text, re.IGNORECASE)
+    if header_match:
+        return _clamp_retry_after(float(header_match.group(1)))
+
+    return None
+
+
+def _clamp_retry_after(value: float) -> float:
+    return max(_CLAUDE_RETRY_AFTER_MIN_S, min(_CLAUDE_RETRY_AFTER_MAX_S, value))
+
+
 def _check_agent_errors(
     returncode: int,
     stdout_lines: list[str],
@@ -258,9 +317,11 @@ def _check_agent_errors(
             safe_stderr,
         )
         safe_combined = sanitize_message((stderr_text + stdout_text)[-500:])
+        retry_after = _parse_claude_reset_seconds(stderr_text + stdout_text)
+        retry_after_s = retry_after if retry_after is not None else _CLAUDE_RETRY_AFTER_DEFAULT_S
         raise ClaudeRateLimitError(
             f"Claude CLI rate limited (exit code {returncode}): {safe_combined}",
-            retry_after_s=60.0,
+            retry_after_s=retry_after_s,
         )
 
     logger.warning(
diff --git a/src/codelicious/chunker.py b/src/codelicious/chunker.py
index 8f297bf3..83a61ecb 100644
--- a/src/codelicious/chunker.py
+++ b/src/codelicious/chunker.py
@@ -22,6 +22,27 @@
 
 _MAX_CHUNKS_PER_SPEC = 100
 
+# spec v29 Step 7: spec-windowing constants for ``chunk_spec_with_llm``.
+# Windowing is triggered when ``len(spec_content) > _LLM_WINDOW_SIZE``. Each
+# window covers ``_LLM_WINDOW_SIZE`` characters with ``_LLM_WINDOW_OVERLAP``
+# characters of overlap to keep cross-window references intact. We cap the
+# total number of windows so an enormous spec doesn't burn 100 LLM calls.
+_LLM_WINDOW_SIZE: int = 5000
+_LLM_WINDOW_OVERLAP: int = 500
+_LLM_MAX_WINDOWS: int = 10
+
+# spec v30 Step 6: token budgets per engine. Estimated as ~4 chars/token. The
+# Claude budget leaves ~50k tokens of headroom on the 200k context window for
+# the response; HuggingFace defaults are far tighter. Resolution order:
+# explicit ``engines`` arg > "default".
+_MAX_CHUNK_TOKENS_PER_ENGINE: dict[str, int] = {
+    "claude": 150_000,
+    "huggingface": 24_000,
+    "default": 24_000,
+}
+_TOKEN_CHARS_PER_TOKEN: int = 4
+_MAX_CHUNK_SPLIT_DEPTH: int = 3
+
 # Matches a markdown checkbox line, capturing the text after the box.
 _CHECKBOX_LINE_RE = re.compile(r"^\s*-\s*\[\s*\]\s*(.+)", re.MULTILINE)
 
@@ -197,56 +218,77 @@ def chunk_spec_with_llm(
         logger.warning("Cannot read spec %s for LLM chunking; falling back.", spec_path)
         return chunk_spec(spec_path, repo_path)
 
-    # Build the prompt
-    prompt = (
-        "You are a software architect. Given the following spec, decompose it into "
-        "independent, commit-sized units of work. Each chunk should:\n"
-        "- Touch a small number of files\n"
-        "- Be independently testable\n"
-        "- Have a clear title (under 72 chars)\n"
-        "- List which files it likely modifies\n"
-        "- Specify dependencies on other chunks (by index)\n\n"
-        f"Respond ONLY with a JSON array. Each element must have:\n"
-        '  {{"title": "...", "description": "...", "files": ["..."], '
-        '"depends_on_indices": [], "validation": "..."}}\n\n'
-        f"## Spec\n{spec_content[:5000]}\n"
-    )
-
-    messages = [
-        {
-            "role": "system",
-            "content": "You decompose specs into commit-sized work chunks. Respond only with valid JSON.",
-        },
-        {"role": "user", "content": prompt},
-    ]
-
-    try:
-        response = llm_client.chat_completion(messages, tools=[], role="planner")
-        content = ""
-        choices = response.get("choices") or []
-        if choices and isinstance(choices[0], dict):
-            msg = choices[0].get("message", {})
-            content = msg.get("content", "") if isinstance(msg, dict) else ""
-    except Exception as e:
-        logger.warning("LLM chunking failed: %s; falling back to deterministic.", e)
-        return chunk_spec(spec_path, repo_path)
+    # spec v29 Step 7: window oversized specs instead of silently truncating.
+    windows = _split_spec_for_llm(spec_content)
+    raw_chunks: list[dict] = []
+    seen_titles: set[str] = set()
+    for window_idx, window_text in enumerate(windows):
+        prompt = (
+            "You are a software architect. Given the following spec, decompose it into "
+            "independent, commit-sized units of work. Each chunk should:\n"
+            "- Touch a small number of files\n"
+            "- Be independently testable\n"
+            "- Have a clear title (under 72 chars)\n"
+            "- List which files it likely modifies\n"
+            "- Specify dependencies on other chunks (by index)\n\n"
+            "Respond ONLY with a JSON array. Each element must have:\n"
+            '  {"title": "...", "description": "...", "files": ["..."], '
+            '"depends_on_indices": [], "validation": "..."}\n\n'
+            f"## Spec (window {window_idx + 1}/{len(windows)})\n{window_text}\n"
+        )
 
-    # Parse the JSON response
-    try:
-        # Extract JSON array from the response (may be wrapped in markdown code block)
-        json_str = content.strip()
-        if json_str.startswith("```"):
-            # Strip markdown code fences
-            lines = json_str.splitlines()
-            lines = [ln for ln in lines if not ln.strip().startswith("```")]
-            json_str = "\n".join(lines)
-
-        raw_chunks = json.loads(json_str)
-        if not isinstance(raw_chunks, list):
-            raise ValueError("LLM response is not a JSON array")
-    except (json.JSONDecodeError, ValueError) as e:
-        logger.warning("LLM returned invalid JSON: %s; falling back.", e)
-        return chunk_spec(spec_path, repo_path)
+        messages = [
+            {
+                "role": "system",
+                "content": "You decompose specs into commit-sized work chunks. Respond only with valid JSON.",
+            },
+            {"role": "user", "content": prompt},
+        ]
+
+        try:
+            response = llm_client.chat_completion(messages, tools=[], role="planner")
+            content = ""
+            choices = response.get("choices") or []
+            if choices and isinstance(choices[0], dict):
+                msg = choices[0].get("message", {})
+                content = msg.get("content", "") if isinstance(msg, dict) else ""
+        except Exception as e:
+            logger.warning("LLM chunking failed on window %d: %s; falling back to deterministic.", window_idx + 1, e)
+            return chunk_spec(spec_path, repo_path)
+
+        try:
+            json_str = content.strip()
+            if json_str.startswith("```"):
+                lines = json_str.splitlines()
+                lines = [ln for ln in lines if not ln.strip().startswith("```")]
+                json_str = "\n".join(lines)
+
+            window_raw = json.loads(json_str)
+            if not isinstance(window_raw, list):
+                raise ValueError("LLM response is not a JSON array")
+        except (json.JSONDecodeError, ValueError) as e:
+            logger.warning("LLM returned invalid JSON on window %d: %s; falling back.", window_idx + 1, e)
+            return chunk_spec(spec_path, repo_path)
+
+        # Deduplicate on normalised title across windows: the overlap region
+        # invites the LLM to re-emit the same chunk in two adjacent windows.
+        for entry in window_raw:
+            if not isinstance(entry, dict):
+                continue
+            title_norm = str(entry.get("title", "")).strip().lower()
+            if title_norm and title_norm in seen_titles:
+                continue
+            seen_titles.add(title_norm)
+            raw_chunks.append(entry)
+
+    if len(raw_chunks) > _MAX_CHUNKS_PER_SPEC:
+        logger.warning(
+            "LLM produced %d chunks across %d windows; truncating to %d.",
+            len(raw_chunks),
+            len(windows),
+            _MAX_CHUNKS_PER_SPEC,
+        )
+        raw_chunks = raw_chunks[:_MAX_CHUNKS_PER_SPEC]
 
     # Validate and convert to WorkChunk objects
     chunks: list[WorkChunk] = []
@@ -291,9 +333,6 @@ def chunk_spec_with_llm(
             )
         )
 
-    if len(chunks) > _MAX_CHUNKS_PER_SPEC:
-        raise ValueError(f"LLM decomposed spec into {len(chunks)} chunks (max {_MAX_CHUNKS_PER_SPEC}).")
-
     # Validate no circular dependencies
     if _has_circular_deps(chunks):
         logger.warning("LLM produced circular dependencies; falling back to deterministic.")
@@ -307,6 +346,143 @@ def chunk_spec_with_llm(
     return chunks
 
 
+def _estimate_chunk_tokens(chunk: WorkChunk, repo: pathlib.Path) -> int:
+    """Estimate the prompt token cost of dispatching ``chunk`` to an engine.
+
+    Sums the title, description, a fixed prompt-template overhead, and the
+    sizes of every existing file in ``chunk.estimated_files``. Divides the
+    total character count by ``_TOKEN_CHARS_PER_TOKEN``.
+    """
+    chars = len(chunk.title) + len(chunk.description)
+    chars += 2_000  # rough prompt-template / system-message overhead
+    for fname in chunk.estimated_files or []:
+        # Reject path traversal up-front so this helper is safe even when
+        # called on untrusted chunks (the chunker validates LLM-supplied
+        # paths but the deterministic chunker uses regex hints).
+        if ".." in fname or fname.startswith("/"):
+            continue
+        try:
+            path = repo / fname
+            if path.is_file():
+                chars += path.stat().st_size
+        except (OSError, ValueError):
+            continue
+    return max(0, chars // _TOKEN_CHARS_PER_TOKEN)
+
+
+def _resolve_token_budget(engines: list[str] | None) -> int:
+    """Return the smallest budget across the active engines (worst-case)."""
+    if not engines:
+        return _MAX_CHUNK_TOKENS_PER_ENGINE["default"]
+    budgets = [_MAX_CHUNK_TOKENS_PER_ENGINE.get(name, _MAX_CHUNK_TOKENS_PER_ENGINE["default"]) for name in engines]
+    return min(budgets) if budgets else _MAX_CHUNK_TOKENS_PER_ENGINE["default"]
+
+
+def _split_chunk_in_half(chunk: WorkChunk, suffix: str) -> tuple[WorkChunk, WorkChunk]:
+    """Split ``chunk`` by halving its file list. The second half depends on the first."""
+    files = list(chunk.estimated_files or [])
+    mid = max(1, len(files) // 2)
+    head_files = files[:mid]
+    tail_files = files[mid:]
+    head = WorkChunk(
+        id=chunk.id,
+        spec_path=chunk.spec_path,
+        title=chunk.title,
+        description=chunk.description,
+        depends_on=list(chunk.depends_on or []),
+        estimated_files=head_files,
+        validation=chunk.validation,
+    )
+    tail = WorkChunk(
+        id=f"{chunk.id}_{suffix}",
+        spec_path=chunk.spec_path,
+        title=chunk.title,
+        description=chunk.description,
+        depends_on=[head.id],
+        estimated_files=tail_files,
+        validation=chunk.validation,
+    )
+    return head, tail
+
+
+def enforce_token_budget(
+    chunks: list[WorkChunk],
+    repo: pathlib.Path,
+    *,
+    engines: list[str] | None = None,
+) -> list[WorkChunk]:
+    """Recursively split chunks that exceed the active token budget (spec v30 Step 6).
+
+    The recursion is capped at ``_MAX_CHUNK_SPLIT_DEPTH`` levels (i.e. up to 8
+    sub-chunks per original chunk). When a chunk still exceeds the budget at
+    that depth a WARNING is logged and the chunk is dispatched anyway —
+    failing fast at the engine boundary is preferable to dropping work.
+    """
+    budget = _resolve_token_budget(engines)
+    out: list[WorkChunk] = []
+    # Each entry: (chunk, depth, suffix_seed). suffix_seed cycles ``b → c → ...``.
+    stack: list[tuple[WorkChunk, int, int]] = [(c, 0, 0) for c in chunks]
+    suffix_alphabet = "bcdefghij"
+    while stack:
+        chunk, depth, seed = stack.pop(0)
+        tokens = _estimate_chunk_tokens(chunk, repo)
+        if tokens <= budget:
+            out.append(chunk)
+            continue
+        if depth >= _MAX_CHUNK_SPLIT_DEPTH or len(chunk.estimated_files or []) <= 1:
+            logger.warning(
+                "Chunk %s still %d tokens after %d splits — dispatch anyway, may fail.",
+                chunk.id,
+                tokens,
+                depth,
+            )
+            out.append(chunk)
+            continue
+        suffix = suffix_alphabet[min(seed, len(suffix_alphabet) - 1)]
+        head, tail = _split_chunk_in_half(chunk, suffix)
+        # Push back onto the front so dependent ordering is preserved.
+        stack.insert(0, (head, depth + 1, seed + 1))
+        stack.insert(1, (tail, depth + 1, seed + 1))
+    return out
+
+
+def _split_spec_for_llm(spec_content: str) -> list[str]:
+    """Split a spec body into overlapping windows for ``chunk_spec_with_llm``.
+
+    Returns ``[spec_content]`` when the body is short enough to fit in a
+    single LLM pass. Otherwise produces sliding windows of length
+    ``_LLM_WINDOW_SIZE`` with ``_LLM_WINDOW_OVERLAP`` characters of overlap,
+    capped at ``_LLM_MAX_WINDOWS``.
+    """
+    if len(spec_content) <= _LLM_WINDOW_SIZE:
+        return [spec_content]
+
+    step = max(_LLM_WINDOW_SIZE - _LLM_WINDOW_OVERLAP, 1)
+    windows: list[str] = []
+    start = 0
+    while start < len(spec_content) and len(windows) < _LLM_MAX_WINDOWS:
+        windows.append(spec_content[start : start + _LLM_WINDOW_SIZE])
+        if start + _LLM_WINDOW_SIZE >= len(spec_content):
+            break
+        start += step
+
+    if len(windows) >= _LLM_MAX_WINDOWS and (windows[-1] is not None) and start + _LLM_WINDOW_SIZE < len(spec_content):
+        logger.warning(
+            "Spec exceeded %d windows of %d chars; truncating remaining content.",
+            _LLM_MAX_WINDOWS,
+            _LLM_WINDOW_SIZE,
+        )
+
+    logger.warning(
+        "Spec too large for single LLM pass: %d chars, splitting into %d window(s) of ~%d chars with %d-char overlap.",
+        len(spec_content),
+        len(windows),
+        _LLM_WINDOW_SIZE,
+        _LLM_WINDOW_OVERLAP,
+    )
+    return windows
+
+
 def _has_circular_deps(chunks: list[WorkChunk]) -> bool:
     """Check for circular dependencies in a list of chunks using DFS."""
     ids = {c.id for c in chunks}
diff --git a/src/codelicious/cli.py b/src/codelicious/cli.py
index 3b268001..0b97e75a 100644
--- a/src/codelicious/cli.py
+++ b/src/codelicious/cli.py
@@ -1,3 +1,5 @@
+import atexit
+import contextlib
 import dataclasses
 import logging
 import os
@@ -29,6 +31,242 @@ class PreFlightResult:
     skipped: bool  # True when --skip-auth-check was used
 
 
+_DEFAULT_PR_COMMIT_CAP = 8
+_DEFAULT_PR_LOC_CAP = 250
+
+
+def _write_postmortem(
+    repo_path: Path,
+    *,
+    abort_reason: str,
+    log_path: Path | None = None,
+    spec_arg: str = "",
+) -> Path | None:
+    """Write a one-page postmortem under ``.codelicious/`` (spec v30 Step 12).
+
+    Aggregates per-chunk status from any ledger files written during the run
+    (spec v30 Step 2) and the tail of the active log file. Returns the path
+    to the written file, or ``None`` if writing failed. Never raises.
+    """
+    import datetime as _dt
+    import json as _json
+
+    ts = _dt.datetime.now(_dt.timezone.utc).strftime("%Y%m%dT%H%M%SZ")
+    out_dir = repo_path / ".codelicious"
+    try:
+        out_dir.mkdir(parents=True, exist_ok=True)
+    except OSError:
+        return None
+    out_path = out_dir / f"postmortem-{ts}.md"
+
+    counts = {"merged": 0, "failed": 0, "in_progress": 0, "other": 0}
+    failed_titles: list[str] = []
+    state_dir = out_dir / "state"
+    if state_dir.is_dir():
+        for ledger_file in sorted(state_dir.glob("*.json")):
+            try:
+                data = _json.loads(ledger_file.read_text(encoding="utf-8"))
+            except (OSError, ValueError):
+                continue
+            for entry in (data.get("chunks") or {}).values():
+                status = entry.get("status", "other")
+                counts[status] = counts.get(status, 0) + 1
+                if status == "failed":
+                    failed_titles.append(str(entry.get("title", "?")))
+
+    log_tail = ""
+    if log_path and log_path.is_file():
+        try:
+            lines = log_path.read_text(encoding="utf-8", errors="replace").splitlines()
+            log_tail = "\n".join(lines[-50:])
+        except OSError:
+            log_tail = ""
+
+    resume_cmd = f"codelicious build {spec_arg or '.'}"
+    if abort_reason in ("rate-limit", "exception"):
+        # A re-run on the same ledger is the right move; don't suggest --reset-ledger.
+        pass
+    elif abort_reason == "ledger-corrupt":
+        resume_cmd += " --reset-ledger"
+
+    body = [
+        f"# Codelicious postmortem ({ts})",
+        "",
+        f"**Abort reason:** {abort_reason}",
+        "",
+        "## Chunk status",
+        f"- merged: {counts['merged']}",
+        f"- failed: {counts['failed']}",
+        f"- in_progress: {counts['in_progress']}",
+        f"- other: {counts['other']}",
+        "",
+    ]
+    if failed_titles:
+        body.append("### Failed chunks")
+        for title in failed_titles[:25]:
+            body.append(f"- {title}")
+        body.append("")
+    if log_tail:
+        body.append("## Log tail (last 50 lines)")
+        body.append("```")
+        body.append(log_tail)
+        body.append("```")
+        body.append("")
+    body.append("## Resume")
+    body.append(f"```\n{resume_cmd}\n```")
+
+    try:
+        out_path.write_text("\n".join(body) + "\n", encoding="utf-8")
+    except OSError:
+        return None
+    return out_path
+
+
+def _atomic_symlink_update(link: Path, target_name: str) -> None:
+    """Atomically point ``link`` at ``target_name`` (spec v30 Step 7).
+
+    Creates ``link.<pid>.tmp`` as a symlink to ``target_name``, then uses
+    ``os.replace`` to swap it into place — atomic on POSIX and on NTFS for
+    symlinks since Windows 10. On platforms that disallow symlink creation
+    (e.g. Windows without developer mode), falls back to writing a plain
+    file ``<link>.txt`` containing ``target_name`` and logs an INFO once.
+    """
+    tmp = link.with_suffix(link.suffix + f".{os.getpid()}.tmp")
+    try:
+        # If a stale tmp from a prior crashed run exists, remove it.
+        if tmp.exists() or tmp.is_symlink():
+            try:
+                tmp.unlink()
+            except OSError:
+                pass
+        os.symlink(target_name, tmp)
+        os.replace(tmp, link)
+    except OSError:
+        # Symlink creation failed (Windows non-developer mode, or read-only
+        # parent). Fall back to a sibling text pointer; never raise.
+        try:
+            (link.with_suffix(link.suffix + ".txt")).write_text(target_name, encoding="utf-8")
+            logging.getLogger("codelicious").info(
+                "latest-log symlink unsupported on this platform; wrote %s.txt instead.", link
+            )
+        except OSError:
+            pass
+
+
+@contextlib.contextmanager
+def _run_lock(repo_root: Path):
+    """Acquire a per-repo advisory lock to prevent concurrent runs (spec v30 Step 1).
+
+    On POSIX uses ``fcntl.flock`` with ``LOCK_EX | LOCK_NB``; on platforms
+    without ``fcntl`` (Windows) logs a single warning and proceeds without
+    locking. Writes the holder PID into the lockfile, cleans up on normal
+    exit *and* SystemExit (atexit handler).
+    """
+    lock_dir = repo_root / ".codelicious"
+    lock_dir.mkdir(parents=True, exist_ok=True)
+    lock_path = lock_dir / "run.lock"
+
+    try:
+        import fcntl
+    except ImportError:
+        logging.getLogger("codelicious").warning("concurrent-run protection unavailable on this platform")
+        yield None
+        return
+
+    fd = os.open(str(lock_path), os.O_CREAT | os.O_RDWR, 0o600)
+    try:
+        try:
+            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
+        except OSError:
+            # Read existing PID for the diagnostic.
+            try:
+                with open(lock_path, encoding="utf-8") as fh:
+                    other_pid = fh.read().strip() or "?"
+            except OSError:
+                other_pid = "?"
+            print(
+                f"Error: another codelicious run is in progress (pid {other_pid}). "
+                f"If you believe this is stale, delete {lock_path}.",
+                file=sys.stderr,
+            )
+            os.close(fd)
+            sys.exit(75)
+
+        # Write our PID and ensure cleanup on every exit path.
+        os.ftruncate(fd, 0)
+        os.write(fd, f"{os.getpid()}\n".encode())
+        os.fsync(fd)
+
+        def _release() -> None:
+            try:
+                fcntl.flock(fd, fcntl.LOCK_UN)
+            except OSError:
+                pass
+            try:
+                os.close(fd)
+            except OSError:
+                pass
+            try:
+                lock_path.unlink()
+            except OSError:
+                pass
+
+        atexit.register(_release)
+        try:
+            yield lock_path
+        finally:
+            _release()
+    except SystemExit:
+        raise
+
+
+def _validate_endpoint_url_strict(url: str, *, source: str = "LLM_ENDPOINT") -> None:
+    """Reject non-HTTPS endpoints and credentials-in-URL at the CLI layer (spec v30 Step 3).
+
+    Raises ``SystemExit(2)`` with a clear, sanitized error before any banner
+    or downstream call site sees the value. Empty input is accepted (the
+    feature is disabled).
+    """
+    if not url:
+        return
+
+    import urllib.parse
+
+    try:
+        parts = urllib.parse.urlsplit(url)
+    except Exception:
+        print(f"Error: --endpoint / {source} is not a parseable URL.", file=sys.stderr)
+        sys.exit(2)
+
+    # Sanitize for display: mask any user-info before echoing.
+    sanitized = url
+    if "@" in (parts.netloc or ""):
+        userinfo, _, hostpart = parts.netloc.partition("@")
+        if userinfo:
+            sanitized_netloc = f"***@{hostpart}"
+            sanitized = urllib.parse.urlunsplit(
+                (parts.scheme, sanitized_netloc, parts.path, parts.query, parts.fragment)
+            )
+
+    if parts.scheme.lower() != "https":
+        print(
+            f"Error: {source} must be an https:// URL with no embedded credentials, got: {sanitized}",
+            file=sys.stderr,
+        )
+        sys.exit(2)
+
+    if not parts.hostname:
+        print(f"Error: {source} has no hostname: {sanitized}", file=sys.stderr)
+        sys.exit(2)
+
+    if parts.username or parts.password:
+        print(
+            f"Error: {source} must be an https:// URL with no embedded credentials, got: {sanitized}",
+            file=sys.stderr,
+        )
+        sys.exit(2)
+
+
 def _handle_sigterm(signum: int, frame: object) -> None:
     """Handle SIGTERM for graceful shutdown in container/orchestrator environments."""
     logging.getLogger("codelicious").warning("Received SIGTERM (signal %d), shutting down gracefully", signum)
@@ -69,10 +307,11 @@ def _probe_git_credentials(repo_path: Path) -> dict:
       * ``gpg --list-secret-keys --with-colons`` → whether GPG agent has a usable key
 
     Returns a dict with keys: ``transport`` (``"ssh"|"https"|"unknown"``),
-    ``gpg_signing`` (bool), ``ssh_key_loaded`` (bool), ``gpg_agent_warm`` (bool).
-    On any failure the conservative value is used so the orchestrator
-    surfaces an interactive prompt rather than letting an autonomous
-    cycle hang on an invisible passphrase prompt.
+    ``gpg_signing`` (bool), ``ssh_key_loaded`` (bool), ``gpg_agent_warm`` (bool),
+    and ``ssh_agent_state`` (one of ``"keys_loaded" | "empty" | "no_agent" |
+    "unknown"``). On any failure the conservative value is used so the
+    orchestrator surfaces an interactive prompt rather than letting an
+    autonomous cycle hang on an invisible passphrase prompt.
     """
     _T = 15
     info: dict = {
@@ -80,6 +319,7 @@ def _probe_git_credentials(repo_path: Path) -> dict:
         "gpg_signing": False,
         "ssh_key_loaded": False,
         "gpg_agent_warm": False,
+        "ssh_agent_state": "unknown",
     }
 
     # ── Remote URL → transport ────────────────────────────────────────
@@ -120,13 +360,28 @@ def _probe_git_credentials(repo_path: Path) -> dict:
                 text=True,
                 timeout=_T,
             )
-            # ssh-add -l: rc 0 → keys loaded; rc 1 → no keys; rc 2 → no agent
-            info["ssh_key_loaded"] = ssh_result.returncode == 0 and bool(ssh_result.stdout.strip())
+            # ssh-add -l: rc 0 → keys loaded; rc 1 → agent running, no keys;
+            # rc 2 → no agent reachable. Any other rc is treated as "unknown"
+            # and falls into the conservative not-loaded branch.
+            if ssh_result.returncode == 0 and ssh_result.stdout.strip():
+                info["ssh_agent_state"] = "keys_loaded"
+                info["ssh_key_loaded"] = True
+            elif ssh_result.returncode == 1:
+                info["ssh_agent_state"] = "empty"
+                info["ssh_key_loaded"] = False
+            elif ssh_result.returncode == 2:
+                info["ssh_agent_state"] = "no_agent"
+                info["ssh_key_loaded"] = False
+            else:
+                info["ssh_agent_state"] = "unknown"
+                info["ssh_key_loaded"] = False
         except (subprocess.TimeoutExpired, OSError):
+            info["ssh_agent_state"] = "unknown"
             info["ssh_key_loaded"] = False
     else:
         # Not an SSH push; key state is irrelevant — treat as "ready"
         info["ssh_key_loaded"] = True
+        info["ssh_agent_state"] = "keys_loaded"
 
     # ── GPG agent probe (only meaningful when signing) ────────────────
     if info["gpg_signing"]:
@@ -181,9 +436,17 @@ def _ensure_git_credentials_unlocked(
 
     # ── SSH key prompt ───────────────────────────────────────────────
     if info["transport"] == "ssh" and not info["ssh_key_loaded"]:
-        print("\n  SSH push transport detected, but no key is loaded in the agent.")
-        print("  Codelicious will run `ssh-add` so you can unlock your key now —")
-        print("  otherwise the autonomous loop would block on a passphrase prompt.\n")
+        agent_state = info.get("ssh_agent_state", "unknown")
+        if agent_state == "no_agent":
+            print("\n  SSH push transport detected, but no ssh-agent is running.")
+            print("  Start one with `eval $(ssh-agent)` and then `ssh-add` your key,")
+            print("  otherwise the autonomous loop would block on a passphrase prompt.\n")
+        else:
+            # "empty" (agent running, no keys) or "unknown" (probe failure):
+            # running ssh-add directly is the right move in both cases.
+            print("\n  SSH push transport detected, but no key is loaded in the agent.")
+            print("  Codelicious will run `ssh-add` so you can unlock your key now —")
+            print("  otherwise the autonomous loop would block on a passphrase prompt.\n")
         try:
             subprocess.run(["ssh-add"], timeout=300)
         except (subprocess.TimeoutExpired, OSError) as e:
@@ -411,13 +674,11 @@ def _attach_file_log_handler(repo_path: Path) -> None:
         fh.addFilter(SanitizingFilter())
         logging.getLogger().addHandler(fh)
         logging.getLogger().setLevel(logging.DEBUG)
+        # spec v30 Step 7: update ``latest.log`` atomically via tmp + rename
+        # so a fast restart (or, pre-Step 1, a concurrent run) cannot leave the
+        # symlink dangling between unlink and re-create.
         latest = log_dir / "latest.log"
-        try:
-            if latest.is_symlink() or latest.exists():
-                latest.unlink()
-            latest.symlink_to(log_path.name)
-        except OSError:
-            pass
+        _atomic_symlink_update(latest, log_path.name)
         logging.getLogger("codelicious").info("Logging build to %s", log_path)
     except OSError as e:
         logging.getLogger("codelicious").warning("Could not attach file logger: %s", e)
@@ -609,12 +870,15 @@ def _parse_args(argv: list[str]) -> dict:
         "skip_auth_check": False,
         "dry_run": False,
         "spec": "",
-        "max_commits_per_pr": 8,
-        "max_loc_per_pr": 250,
+        "max_commits_per_pr": _DEFAULT_PR_COMMIT_CAP,
+        "max_loc_per_pr": _DEFAULT_PR_LOC_CAP,
         "platform": "auto",
         "continuous": False,
         "cycle_sleep_s": 60,
         "skip_credential_probe": False,
+        "no_resume": False,
+        "reset_ledger": False,
+        "min_coverage": -1.0,
     }
 
     # Flags that take a value
@@ -629,6 +893,7 @@ def _parse_args(argv: list[str]) -> dict:
         "--max-loc-per-pr": "max_loc_per_pr",
         "--platform": "platform",
         "--cycle-sleep-s": "cycle_sleep_s",
+        "--min-coverage": "min_coverage",
     }
 
     # Integer-valued flags that need int() conversion
@@ -646,6 +911,9 @@ def _parse_args(argv: list[str]) -> dict:
         "--skip-credential-probe": "skip_credential_probe",
         "--dry-run": "dry_run",
         "--continuous": "continuous",
+        # spec v30 Step 2: idempotent-resume controls.
+        "--no-resume": "no_resume",
+        "--reset-ledger": "reset_ledger",
     }
 
     i = 0
@@ -693,6 +961,12 @@ def _parse_args(argv: list[str]) -> dict:
                 except ValueError:
                     print(f"Error: {args[i]} requires an integer, got '{value}'")
                     sys.exit(2)
+            elif key == "min_coverage":
+                try:
+                    value = float(value)
+                except ValueError:
+                    print(f"Error: {args[i]} requires a number, got '{value}'")
+                    sys.exit(2)
             opts[key] = value
             i += 2
         elif not args[i].startswith("-") and not opts["repo_path"]:
@@ -742,11 +1016,23 @@ def main():
 
     opts = _parse_args(sys.argv)
 
+    # spec v30 Step 3: reject insecure / credential-bearing LLM endpoints
+    # before any downstream code (or the banner) sees the value. Validates
+    # the env var the LLM client ultimately reads.
+    _validate_endpoint_url_strict(os.environ.get("LLM_ENDPOINT", ""), source="LLM_ENDPOINT")
+    if opts.get("endpoint"):
+        _validate_endpoint_url_strict(opts["endpoint"], source="--endpoint")
+
     repo_path = Path(opts["repo_path"]).resolve()
     if not repo_path.is_dir():
         logger.error("Repository path %s does not exist or is not a directory.", repo_path)
         sys.exit(1)
 
+    # spec v30 Step 1: per-repo advisory lock — second concurrent invocation
+    # exits 75 (EX_TEMPFAIL) before any git, sandbox, or LLM call happens.
+    _run_lock_cm = _run_lock(repo_path)
+    _run_lock_cm.__enter__()
+
     _attach_file_log_handler(repo_path)
 
     logger.info("Starting Codelicious workflow in %s", repo_path)
@@ -780,6 +1066,19 @@ def main():
         logger.error(str(e))
         sys.exit(1)
 
+    # spec v30 Step 5: when the user did NOT force a specific engine and both
+    # Claude and HuggingFace credentials are configured, build a fallback list
+    # so a Claude rate-limit fails over to HF for the remainder of the spec.
+    requested_engine = opts.get("engine") or os.environ.get("CODELICIOUS_ENGINE", "auto")
+    fallback_engines: list = [engine]
+    if requested_engine in ("auto", ""):
+        hf_creds = bool(os.environ.get("HF_TOKEN") or os.environ.get("LLM_API_KEY"))
+        if engine.name == "claude" and hf_creds:
+            try:
+                fallback_engines.append(select_engine("huggingface"))
+            except RuntimeError:
+                pass
+
     # 2. Initialize Git Orchestration
     git_manager = GitManager(repo_path)
     git_manager.verify_git_identity()  # spec-27 Phase 0.2
@@ -837,8 +1136,8 @@ def main():
             except OSError:
                 pass
         print()
-        print(f"[codelicious] Max commits per PR: {opts.get('max_commits_per_pr', 8)}")
-        print(f"[codelicious] Max LOC per PR: {opts.get('max_loc_per_pr', 250)}")
+        print(f"[codelicious] Max commits per PR: {opts.get('max_commits_per_pr', _DEFAULT_PR_COMMIT_CAP)}")
+        print(f"[codelicious] Max LOC per PR: {opts.get('max_loc_per_pr', _DEFAULT_PR_LOC_CAP)}")
         print(f"[codelicious] Platform: {opts.get('platform', 'auto')}")
         print()
         sys.exit(0)
@@ -849,14 +1148,28 @@ def main():
     # 5. Run the v2 chunk-based orchestration loop (spec-27 Phase 4.1)
     from codelicious.orchestrator import V2Orchestrator
 
+    # spec v30 Step 2: --reset-ledger wipes any persisted chunk state before
+    # the orchestrator runs so the next build re-executes everything.
+    if opts.get("reset_ledger"):
+        ledger_dir = repo_path / ".codelicious" / "state"
+        if ledger_dir.is_dir():
+            for ledger_file in ledger_dir.glob("*.json"):
+                try:
+                    ledger_file.unlink()
+                except OSError:
+                    pass
+            logger.info("Reset chunk-status ledger directory: %s", ledger_dir)
+
     v2_orch = V2Orchestrator(
         repo_path=repo_path,
         git_manager=git_manager,
         engine=engine,
-        max_commits_per_pr=opts.get("max_commits_per_pr", 8),
-        max_loc_per_pr=opts.get("max_loc_per_pr", 250),
+        engines=fallback_engines if len(fallback_engines) > 1 else None,
+        max_commits_per_pr=opts.get("max_commits_per_pr", _DEFAULT_PR_COMMIT_CAP),
+        max_loc_per_pr=opts.get("max_loc_per_pr", _DEFAULT_PR_LOC_CAP),
         model=opts.get("model", ""),
         progress_callback=_print_progress,
+        no_resume=opts.get("no_resume", False),
     )
 
     continuous = opts.get("continuous", False)
diff --git a/src/codelicious/engines/__init__.py b/src/codelicious/engines/__init__.py
index 2b4350c2..84fce40d 100644
--- a/src/codelicious/engines/__init__.py
+++ b/src/codelicious/engines/__init__.py
@@ -6,11 +6,11 @@
 import os
 import shutil
 
-from codelicious.engines.base import BuildEngine, BuildResult, ChunkResult, EngineContext
+from codelicious.engines.base import BuildEngine, ChunkResult, EngineContext
 
 logger = logging.getLogger("codelicious.engines")
 
-__all__ = ["BuildEngine", "BuildResult", "ChunkResult", "EngineContext", "select_engine"]
+__all__ = ["BuildEngine", "ChunkResult", "EngineContext", "select_engine"]
 
 
 def select_engine(engine_preference: str = "auto") -> BuildEngine:
diff --git a/src/codelicious/engines/base.py b/src/codelicious/engines/base.py
index 03b1ac69..a461170c 100644
--- a/src/codelicious/engines/base.py
+++ b/src/codelicious/engines/base.py
@@ -1,8 +1,10 @@
 """Abstract base class and shared dataclasses for codelicious build engines.
 
-Spec-27 Phase 3.1 adds the chunk-level interface (``execute_chunk``,
-``verify_chunk``, ``fix_chunk``) alongside the legacy ``run_build_cycle``
-so both old and new orchestration paths work during migration.
+Spec-27 Phase 3.1 defines the chunk-level interface:
+
+* ``execute_chunk`` — implement one work chunk
+* ``verify_chunk``  — lint/test/security check a completed chunk
+* ``fix_chunk``     — attempt to fix verification failures
 """
 
 from __future__ import annotations
@@ -11,21 +13,6 @@
 import dataclasses
 import pathlib
 
-# ---------------------------------------------------------------------------
-# Legacy result (kept for backward compatibility during migration)
-# ---------------------------------------------------------------------------
-
-
-@dataclasses.dataclass
-class BuildResult:
-    """Result from a legacy full-cycle build engine run."""
-
-    success: bool
-    message: str = ""
-    session_id: str = ""
-    elapsed_s: float = 0.0
-
-
 # ---------------------------------------------------------------------------
 # spec-27 Phase 3.1: Chunk-level dataclasses
 # ---------------------------------------------------------------------------
@@ -69,12 +56,10 @@ class BuildEngine(abc.ABC):
     """Abstract base for all codelicious build engines.
 
     **Chunk-level interface** (spec-27 Phase 3.1):
-    - ``execute_chunk`` — implement one work chunk
-    - ``verify_chunk``  — lint/test/security check a completed chunk
-    - ``fix_chunk``     — attempt to fix verification failures
 
-    **Legacy interface** (kept for migration):
-    - ``run_build_cycle`` — run the full build lifecycle
+    * ``execute_chunk`` — implement one work chunk
+    * ``verify_chunk``  — lint/test/security check a completed chunk
+    * ``fix_chunk``     — attempt to fix verification failures
     """
 
     @property
@@ -126,37 +111,3 @@ def fix_chunk(
         ``verify_chunk`` call.  The engine should try to resolve them
         and return the updated file list.
         """
-
-    # ------------------------------------------------------------------
-    # Legacy interface (kept for backward compatibility)
-    # ------------------------------------------------------------------
-
-    @abc.abstractmethod
-    def run_build_cycle(
-        self,
-        repo_path: pathlib.Path,
-        git_manager: object,
-        cache_manager: object,
-        spec_filter: str | None = None,
-        **kwargs,
-    ) -> BuildResult:
-        """Run the full build cycle (legacy interface).
-
-        Parameters
-        ----------
-        repo_path:
-            Path to the target repository.
-        git_manager:
-            GitManager instance for branch/commit/PR operations.
-        cache_manager:
-            CacheManager instance for state persistence.
-        spec_filter:
-            Optional: path to a specific spec to build.
-        **kwargs:
-            Engine-specific configuration (model, timeout, etc.)
-
-        Returns
-        -------
-        BuildResult
-            Outcome of the build cycle.
-        """
diff --git a/src/codelicious/engines/claude_engine.py b/src/codelicious/engines/claude_engine.py
index eb9077c0..f2068f12 100644
--- a/src/codelicious/engines/claude_engine.py
+++ b/src/codelicious/engines/claude_engine.py
@@ -1,9 +1,8 @@
 """Claude Code CLI build engine (spec-27 Phase 3.2).
 
 Delegates to the ``claude`` binary in headless mode for chunk execution.
-The v2 orchestrator (``V2Orchestrator``) drives the chunk loop — this
-engine only implements ``execute_chunk``, ``verify_chunk``, ``fix_chunk``,
-and a ``run_build_cycle`` that delegates to ``V2Orchestrator``.
+The orchestrator drives the chunk loop — this engine implements
+``execute_chunk``, ``verify_chunk``, and ``fix_chunk``.
 """
 
 from __future__ import annotations
@@ -13,10 +12,25 @@
 import sys
 import time
 
-from codelicious.engines.base import BuildEngine, BuildResult, ChunkResult, EngineContext
+from codelicious.engines.base import BuildEngine, ChunkResult, EngineContext
 
 logger = logging.getLogger("codelicious.engines.claude")
 
+# spec v29 Step 9 / spec 27 §3.2: the default Claude CLI ``--allowedTools``
+# allow-list applied to every chunk. Surfacing this list at the chunk site
+# (rather than burying it in ``agent_runner``) lets future engines tighten
+# the surface per chunk without forking the runner.
+_DEFAULT_ALLOWED_TOOLS: tuple[str, ...] = (
+    "Edit",
+    "Write",
+    "Bash(git status:*)",
+    "Bash(pytest:*)",
+    "Bash(ruff:*)",
+    "Read",
+    "Glob",
+    "Grep",
+)
+
 
 class ClaudeCodeEngine(BuildEngine):
     """Build engine that uses the Claude Code CLI as its backend."""
@@ -79,6 +93,11 @@ class _ChunkConfig:
         config.max_turns = 50
         config.agent_timeout_s = max(int(context.deadline - time.monotonic()), 60) if context.deadline else 1800
         config.dry_run = False
+        # spec v29 Step 9: surface the spec 27 §3.2 Claude CLI flags at the
+        # chunk call-site so per-chunk callers can tighten the tool allow-list
+        # or switch output formats without editing agent_runner.
+        config.output_format = "stream-json"
+        config.allowed_tools = list(_DEFAULT_ALLOWED_TOOLS)
 
         try:
             result = run_agent(
@@ -162,9 +181,11 @@ def verify_chunk(
         chunk_id = getattr(chunk, "id", "unknown")
 
         try:
-            from codelicious.verifier import verify
+            from codelicious.verifier import verify, verify_paths
 
-            vresult = verify(repo_path)
+            # spec v29 Step 6: scope checks to the chunk's files when known.
+            scoped = list(getattr(chunk, "estimated_files", []) or [])
+            vresult = verify_paths(repo_path, scoped) if scoped else verify(repo_path)
             if vresult.all_passed:
                 logger.info("Verification passed for chunk %s.", chunk_id)
                 return ChunkResult(success=True, message="All checks passed.")
@@ -252,48 +273,3 @@ class _FixConfig:
             message=f"Fix attempt for chunk {chunk_id}",
             retries_used=1,
         )
-
-    # ------------------------------------------------------------------
-    # Legacy interface — delegates to V2Orchestrator
-    # ------------------------------------------------------------------
-
-    def run_build_cycle(
-        self,
-        repo_path: pathlib.Path,
-        git_manager: object,
-        cache_manager: object,
-        spec_filter: str | None = None,
-        **kwargs,
-    ) -> BuildResult:
-        """Run the build lifecycle by delegating to V2Orchestrator.
-
-        This method exists for backward compatibility with the ``BuildEngine``
-        interface.  The ``cli.py`` main entry point now calls ``V2Orchestrator``
-        directly, so this path is only used if an external caller invokes the
-        engine directly.
-        """
-        from codelicious.orchestrator import V2Orchestrator
-        from codelicious.spec_discovery import discover_incomplete_specs
-
-        start = time.monotonic()
-        repo_path = pathlib.Path(repo_path).resolve()
-        agent_timeout_s = kwargs.get("agent_timeout_s", 1800)
-        push_pr = kwargs.get("push_pr", False)
-        max_commits_per_pr = kwargs.get("max_commits_per_pr", 50)
-
-        specs = discover_incomplete_specs(repo_path)
-        if not specs:
-            return BuildResult(success=True, message="No incomplete specs found.", elapsed_s=time.monotonic() - start)
-
-        orch = V2Orchestrator(repo_path, git_manager, self, max_commits_per_pr=max_commits_per_pr)
-        result = orch.run(
-            specs=specs,
-            deadline=start + agent_timeout_s,
-            push_pr=push_pr,
-        )
-
-        return BuildResult(
-            success=result.success,
-            message=result.message,
-            elapsed_s=result.elapsed_s,
-        )
diff --git a/src/codelicious/engines/huggingface_engine.py b/src/codelicious/engines/huggingface_engine.py
index cae22b4a..eaa9f4c4 100644
--- a/src/codelicious/engines/huggingface_engine.py
+++ b/src/codelicious/engines/huggingface_engine.py
@@ -13,12 +13,17 @@
 import random
 import time
 
-from codelicious.engines.base import BuildEngine, BuildResult, ChunkResult, EngineContext
+from codelicious.engines.base import BuildEngine, ChunkResult, EngineContext
 from codelicious.errors import LLMRateLimitError
 from codelicious.loop_controller import MAX_HISTORY_TOKENS, MAX_TOOL_RESULT_BYTES, truncate_history
 
 logger = logging.getLogger("codelicious.engines.huggingface")
 
+# spec v29 Step 11: cap on the number of post-reflection fix-cycle attempts
+# the HF engine will make before giving up. Prevents infinite loops when the
+# model cannot satisfy verification.
+_HF_MAX_FIX_ATTEMPTS: int = 2
+
 
 def _is_transient(exc: Exception) -> bool:
     """Classify an exception as transient (retryable) vs fatal."""
@@ -251,6 +256,33 @@ def execute_chunk(
         except Exception:
             files = []
 
+        # spec v29 Step 11: gating verification before signalling success.
+        # The HF model can emit CHUNK_COMPLETE without actually satisfying the
+        # spec; run our own verifier and, if it fails, give the model up to two
+        # fix-cycle attempts before giving up.
+        retries_used = 0
+        if completed:
+            verification = self.verify_chunk(chunk, repo_path)
+            while not verification.success and retries_used < _HF_MAX_FIX_ATTEMPTS:
+                retries_used += 1
+                logger.info(
+                    "Chunk %s: post-reflection verification failed (attempt %d/%d); invoking fix_chunk.",
+                    chunk_id,
+                    retries_used,
+                    _HF_MAX_FIX_ATTEMPTS,
+                )
+                self.fix_chunk(chunk, repo_path, [verification.message or "verification failed"])
+                verification = self.verify_chunk(chunk, repo_path)
+            if not verification.success:
+                return ChunkResult(
+                    success=False,
+                    files_modified=files,
+                    message=(
+                        f"HF engine could not satisfy verification after "
+                        f"{_HF_MAX_FIX_ATTEMPTS} fix attempts: {verification.message}"
+                    ),
+                )
+
         return ChunkResult(
             success=completed,
             files_modified=files,
@@ -264,9 +296,11 @@ def verify_chunk(
     ) -> ChunkResult:
         """Run verification checks on the repo after a chunk."""
         try:
-            from codelicious.verifier import verify
+            from codelicious.verifier import verify, verify_paths
 
-            vresult = verify(repo_path)
+            # spec v29 Step 6: scope checks to the chunk's files when known.
+            scoped = list(getattr(chunk, "estimated_files", []) or [])
+            vresult = verify_paths(repo_path, scoped) if scoped else verify(repo_path)
             if vresult.all_passed:
                 return ChunkResult(success=True, message="All checks passed.")
 
@@ -315,48 +349,3 @@ class _FixChunk:
             message=result.message,
             retries_used=1,
         )
-
-    # ------------------------------------------------------------------
-    # Legacy interface — delegates to V2Orchestrator
-    # ------------------------------------------------------------------
-
-    def run_build_cycle(
-        self,
-        repo_path: pathlib.Path,
-        git_manager: object,
-        cache_manager: object,
-        spec_filter: str | None = None,
-        **kwargs,
-    ) -> BuildResult:
-        """Run the build lifecycle by delegating to V2Orchestrator.
-
-        This method exists for backward compatibility with the ``BuildEngine``
-        interface.  The ``cli.py`` main entry point now calls ``V2Orchestrator``
-        directly, so this path is only used if an external caller invokes the
-        engine directly.
-        """
-        from codelicious.orchestrator import V2Orchestrator
-        from codelicious.spec_discovery import discover_incomplete_specs
-
-        start = time.monotonic()
-        repo_path = pathlib.Path(repo_path).resolve()
-        agent_timeout_s = kwargs.get("agent_timeout_s", 1800)
-        push_pr = kwargs.get("push_pr", False)
-        max_commits_per_pr = kwargs.get("max_commits_per_pr", 50)
-
-        specs = discover_incomplete_specs(repo_path)
-        if not specs:
-            return BuildResult(success=True, message="No incomplete specs found.", elapsed_s=time.monotonic() - start)
-
-        orch = V2Orchestrator(repo_path, git_manager, self, max_commits_per_pr=max_commits_per_pr)
-        result = orch.run(
-            specs=specs,
-            deadline=start + agent_timeout_s,
-            push_pr=push_pr,
-        )
-
-        return BuildResult(
-            success=result.success,
-            message=result.message,
-            elapsed_s=result.elapsed_s,
-        )
diff --git a/src/codelicious/git/git_orchestrator.py b/src/codelicious/git/git_orchestrator.py
index f06fbffa..657b1984 100644
--- a/src/codelicious/git/git_orchestrator.py
+++ b/src/codelicious/git/git_orchestrator.py
@@ -643,6 +643,7 @@ def ensure_draft_pr_exists(
         part: int = 0,
         prev_pr_url: str = "",
         chunk_summaries: list[str] | None = None,
+        chunk_metadata: dict | None = None,
     ) -> int | None:
         """Ensure exactly one PR/MR exists for the current spec (spec-27 Phase 5).
 
@@ -707,7 +708,7 @@ def ensure_draft_pr_exists(
             title = f"{title} (part {part})"
         title = title.replace("\n", " ").replace("\r", " ").replace("\x00", "")[:70]
 
-        body = self._build_pr_body(spec_id, chunk_summaries, prev_pr_url)
+        body = self._build_pr_body(spec_id, chunk_summaries, prev_pr_url, chunk_metadata)
 
         if platform == "gitlab":
             return self._create_gitlab_mr(cli_tool, title, body, _TIMEOUT_S)
@@ -718,8 +719,16 @@ def _build_pr_body(
         spec_id: str,
         chunk_summaries: list[str] | None,
         prev_pr_url: str,
+        chunk_metadata: dict | None = None,
     ) -> str:
-        """Build the PR/MR body with spec link, chunk summary, and part links."""
+        """Build the PR/MR body with spec link, chunk summary, and part links.
+
+        spec v30 Step 9: when ``chunk_metadata`` is provided, render two extra
+        sections — ``## Chunk Context`` (spec source path, chunk id/title,
+        dependency graph slice) and ``## Verifier Summary`` (test count, lint
+        warnings, coverage Δ) — so a human reviewer has full context in one
+        glance. Missing fields render as ``n/a``.
+        """
         parts = [
             "## Summary\n",
             f"Autonomous implementation by Codelicious (spec-{spec_id}).\n",
@@ -729,6 +738,39 @@ def _build_pr_body(
             for cs in chunk_summaries[:50]:
                 parts.append(f"- {cs}")
             parts.append("")
+
+        if chunk_metadata is not None:
+            md = chunk_metadata
+            parts.append("## Chunk Context")
+            parts.append(f"- Spec source: `{md.get('spec_path') or 'n/a'}`")
+            parts.append(f"- Chunk id: `{md.get('chunk_id') or 'n/a'}`")
+            parts.append(f"- Chunk title: {md.get('chunk_title') or 'n/a'}")
+            depends_on = md.get("depends_on") or []
+            dependents = md.get("dependents") or []
+            parts.append(f"- Depends on: {', '.join(depends_on) if depends_on else 'n/a'}")
+            parts.append(f"- Dependents: {', '.join(dependents) if dependents else 'n/a'}")
+            parts.append("")
+
+            verifier = md.get("verifier") or {}
+            parts.append("## Verifier Summary")
+            tests = verifier.get("tests_passed", "n/a")
+            warnings = verifier.get("lint_warnings", "n/a")
+            coverage = verifier.get("coverage_pct", "n/a")
+            cov_delta = verifier.get("coverage_delta_pp", "n/a")
+            cov_str = f"{coverage}%" if coverage != "n/a" else "n/a"
+            delta_str = (
+                f" (Δ {'+' if isinstance(cov_delta, (int, float)) and cov_delta >= 0 else ''}{cov_delta} pp)"
+                if cov_delta != "n/a"
+                else ""
+            )
+            parts.append(f"tests passed: {tests} | lint warnings: {warnings} | coverage: {cov_str}{delta_str}")
+            parts.append("")
+
+            audit_log = md.get("audit_log_path")
+            parts.append("## Audit Log")
+            parts.append(f"- {audit_log if audit_log else 'n/a'}")
+            parts.append("")
+
         if prev_pr_url:
             parts.append(f"**Previous part:** {prev_pr_url}\n")
         parts.append("This PR updates automatically as new commits are pushed.\n")
@@ -1144,17 +1186,72 @@ def revert_chunk_changes(self) -> bool:
             logger.error("Failed to revert chunk changes: %s", e)
             return False
 
+    def _branch_exists_locally(self, branch: str) -> bool:
+        """Return True iff ``branch`` is a local ref."""
+        try:
+            result = self._run_cmd(["git", "branch", "--list", branch], check=False)
+        except RuntimeError:
+            return False
+        # `git branch --list <name>` prints the branch (with optional `*` prefix)
+        # when present, empty output otherwise.
+        return bool((getattr(result, "stdout", "") or "").strip())
+
+    def _branch_exists_remotely(self, branch: str) -> bool:
+        """Return True iff ``branch`` exists on ``origin``. Network failures
+        treated as "unknown / assume not present" — disambiguation is best-effort."""
+        try:
+            result = self._run_cmd(
+                ["git", "ls-remote", "--heads", "origin", branch],
+                check=False,
+                timeout=15,
+            )
+        except (RuntimeError, TypeError):
+            return False
+        return bool((getattr(result, "stdout", "") or "").strip())
+
+    def _disambiguate_branch(self, candidate: str, *, suffix_hint: str = "") -> str:
+        """Resolve branch-name collisions (spec v30 Step 10).
+
+        If ``candidate`` exists either locally or on origin, append ``-suffix_hint``
+        (or a 6-char hex tag derived from the current time when no hint is given).
+        If even that suffixed name collides, fall back to ``-<unix_ts>``. Logs INFO
+        on every disambiguation so reviewers can match branch back to chunk.
+        """
+        local = self._branch_exists_locally(candidate)
+        remote = self._branch_exists_remotely(candidate)
+        if not (local or remote):
+            return candidate
+
+        import secrets as _secrets
+        import time as _time
+
+        suffix = suffix_hint or _secrets.token_hex(3)
+        first = f"{candidate}-{suffix}"
+        if not (self._branch_exists_locally(first) or self._branch_exists_remotely(first)):
+            logger.info("Branch %s exists; using %s", candidate, first)
+            return first
+
+        ts = int(_time.time())
+        fallback = f"{candidate}-{ts}"
+        logger.info("Branch %s and %s exist; using %s", candidate, first, fallback)
+        return fallback
+
     def create_continuation_branch(self, spec_id: str, part: int) -> str:
         """Create a new branch for the next part of a split PR (spec-27 Phase 2.3).
 
         Returns the new branch name.
         """
-        branch_name = f"codelicious/spec-{spec_id}-part-{part}"
+        # spec v30 Step 10: disambiguate ahead of `git checkout -b` so a stale
+        # branch from a previous run doesn't get silently re-checked-out.
+        candidate = f"codelicious/spec-{spec_id}-part-{part}"
+        branch_name = self._disambiguate_branch(candidate)
         try:
             self._run_cmd(["git", "checkout", "-b", branch_name])
             logger.info("Created continuation branch: %s", branch_name)
         except RuntimeError:
-            # Branch might already exist
+            # Local checkout -b can still fail if the branch was created since
+            # our disambiguation probe; checkout the existing one rather than
+            # crashing the build.
             self._run_cmd(["git", "checkout", branch_name])
             logger.info("Checked out existing continuation branch: %s", branch_name)
         return branch_name
diff --git a/src/codelicious/llm_client.py b/src/codelicious/llm_client.py
index b99dc31c..cb694f42 100644
--- a/src/codelicious/llm_client.py
+++ b/src/codelicious/llm_client.py
@@ -1,9 +1,11 @@
 from __future__ import annotations
 
+import email.utils
 import ipaddress
 import json
 import logging
 import os
+import secrets
 import socket
 import ssl
 import time
@@ -25,6 +27,46 @@
 _DEFAULT_ENDPOINT = "https://router.huggingface.co/sambanova/v1/chat/completions"
 
 
+# Cap for any provider-supplied Retry-After value, in seconds. A misbehaving
+# or malicious provider could otherwise stall codelicious indefinitely.
+_RETRY_AFTER_CAP_S = 120.0
+
+
+def _jittered_backoff(base: float, attempt: int) -> float:
+    """Compute exponential backoff with multiplicative jitter in [0.5x, 1.5x].
+
+    Uses ``secrets.SystemRandom`` per project security rules.
+    """
+    return base * (2**attempt) * (0.5 + secrets.SystemRandom().random())
+
+
+def _parse_retry_after(value: str | None) -> float | None:
+    """Parse an HTTP Retry-After header value (seconds or HTTP-date).
+
+    Returns the wait in seconds, capped at ``_RETRY_AFTER_CAP_S``, or None if
+    the header is missing/unparseable/in the past.
+    """
+    if not value:
+        return None
+    value = value.strip()
+    try:
+        secs = float(value)
+        if secs >= 0:
+            return min(secs, _RETRY_AFTER_CAP_S)
+    except ValueError:
+        pass
+    try:
+        when = email.utils.parsedate_to_datetime(value)
+    except (TypeError, ValueError):
+        return None
+    if when is None:
+        return None
+    delta = when.timestamp() - time.time()
+    if delta <= 0:
+        return None
+    return min(delta, _RETRY_AFTER_CAP_S)
+
+
 # Known-good endpoint base URLs that bypass DNS resolution checks (S20-P1-1)
 _ALLOWED_ENDPOINT_BASES: frozenset[str] = frozenset(
     {
@@ -218,9 +260,15 @@ def chat_completion(
                 logger.debug("LLM API error body (status %s): %s", e.code, sanitized_body)
 
                 if e.code in self._RETRYABLE_HTTP_CODES and attempt < self._MAX_RETRIES:
-                    backoff = self._BACKOFF_BASE_S * (2**attempt)
+                    retry_after = _parse_retry_after(e.headers.get("Retry-After")) if e.code == 429 else None
+                    if retry_after is not None:
+                        # Apply jitter to provider-supplied delay so concurrent runs don't thunder-herd.
+                        backoff = retry_after * (0.5 + secrets.SystemRandom().random())
+                        logger.info("Honoring Retry-After: %.1fs for HTTP 429 (%s)", retry_after, model)
+                    else:
+                        backoff = _jittered_backoff(self._BACKOFF_BASE_S, attempt)
                     logger.warning(
-                        "Transient HTTP %d from LLM API (%s); retrying in %.0fs (attempt %d/%d).",
+                        "Transient HTTP %d from LLM API (%s); retrying in %.1fs (attempt %d/%d).",
                         e.code,
                         model,
                         backoff,
@@ -235,7 +283,7 @@ def chat_completion(
                 raise RuntimeError("LLM API Error (%s): HTTP %s - see debug logs for details" % (model, e.code)) from e
             except (TimeoutError, urllib.error.URLError, ssl.SSLError, ConnectionResetError, OSError) as e:
                 if attempt < self._MAX_RETRIES:
-                    backoff = self._BACKOFF_BASE_S * (2**attempt)
+                    backoff = _jittered_backoff(self._BACKOFF_BASE_S, attempt)
                     logger.warning(
                         "Transient network error from LLM API (%s): %s; retrying in %.0fs (attempt %d/%d).",
                         model,
diff --git a/src/codelicious/loop_controller.py b/src/codelicious/loop_controller.py
index b53cb56c..66fa8496 100644
--- a/src/codelicious/loop_controller.py
+++ b/src/codelicious/loop_controller.py
@@ -197,7 +197,9 @@ def _execute_agentic_iteration(self) -> bool:
                 break
             except Exception as llm_exc:
                 last_llm_error = llm_exc
-                wait_s = _LLM_BACKOFF_BASE_S * (2**_attempt)
+                from codelicious.llm_client import _jittered_backoff
+
+                wait_s = _jittered_backoff(_LLM_BACKOFF_BASE_S, _attempt)
                 logger.warning(
                     "LLM call failed (attempt %d/%d): %s — retrying in %.1fs",
                     _attempt + 1,
diff --git a/src/codelicious/orchestrator.py b/src/codelicious/orchestrator.py
index 2a2fdbe0..cf26973e 100644
--- a/src/codelicious/orchestrator.py
+++ b/src/codelicious/orchestrator.py
@@ -1,154 +1,43 @@
-"""Phase-based orchestrator for parallel agent workflows.
+"""Chunk-based orchestrator for codelicious v2 (spec-27 Phase 4.1).
 
-This module implements the conflict-free parallelization strategy:
+Runs the simplified workflow::
 
-    Phase 1: BUILD   — one builder per spec, each in an isolated git worktree
-    Phase 2: MERGE   — deterministic serial merge of worktree branches
-    Phase 3: REVIEW  — N read-only reviewer agents in parallel (security, QA, perf, …)
-    Phase 4: FIX     — single fixer agent applies triaged findings serially
+    for each spec:
+        chunk the spec → for each chunk:
+            execute → verify → fix → commit → push
+        transition PR to review
 
-The key invariant: **two writers never touch the same working tree at the
-same time**.  Builders get worktree isolation.  Reviewers are read-only.
-The fixer is the only writer during Phase 4.
+No worktree isolation.  Each spec gets a branch.  Chunks are executed
+serially.  One commit per chunk.
 
-Usage from the engine::
+Usage::
 
-    orch = Orchestrator(repo_path, git_manager, config)
-    result = orch.run(specs=[...], reviewers=["security", "qa", "performance"])
+    from codelicious.orchestrator import Orchestrator
+
+    orch = Orchestrator(repo_path, git_manager, engine)
+    result = orch.run(specs=[Path("docs/specs/01_feature.md")])
 """
 
 from __future__ import annotations
 
-import concurrent.futures
-import json
 import logging
 import pathlib
 import re
 import subprocess
-import sys
-import threading
 import time
 from dataclasses import dataclass, field
 
 logger = logging.getLogger("codelicious.orchestrator")
 
 __all__ = [
-    "REVIEWER_PROMPTS",
+    "ChunkOutcome",
     "Finding",
     "Orchestrator",
     "OrchestratorResult",
-    "ReviewRole",
     "V2Orchestrator",
 ]
 
 
-# ---------------------------------------------------------------------------
-# Reviewer roles and prompts
-# ---------------------------------------------------------------------------
-
-
-@dataclass(frozen=True)
-class ReviewRole:
-    """A named reviewer role with a system prompt."""
-
-    name: str
-    prompt: str
-
-
-REVIEWER_PROMPTS: dict[str, str] = {
-    "security": """\
-You are a **security engineer** reviewing {{project_name}}.
-
-GUARDRAILS: Do NOT modify any files. Read only.
-
-Perform a thorough security audit:
-- OWASP Top 10 vulnerabilities (injection, XSS, SSRF, IDOR, etc.)
-- Hardcoded secrets, API keys, tokens in source code
-- Unsafe deserialization, eval(), exec(), shell=True
-- Path traversal, symlink attacks, TOCTOU races
-- Authentication and authorization gaps
-- Cryptographic misuse (weak algorithms, hardcoded IVs)
-
-For every finding:
-- Cite the exact file path and line number
-- Rate severity: P1 (critical), P2 (important), P3 (minor)
-- Describe the attack scenario
-- Suggest the fix
-
-Write ALL findings as JSON to `.codelicious/review_security.json`:
-```json
-[{"severity": "P1", "file": "src/foo.py", "line": 42, "title": "...", "description": "...", "fix": "..."}]
-```
-
-""",
-    "qa": """\
-You are a **QA engineer** reviewing {{project_name}}.
-
-GUARDRAILS: Do NOT modify any files. Read only.
-
-Run the full test suite and analyze coverage:
-- Identify untested code paths and edge cases
-- Check for flaky tests (non-deterministic, order-dependent)
-- Verify error handling paths are tested
-- Look for missing boundary condition tests
-- Check that test assertions are meaningful (not just "assert True")
-- Verify mocks match real interfaces
-
-For every finding, cite file:line and severity (P1/P2/P3).
-
-Write ALL findings as JSON to `.codelicious/review_qa.json`:
-```json
-[{"severity": "P2", "file": "tests/test_foo.py", "line": 10, "title": "...", "description": "...", "fix": "..."}]
-```
-
-""",
-    "performance": """\
-You are a **performance engineer** reviewing {{project_name}}.
-
-GUARDRAILS: Do NOT modify any files. Read only.
-
-Analyze for performance issues:
-- O(n^2) or worse algorithms where O(n) would work
-- Unnecessary memory allocations or copies
-- Missing caching opportunities
-- Unbounded data structures (lists that grow without limit)
-- Regex catastrophic backtracking (ReDoS)
-- Blocking I/O in hot paths
-- N+1 query patterns
-
-For every finding, cite file:line and severity (P1/P2/P3).
-
-Write ALL findings as JSON to `.codelicious/review_performance.json`:
-```json
-[{"severity": "P2", "file": "src/foo.py", "line": 99, "title": "...", "description": "...", "fix": "..."}]
-```
-
-""",
-    "reliability": """\
-You are a **reliability engineer** reviewing {{project_name}}.
-
-GUARDRAILS: Do NOT modify any files. Read only.
-
-Analyze for reliability issues:
-- Race conditions and thread safety problems
-- Resource leaks (file handles, connections, threads)
-- Missing timeouts on I/O operations
-- Unhandled exceptions that crash the process
-- Missing retry logic for transient failures
-- Deadlock potential in concurrent code
-- State corruption from partial failures
-
-For every finding, cite file:line and severity (P1/P2/P3).
-
-Write ALL findings as JSON to `.codelicious/review_reliability.json`:
-```json
-[{"severity": "P1", "file": "src/foo.py", "line": 55, "title": "...", "description": "...", "fix": "..."}]
-```
-
-""",
-}
-
-
 # ---------------------------------------------------------------------------
 # Data structures
 # ---------------------------------------------------------------------------
@@ -189,903 +78,12 @@ class OrchestratorResult:
     cycles_completed: int = 0
 
 
-# ---------------------------------------------------------------------------
-# Worktree helpers
-# ---------------------------------------------------------------------------
-
-_WORKTREE_TIMEOUT_S: int = 120  # Max seconds for worktree subprocess operations
-_MERGE_ABORT_TIMEOUT_S: int = 30  # Max seconds for git merge --abort
-
-
-def _create_worktree(repo_path: pathlib.Path, branch_name: str) -> pathlib.Path:
-    """Create a git worktree for isolated building.
-
-    Returns the path to the new worktree directory.
-    """
-    # Sanitize branch_name to prevent path traversal (Finding 30)
-    safe_branch = re.sub(r"[^a-zA-Z0-9_\-/]", "_", branch_name)
-    safe_branch = safe_branch.replace("..", "_")
-    worktree_dir = repo_path / ".codelicious" / "worktrees" / safe_branch
-    worktrees_root = (repo_path / ".codelicious" / "worktrees").resolve()
-    if not worktree_dir.resolve().is_relative_to(worktrees_root):
-        raise RuntimeError(f"Worktree path escapes allowed directory: {branch_name}")
-    worktree_dir.parent.mkdir(parents=True, exist_ok=True)
-
-    # Clean up stale worktree if it exists
-    if worktree_dir.exists():
-        try:
-            subprocess.run(
-                ["git", "worktree", "remove", "--force", str(worktree_dir)],
-                cwd=str(repo_path),
-                capture_output=True,
-                timeout=_WORKTREE_TIMEOUT_S,
-            )
-        except subprocess.TimeoutExpired:
-            logger.warning("Timed out removing stale worktree %s; proceeding anyway.", worktree_dir)
-
-    # Create the worktree with a new branch
-    try:
-        result = subprocess.run(
-            ["git", "worktree", "add", "-b", safe_branch, str(worktree_dir)],
-            cwd=str(repo_path),
-            capture_output=True,
-            text=True,
-            timeout=_WORKTREE_TIMEOUT_S,
-        )
-    except subprocess.TimeoutExpired as exc:
-        raise RuntimeError(f"Timed out creating worktree for branch {branch_name}") from exc
-
-    if result.returncode != 0:
-        # Branch might already exist — try without -b
-        try:
-            result = subprocess.run(
-                ["git", "worktree", "add", str(worktree_dir), safe_branch],
-                cwd=str(repo_path),
-                capture_output=True,
-                text=True,
-                timeout=_WORKTREE_TIMEOUT_S,
-            )
-        except subprocess.TimeoutExpired as exc:
-            raise RuntimeError(f"Timed out creating worktree (fallback) for branch {branch_name}") from exc
-        if result.returncode != 0:
-            raise RuntimeError(f"Failed to create worktree: {result.stderr}")
-
-    logger.info("Created worktree at %s (branch: %s)", worktree_dir, branch_name)
-    return worktree_dir
-
-
-def _remove_worktree(repo_path: pathlib.Path, worktree_dir: pathlib.Path) -> None:
-    """Remove a git worktree."""
-    try:
-        subprocess.run(
-            ["git", "worktree", "remove", "--force", str(worktree_dir)],
-            cwd=str(repo_path),
-            capture_output=True,
-            timeout=_WORKTREE_TIMEOUT_S,
-        )
-    except subprocess.TimeoutExpired:
-        logger.warning("Timed out removing worktree %s; it may need manual cleanup.", worktree_dir)
-        return
-    logger.info("Removed worktree: %s", worktree_dir)
-
-
-def _commit_worktree_changes(worktree_dir: pathlib.Path, spec_name: str) -> bool:
-    """Stage and commit all changes in a worktree.
-
-    The build agent is forbidden from running git commands, so the
-    orchestrator must commit changes on the agent's behalf before the
-    worktree is removed.  Without this commit, changes would be lost.
-
-    Excludes ``.codelicious/`` from the commit to prevent merge conflicts
-    when multiple worktrees modify STATE.md or BUILD_COMPLETE.
-
-    Attempts a GPG-signed commit first. Falls back to ``--no-gpg-sign``
-    only when GPG-related errors are detected in stderr (e.g. no GPG agent
-    is available in the worktree environment).
-
-    Returns True if a commit was created, False if the worktree was clean.
-    """
-    # Stage everything EXCEPT .codelicious/ (which causes merge conflicts)
-    try:
-        subprocess.run(
-            ["git", "add", "--all", "--", ".", ":!.codelicious/"],
-            cwd=str(worktree_dir),
-            capture_output=True,
-            timeout=_WORKTREE_TIMEOUT_S,
-        )
-    except subprocess.TimeoutExpired:
-        logger.warning("Timed out staging changes in worktree %s.", worktree_dir)
-        return False
-
-    # Check if there's anything staged
-    try:
-        status = subprocess.run(
-            ["git", "diff", "--cached", "--quiet"],
-            cwd=str(worktree_dir),
-            capture_output=True,
-            timeout=_WORKTREE_TIMEOUT_S,
-        )
-    except subprocess.TimeoutExpired:
-        logger.warning("Timed out checking staged diff in worktree %s.", worktree_dir)
-        return False
-
-    if status.returncode == 0:
-        logger.debug("Worktree %s has no staged changes — nothing to commit.", worktree_dir)
-        return False
-
-    # Try a signed commit first (Finding 42: honour GPG signing policy)
-    try:
-        result = subprocess.run(
-            ["git", "commit", "-m", f"codelicious: build {spec_name}"],
-            cwd=str(worktree_dir),
-            capture_output=True,
-            text=True,
-            timeout=_WORKTREE_TIMEOUT_S,
-        )
-    except subprocess.TimeoutExpired:
-        logger.warning("Timed out committing worktree changes for %s.", spec_name)
-        return False
-
-    if result.returncode != 0:
-        stderr_lower = result.stderr.lower()
-        gpg_related = any(kw in stderr_lower for kw in ("gpg", "signing", "sign", "secret key"))
-        if gpg_related:
-            logger.warning(
-                "GPG signing unavailable in worktree (no GPG agent); falling back to unsigned commit. stderr: %s",
-                result.stderr.strip(),
-            )
-            try:
-                result = subprocess.run(
-                    ["git", "commit", "--no-gpg-sign", "-m", f"codelicious: build {spec_name}"],
-                    cwd=str(worktree_dir),
-                    capture_output=True,
-                    text=True,
-                    timeout=_WORKTREE_TIMEOUT_S,
-                )
-            except subprocess.TimeoutExpired:
-                logger.warning("Timed out committing (unsigned) worktree changes for %s.", spec_name)
-                return False
-        if result.returncode != 0:
-            logger.warning("Failed to commit worktree changes: %s", result.stderr.strip())
-            return False
-
-    logger.info("Committed agent changes in worktree for %s", spec_name)
-    return True
-
-
-def _abort_merge(repo_path: pathlib.Path) -> None:
-    """Abort an in-progress git merge, with timeout and error handling."""
-    try:
-        abort_result = subprocess.run(
-            ["git", "merge", "--abort"],
-            cwd=str(repo_path),
-            capture_output=True,
-            text=True,
-            timeout=_MERGE_ABORT_TIMEOUT_S,
-        )
-        if abort_result.returncode != 0:
-            logger.critical(
-                "git merge --abort failed (exit %d): %s",
-                abort_result.returncode,
-                abort_result.stderr.strip(),
-            )
-        else:
-            logger.info("Merge aborted successfully.")
-    except subprocess.TimeoutExpired:
-        logger.critical(
-            "git merge --abort timed out after %ds — repository may be in a dirty state.",
-            _MERGE_ABORT_TIMEOUT_S,
-        )
-
-
-def _merge_worktree_branch(repo_path: pathlib.Path, branch_name: str) -> bool:
-    """Merge a worktree branch back into the current branch.
-
-    Returns True on success, False on merge conflict or timeout.
-    """
-    try:
-        result = subprocess.run(
-            ["git", "merge", "--no-ff", "-m", f"codelicious: merge {branch_name}", branch_name],
-            cwd=str(repo_path),
-            capture_output=True,
-            text=True,
-            timeout=_WORKTREE_TIMEOUT_S,
-        )
-    except subprocess.TimeoutExpired:
-        logger.error("Timed out merging branch %s; attempting abort.", branch_name)
-        _abort_merge(repo_path)
-        return False
-
-    if result.returncode != 0:
-        logger.error("Merge conflict for branch %s: %s", branch_name, result.stderr)
-        # Abort the merge to leave repo in clean state
-        _abort_merge(repo_path)
-        return False
-
-    logger.info("Merged branch %s successfully.", branch_name)
-    return True
-
-
-def _delete_branch(repo_path: pathlib.Path, branch_name: str) -> None:
-    """Delete a local branch after merge."""
-    try:
-        result = subprocess.run(
-            ["git", "branch", "-d", branch_name],
-            cwd=str(repo_path),
-            capture_output=True,
-            text=True,
-            timeout=_WORKTREE_TIMEOUT_S,
-        )
-    except subprocess.TimeoutExpired:
-        logger.warning("Timed out deleting branch %s.", branch_name)
-        return
-    if result.returncode != 0:
-        logger.warning("Failed to delete branch %s: %s", branch_name, result.stderr.strip())
-
-
-# ---------------------------------------------------------------------------
-# Review output parsing
-# ---------------------------------------------------------------------------
-
-
-def _collect_review_findings(repo_path: pathlib.Path, role: str) -> list[Finding]:
-    """Read the JSON findings file written by a reviewer agent."""
-    review_file = repo_path / ".codelicious" / f"review_{role}.json"
-    if not review_file.is_file():
-        logger.debug("No review file for role %s", role)
-        return []
-
-    try:
-        data = json.loads(review_file.read_text(encoding="utf-8"))
-        if not isinstance(data, list):
-            logger.warning("Review file for %s is not a JSON array", role)
-            return []
-        findings = []
-        for item in data:
-            if isinstance(item, dict):
-                findings.append(
-                    Finding(
-                        role=role,
-                        severity=item.get("severity", "P3"),
-                        file=item.get("file", ""),
-                        line=item.get("line", 0),
-                        title=item.get("title", ""),
-                        description=item.get("description", ""),
-                        fix=item.get("fix", ""),
-                    )
-                )
-        return findings
-    except (json.JSONDecodeError, OSError) as e:
-        logger.warning("Failed to parse review findings for %s: %s", role, e)
-        return []
-
-
-def _triage_findings(findings: list[Finding]) -> list[Finding]:
-    """Sort findings by severity (P1 first) and deduplicate by file+line."""
-    severity_order = {"P1": 0, "P2": 1, "P3": 2}
-    seen: set[tuple[str, int]] = set()
-    deduped: list[Finding] = []
-    for f in sorted(findings, key=lambda f: severity_order.get(f.severity, 9)):
-        key = (f.file, f.line)
-        if key not in seen:
-            seen.add(key)
-            deduped.append(f)
-    return deduped
-
-
-# ---------------------------------------------------------------------------
-# Fix prompt generation
-# ---------------------------------------------------------------------------
-
-_FIX_PROMPT_TEMPLATE: str = """\
-You are fixing issues in {{project_name}} identified by automated review.
-
-## CRITICAL: Do NOT run git or gh commands
-
-The codelicious orchestrator manages all git and GitHub operations.
-You MUST NOT run git add, git commit, git push, gh pr create, or any
-other git/gh commands. The orchestrator will commit your changes.
-
-## Triaged Findings (ordered by severity)
-
-{{findings_text}}
-
-## Instructions
-
-Fix the findings above, starting with P1 (most critical).
-For each fix:
-1. Read the file cited in the finding
-2. Apply the suggested fix (or a better one if you see one)
-3. Run tests after each fix to ensure no regressions
-
-When all fixable findings are addressed, run /verify-all.
-Then write "DONE" to .codelicious/BUILD_COMPLETE
-"""
-
-
-def _render_fix_prompt(project_name: str, findings: list[Finding]) -> str:
-    """Render the fix prompt with the triaged findings."""
-    lines = []
-    for i, f in enumerate(findings, 1):
-        lines.append(
-            f"{i}. **[{f.severity}]** `{f.file}:{f.line}` — {f.title}\n"
-            f"   {f.description}\n"
-            f"   **Fix:** {f.fix}\n"
-            f"   _(from: {f.role} reviewer)_\n"
-        )
-    findings_text = "\n".join(lines) if lines else "No findings to fix."
-
-    result = _FIX_PROMPT_TEMPLATE
-    result = result.replace("{{project_name}}", project_name)
-    result = result.replace("{{findings_text}}", findings_text)
-    return result
-
-
-# ---------------------------------------------------------------------------
-# Orchestrator
-# ---------------------------------------------------------------------------
-
-
-class Orchestrator:
-    """Phase-based orchestrator for parallel agent workflows.
-
-    Usage::
-
-        orch = Orchestrator(repo_path, git_manager, agent_config)
-        result = orch.run(
-            specs=[Path("spec-17.md"), Path("spec-18.md")],
-            reviewers=["security", "qa", "performance"],
-        )
-    """
-
-    def __init__(
-        self,
-        repo_path: pathlib.Path,
-        git_manager: object,
-        config: object,
-    ) -> None:
-        self.repo_path = pathlib.Path(repo_path).resolve()
-        self.git_manager = git_manager
-        self.config = config
-        self.project_name = self.repo_path.name
-
-    def _run_agent(self, prompt: str, project_root: pathlib.Path, session_id: str = "") -> object:
-        """Run a Claude agent, returning the AgentResult."""
-        from codelicious.agent_runner import run_agent
-
-        return run_agent(
-            prompt=prompt,
-            project_root=project_root,
-            config=self.config,
-            tee_to=sys.stdout,
-            resume_session_id=session_id,
-        )
-
-    # ------------------------------------------------------------------
-    # Phase 1: BUILD (parallel, isolated worktrees)
-    # ------------------------------------------------------------------
-
-    def _build_spec_in_worktree(self, spec_path: pathlib.Path) -> tuple[str, bool]:
-        """Build a single spec in an isolated git worktree.
-
-        The agent is instructed to build ALL unchecked tasks in the
-        assigned spec file, not just one.
-
-        Returns (branch_name, success).
-        """
-        from codelicious.git.git_orchestrator import spec_branch_name
-        from codelicious.prompts import AGENT_BUILD_SPEC, render
-
-        branch_name = spec_branch_name(spec_path.name)
-        worktree_dir: pathlib.Path | None = None
-
-        try:
-            worktree_dir = _create_worktree(self.repo_path, branch_name)
-
-            # Resolve spec_path relative to the worktree so the agent
-            # sees the correct file path in its working directory.
-            try:
-                rel = spec_path.relative_to(self.repo_path)
-            except ValueError:
-                # spec_path is not under repo_path — use just the filename
-                # to avoid joining an absolute path (which discards the left operand)
-                rel = pathlib.Path(spec_path.name)
-                logger.warning(
-                    "Spec %s is not under repo %s — using filename only.",
-                    spec_path,
-                    self.repo_path,
-                )
-            spec_in_worktree = worktree_dir / rel
-
-            if not spec_in_worktree.is_file():
-                logger.warning(
-                    "Spec file %s not found in worktree %s. Agent will search for specs automatically.",
-                    spec_in_worktree,
-                    worktree_dir,
-                )
-                # Fall back to telling the agent to find the spec itself
-                spec_filter_str = (
-                    f"File not found at {spec_in_worktree}. Look for a spec file named '{spec_path.name}' in the repo."
-                )
-            else:
-                spec_filter_str = str(spec_in_worktree)
-
-            build_prompt = render(
-                AGENT_BUILD_SPEC,
-                project_name=self.project_name,
-                spec_filter=spec_filter_str,
-            )
-
-            result = self._run_agent(build_prompt, worktree_dir)
-
-            # Don't trust result.success alone — it only reflects subprocess
-            # exit code (0 = success).  Check BUILD_COMPLETE in the worktree
-            # to verify the agent actually finished building.
-            from codelicious.prompts import check_build_complete
-
-            agent_done = check_build_complete(worktree_dir)
-            success = result.success and agent_done
-
-            logger.info(
-                "Build for %s complete: process_ok=%s, build_complete=%s",
-                spec_path.name,
-                result.success,
-                agent_done,
-            )
-
-            # Commit the agent's changes in the worktree so they survive
-            # worktree removal and can be merged back.  Agents are forbidden
-            # from running git commands — the orchestrator owns all git ops.
-            commit_ok = _commit_worktree_changes(worktree_dir, spec_path.name)
-
-            # If the build succeeded but we couldn't commit its changes, mark the
-            # overall result as failed and preserve the worktree so the changes are
-            # not silently discarded.  The caller will see success=False and can
-            # investigate the worktree directory for manual recovery.
-            if not commit_ok and success:
-                logger.error(
-                    "Build for %s succeeded but committing worktree changes failed. "
-                    "Preserving worktree at %s to prevent data loss.",
-                    spec_path.name,
-                    worktree_dir,
-                )
-                success = False
-                # Signal the finally block to skip removal by clearing worktree_dir
-                worktree_dir = None
-
-            return branch_name, success
-
-        except Exception as e:
-            logger.error("Build for %s failed: %s", spec_path.name, e)
-            return branch_name, False
-
-        finally:
-            if worktree_dir is not None:
-                try:
-                    _remove_worktree(self.repo_path, worktree_dir)
-                except Exception as e:
-                    logger.warning("Failed to clean up worktree %s: %s", worktree_dir, e)
-
-    def _phase_build(
-        self,
-        specs: list[pathlib.Path],
-        max_workers: int,
-    ) -> list[tuple[str, bool]]:
-        """Phase 1: Build specs in parallel worktrees.
-
-        Returns list of (branch_name, success) tuples.
-        """
-        from codelicious.git.git_orchestrator import spec_branch_name
-
-        if not specs:
-            return []
-
-        workers = min(max_workers, len(specs))
-        logger.info(
-            "PHASE 1 BUILD: %d specs across %d workers",
-            len(specs),
-            workers,
-        )
-
-        results: list[tuple[str, bool]] = []
-        completed_count = 0
-        count_lock = threading.Lock()
-
-        def _log_spec_progress(spec: pathlib.Path, branch: str, ok: bool) -> None:
-            nonlocal completed_count
-            with count_lock:
-                completed_count += 1
-                count = completed_count
-            status = "OK" if ok else "FAILED"
-            logger.info(
-                "  [%d/%d] %s — %s (branch: %s)",
-                count,
-                len(specs),
-                spec.name,
-                status,
-                branch,
-            )
-
-        if workers <= 1:
-            # Serial fallback
-            for spec in specs:
-                logger.info("  Building spec: %s ...", spec.name)
-                branch, ok = self._build_spec_in_worktree(spec)
-                _log_spec_progress(spec, branch, ok)
-                results.append((branch, ok))
-        else:
-            with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
-                futures = {pool.submit(self._build_spec_in_worktree, spec): spec for spec in specs}
-                for spec in specs:
-                    logger.info("  Queued spec: %s", spec.name)
-                try:
-                    for future in concurrent.futures.as_completed(futures):
-                        spec = futures[future]
-                        try:
-                            branch, ok = future.result()
-                            _log_spec_progress(spec, branch, ok)
-                            results.append((branch, ok))
-                        except Exception as e:
-                            branch = spec_branch_name(spec.name)
-                            with count_lock:
-                                completed_count += 1
-                                count = completed_count
-                            logger.error(
-                                "  [%d/%d] %s — ERROR: %s",
-                                count,
-                                len(specs),
-                                spec.name,
-                                e,
-                            )
-                            results.append((branch, False))
-                except KeyboardInterrupt:
-                    logger.warning("KeyboardInterrupt received — cancelling pending build futures.")
-                    for f in futures:
-                        f.cancel()
-                    pool.shutdown(wait=False, cancel_futures=True)
-                    raise
-
-        return results
-
-    # ------------------------------------------------------------------
-    # Phase 2: MERGE (serial, deterministic)
-    # ------------------------------------------------------------------
-
-    def _phase_merge(self, build_results: list[tuple[str, bool]]) -> int:
-        """Phase 2: Merge successful build branches back.
-
-        Returns the number of successfully merged branches.
-        """
-        successful = [(branch, ok) for branch, ok in build_results if ok]
-        if not successful:
-            logger.warning("PHASE 2 MERGE: no successful builds to merge.")
-            return 0
-
-        logger.info("PHASE 2 MERGE: merging %d branches", len(successful))
-        merged = 0
-
-        for branch_name, _ in successful:
-            if _merge_worktree_branch(self.repo_path, branch_name):
-                _delete_branch(self.repo_path, branch_name)
-                merged += 1
-            else:
-                logger.warning(
-                    "Skipping branch %s due to merge conflict. Manual resolution required.",
-                    branch_name,
-                )
-
-        return merged
-
-    # ------------------------------------------------------------------
-    # Phase 3: REVIEW (parallel, read-only)
-    # ------------------------------------------------------------------
-
-    def _run_reviewer(self, role: str) -> list[Finding]:
-        """Run a single reviewer agent (read-only) and collect findings."""
-        from codelicious.prompts import render
-
-        prompt_template = REVIEWER_PROMPTS.get(role)
-        if not prompt_template:
-            logger.warning("Unknown reviewer role: %s", role)
-            return []
-
-        prompt = render(prompt_template, project_name=self.project_name)
-
-        try:
-            self._run_agent(prompt, self.repo_path)
-        except Exception as e:
-            logger.warning("Reviewer %s failed: %s", role, e)
-
-        return _collect_review_findings(self.repo_path, role)
-
-    def _phase_review(
-        self,
-        roles: list[str],
-        max_workers: int,
-    ) -> list[Finding]:
-        """Phase 3: Run read-only reviewers in parallel.
-
-        Returns all collected findings.
-        """
-        if not roles:
-            return []
-
-        workers = min(max_workers, len(roles))
-        logger.info(
-            "PHASE 3 REVIEW: %d reviewers across %d workers: %s",
-            len(roles),
-            workers,
-            roles,
-        )
-
-        all_findings: list[Finding] = []
-
-        if workers <= 1:
-            for role in roles:
-                all_findings.extend(self._run_reviewer(role))
-        else:
-            with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
-                futures = {pool.submit(self._run_reviewer, role): role for role in roles}
-                try:
-                    for future in concurrent.futures.as_completed(futures):
-                        role = futures[future]
-                        try:
-                            all_findings.extend(future.result())
-                        except Exception as e:
-                            logger.error("Reviewer %s raised: %s", role, e)
-                except KeyboardInterrupt:
-                    logger.warning("KeyboardInterrupt received — cancelling pending review futures.")
-                    for f in futures:
-                        f.cancel()
-                    pool.shutdown(wait=False, cancel_futures=True)
-                    raise
-
-        triaged = _triage_findings(all_findings)
-        logger.info(
-            "PHASE 3 REVIEW: %d total findings, %d after triage (P1: %d, P2: %d, P3: %d)",
-            len(all_findings),
-            len(triaged),
-            sum(1 for f in triaged if f.severity == "P1"),
-            sum(1 for f in triaged if f.severity == "P2"),
-            sum(1 for f in triaged if f.severity == "P3"),
-        )
-        return triaged
-
-    # ------------------------------------------------------------------
-    # Phase 4: FIX (serial, one agent)
-    # ------------------------------------------------------------------
-
-    def _phase_fix(self, findings: list[Finding]) -> bool:
-        """Phase 4: Apply fixes for triaged findings.
-
-        Only runs if there are P1 or P2 findings. P3s are logged but
-        not auto-fixed.
-
-        Returns True if the fix agent completed successfully.
-        """
-        actionable = [f for f in findings if f.severity in ("P1", "P2")]
-        if not actionable:
-            logger.info("PHASE 4 FIX: no P1/P2 findings to fix.")
-            return True
-
-        logger.info("PHASE 4 FIX: applying %d P1/P2 findings", len(actionable))
-
-        from codelicious.prompts import check_build_complete, clear_build_complete
-
-        clear_build_complete(self.repo_path)
-        fix_prompt = _render_fix_prompt(self.project_name, actionable)
-
-        try:
-            self._run_agent(fix_prompt, self.repo_path)
-        except Exception as e:
-            logger.error("Fix agent failed: %s", e)
-            return False
-
-        return check_build_complete(self.repo_path)
-
-    # ------------------------------------------------------------------
-    # Public API
-    # ------------------------------------------------------------------
-
-    def run(
-        self,
-        specs: list[pathlib.Path],
-        reviewers: list[str] | None = None,
-        max_build_workers: int = 3,
-        max_review_workers: int = 4,
-        max_build_cycles: int = 10,
-        push_pr: bool = False,
-        max_wall_clock_s: float = 7200,
-    ) -> OrchestratorResult:
-        """Run the full orchestrated pipeline.
-
-        Build→merge cycles repeat until all specs are complete (no
-        unchecked ``- [ ]`` items remain) or the cycle cap is reached.
-        Review and fix run once at the end, after building is done.
-
-        Parameters
-        ----------
-        specs:
-            List of spec file paths to build.
-        reviewers:
-            List of reviewer role names (e.g. ["security", "qa"]).
-            Defaults to all available roles.
-        max_build_workers:
-            Max concurrent builder agents.
-        max_review_workers:
-            Max concurrent reviewer agents.
-        max_build_cycles:
-            Max build→merge iterations before giving up.
-        push_pr:
-            Whether to push and create/update PR after completion.
-        max_wall_clock_s:
-            Hard wall-clock limit in seconds for the entire run (Finding 22).
-            Defaults to 7200 (2 hours). The build loop is aborted if this
-            limit is reached before all cycles complete.
-        """
-        from codelicious.prompts import scan_remaining_tasks_for_spec
-
-        if reviewers is None:
-            reviewers = list(REVIEWER_PROMPTS.keys())
-
-        # Normalize all spec paths to absolute so comparisons are reliable
-        specs = [s.resolve() if not s.is_absolute() else s for s in specs]
-
-        start = time.monotonic()
-        total_builds = 0
-        total_merged = 0
-        cycles = 0
-
-        logger.info(
-            "ORCHESTRATOR: %d specs, %d reviewers, build_workers=%d, review_workers=%d, max_build_cycles=%d",
-            len(specs),
-            len(reviewers),
-            max_build_workers,
-            max_review_workers,
-            max_build_cycles,
-        )
-
-        # ── BUILD LOOP: repeat build→merge until all specs complete ──
-        incomplete_specs = list(specs)
-        consecutive_failures = 0
-
-        for cycle in range(1, max_build_cycles + 1):
-            # Wall-clock timeout guard (Finding 22)
-            elapsed_so_far = time.monotonic() - start
-            if elapsed_so_far >= max_wall_clock_s:
-                logger.error(
-                    "Wall-clock timeout reached after %.1fs (limit=%ss). Aborting build loop.",
-                    elapsed_so_far,
-                    max_wall_clock_s,
-                )
-                break
-
-            # Cache scan_remaining_tasks_for_spec results keyed by spec path so
-            # each spec is queried at most once per cycle (Finding 26).
-            remaining_cache: dict[pathlib.Path, int] = {s: scan_remaining_tasks_for_spec(s) for s in incomplete_specs}
-            # Check which specs still have unchecked tasks
-            still_incomplete = [s for s, n in remaining_cache.items() if n > 0]
-            if not still_incomplete:
-                logger.info("All %d specs are complete after %d build cycle(s).", len(specs), cycles)
-                break
-
-            cycles = cycle
-            logger.info("")
-            logger.info(
-                "══════ Build cycle %d/%d (%d specs remaining) ══════", cycle, max_build_cycles, len(still_incomplete)
-            )
-
-            # ── Phase 1: BUILD ────────────────────────────────────
-            logger.info("---- BUILD ----")
-            build_results = self._phase_build(still_incomplete, max_build_workers)
-            successful = sum(1 for _, ok in build_results if ok)
-            total_builds += successful
-            logger.info("Build: %d/%d specs built successfully.", successful, len(still_incomplete))
-
-            if successful == 0:
-                consecutive_failures += 1
-                logger.warning("No specs built in cycle %d (%d consecutive failures).", cycle, consecutive_failures)
-                if consecutive_failures >= 3:
-                    logger.error("Aborting: %d consecutive build cycles with zero progress.", consecutive_failures)
-                    break
-                continue
-            else:
-                consecutive_failures = 0
-
-            # ── Phase 2: MERGE ────────────────────────────────────
-            logger.info("---- MERGE ----")
-            merged = self._phase_merge(build_results)
-            total_merged += merged
-            logger.info("Merge: %d branches merged.", merged)
-
-            # Commit merged work and push before next cycle
-            try:
-                self.git_manager.commit_verified_changes(
-                    commit_message=f"codelicious: build cycle {cycle} of {self.project_name}"
-                )
-            except Exception as e:
-                logger.warning("Mid-cycle commit failed: %s", e)
-
-            # Push even if commit_verified_changes found nothing new to
-            # commit — merge commits need to be pushed too.
-            push = self.git_manager.push_to_origin()
-            if not push.success:
-                logger.warning("Mid-cycle push failed (type=%s): %s", push.error_type, push.message)
-
-            # Update incomplete list for next iteration
-            incomplete_specs = still_incomplete
-
-        # ── Check final completion status ─────────────────────────
-        # Cache results to avoid calling scan_remaining_tasks_for_spec twice
-        # for the same spec (Finding 26).
-        final_remaining_cache: dict[pathlib.Path, int] = {s: scan_remaining_tasks_for_spec(s) for s in specs}
-        final_incomplete = [s for s, n in final_remaining_cache.items() if n > 0]
-        all_complete = len(final_incomplete) == 0
-
-        if all_complete:
-            logger.info("All specs complete. Proceeding to review phase.")
-        else:
-            remaining_tasks = sum(final_remaining_cache[s] for s in final_incomplete)
-            logger.warning(
-                "%d spec(s) still incomplete (%d unchecked tasks). Proceeding to review.",
-                len(final_incomplete),
-                remaining_tasks,
-            )
-
-        # ── REVIEW (once, after all building is done) ─────────────
-        logger.info("")
-        logger.info("---- REVIEW ----")
-        findings = self._phase_review(reviewers, max_review_workers)
-
-        # ── FIX (once, after review) ──────────────────────────────
-        logger.info("")
-        logger.info("---- FIX ----")
-        fix_ok = self._phase_fix(findings)
-
-        # ── Final commit, push & PR ───────────────────────────────
-        try:
-            self.git_manager.commit_verified_changes(
-                commit_message=f"codelicious: orchestrated build of {self.project_name}"
-            )
-        except Exception as e:
-            logger.warning("Post-orchestration commit failed: %s", e)
-
-        # Always push — commit_verified_changes skips push when working
-        # tree is clean, but merge commits still need to be pushed.
-        push = self.git_manager.push_to_origin()
-        if not push.success:
-            logger.error("Final push failed (type=%s): %s", push.error_type, push.message)
-
-        if push_pr:
-            # Create/reuse one PR per successfully built spec (spec-22 Phase 4)
-            for spec in specs:
-                _m = re.match(r"^(\d+)", spec.stem)
-                _sid = _m.group(1) if _m else spec.stem
-                try:
-                    self.git_manager.ensure_draft_pr_exists(
-                        spec_id=_sid,
-                        spec_summary=f"build {self.project_name}",
-                    )
-                except Exception as e:
-                    logger.warning("PR creation for spec-%s failed: %s", _sid, e)
-
-        elapsed = time.monotonic() - start
-        return OrchestratorResult(
-            success=all_complete and fix_ok,
-            message=(
-                f"Orchestrated: {total_builds} builds across {cycles} cycle(s), "
-                f"{total_merged} merged, {len(final_incomplete)}/{len(specs)} specs still incomplete, "
-                f"{len(findings)} findings, fix={'OK' if fix_ok else 'FAILED'} "
-                f"in {elapsed:.1f}s"
-            ),
-            findings=findings,
-            elapsed_s=elapsed,
-            cycles_completed=cycles,
-        )
+# Keep a public alias so callers that import ChunkOutcome don't break.
+ChunkOutcome = OrchestratorResult
 
 
 # ═══════════════════════════════════════════════════════════════════════
-# spec-27 Phase 4: V2 Orchestrator — chunk-based serial loop
+# spec-27 Phase 4: Orchestrator — chunk-based serial loop
 # ═══════════════════════════════════════════════════════════════════════
 
 
@@ -1121,7 +119,7 @@ def _ensure_codelicious_gitignored(repo_path: pathlib.Path) -> None:
         logger.warning("Could not update .gitignore: %s", e)
 
 
-class V2Orchestrator:
+class Orchestrator:
     """Chunk-based orchestrator for codelicious v2 (spec-27 Phase 4.1).
 
     Runs the simplified workflow::
@@ -1144,14 +142,68 @@ def __init__(
         max_loc_per_pr: int = 250,
         model: str = "",
         progress_callback: object = None,
+        no_resume: bool = False,
+        engines: list[object] | None = None,
     ) -> None:
         self.repo_path = pathlib.Path(repo_path).resolve()
         self.git_manager = git_manager
         self.engine = engine
+        # spec v30 Step 5: optional engine list for rate-limit fallover. The
+        # primary ``engine`` argument stays first; additional engines tail it.
+        # When an engine rate-limits we drop it and continue on the next one.
+        if engines:
+            self._engines: list[object] = list(engines)
+        else:
+            self._engines = [engine]
         self.max_commits_per_pr = max_commits_per_pr
         self.max_loc_per_pr = max_loc_per_pr
         self.model = model
         self.progress_callback = progress_callback
+        # spec v30 Step 2: when True, ignore any persisted chunk-status ledger
+        # and re-execute every chunk regardless of prior runs.
+        self.no_resume = no_resume
+
+    # ── spec v30 Step 2: chunk-status ledger helpers ─────────────────
+    def _ledger_path(self, spec: pathlib.Path) -> pathlib.Path:
+        slug = re.sub(r"[^A-Za-z0-9_-]+", "_", spec.stem)
+        return self.repo_path / ".codelicious" / "state" / f"{slug}.json"
+
+    def _load_ledger(self, spec: pathlib.Path) -> dict:
+        if self.no_resume:
+            return {"chunks": {}}
+        path = self._ledger_path(spec)
+        try:
+            import json as _json
+
+            data = _json.loads(path.read_text(encoding="utf-8"))
+            if isinstance(data, dict) and isinstance(data.get("chunks"), dict):
+                return data
+        except (OSError, ValueError):
+            pass
+        return {"chunks": {}}
+
+    def _save_ledger(self, spec: pathlib.Path, ledger: dict) -> None:
+        import json as _json
+        import os as _os
+
+        path = self._ledger_path(spec)
+        try:
+            path.parent.mkdir(parents=True, exist_ok=True)
+            tmp = path.with_suffix(path.suffix + ".tmp")
+            tmp.write_text(_json.dumps(ledger, indent=2, sort_keys=True), encoding="utf-8")
+            _os.replace(tmp, path)
+        except OSError as exc:  # nosec B110
+            logger.warning("Could not persist chunk ledger %s: %s", path, exc)
+
+    def _ledger_set(self, ledger: dict, chunk_id: str, *, status: str, **fields: object) -> None:
+        import datetime as _dt
+
+        entry = ledger.setdefault("chunks", {}).get(chunk_id, {})
+        entry["status"] = status
+        entry["updated_at"] = _dt.datetime.now(_dt.timezone.utc).isoformat()
+        for k, v in fields.items():
+            entry[k] = v
+        ledger["chunks"][chunk_id] = entry
 
     def _report(
         self,
@@ -1213,7 +265,7 @@ def run(
         push_pr:
             Whether to create/update PRs on GitHub/GitLab.
         """
-        from codelicious.chunker import chunk_spec
+        from codelicious.chunker import chunk_spec, enforce_token_budget
         from codelicious.engines.base import EngineContext
         from codelicious.scaffolder import scaffold_claude_dir
         from codelicious.spec_discovery import mark_chunk_complete
@@ -1247,6 +299,11 @@ def run(
                 logger.error("Failed to chunk spec %s: %s", spec.name, e)
                 continue
 
+            # spec v30 Step 6: split any chunk whose estimated token cost
+            # exceeds the active engine's context window before dispatch.
+            engine_name = getattr(self.engine, "name", "") or ""
+            chunks = enforce_token_budget(chunks, self.repo_path, engines=[engine_name] if engine_name else None)
+
             if not chunks:
                 logger.info("Spec %s has no chunks to build.", spec.name)
                 specs_completed += 1
@@ -1275,13 +332,32 @@ def run(
 
             previous_chunks: list[str] = []
 
+            # spec v30 Step 2: load chunk-status ledger so a re-run after a
+            # mid-spec abort skips already-merged chunks.
+            ledger = self._load_ledger(spec)
+
             # ── Execute each chunk ────────────────────────────────
             all_chunks_ok = True
             spec_chunks_failed = 0
             for chunk_idx, chunk in enumerate(chunks, 1):
-                # Deadline check
-                if deadline and time.monotonic() > deadline:
-                    logger.warning("Build deadline reached during spec %s, chunk %d.", spec.name, chunk_idx)
+                ledger_entry = ledger.get("chunks", {}).get(chunk.id, {})
+                if ledger_entry.get("status") == "merged":
+                    logger.info(
+                        "Skipping already-merged chunk %s: %s",
+                        chunk.id,
+                        chunk.title,
+                    )
+                    previous_chunks.append(f"{chunk.id}: {chunk.title}")
+                    continue
+                # spec v29 Step 10: deadline gate before each chunk so an
+                # expired budget never starts a fresh execute_chunk call.
+                if deadline and time.monotonic() >= deadline:
+                    logger.warning(
+                        "Deadline reached after %d/%d chunks in spec %s; stopping early.",
+                        chunk_idx - 1,
+                        total_chunks,
+                        spec.name,
+                    )
                     all_chunks_ok = False
                     break
 
@@ -1315,7 +391,19 @@ def run(
                 )
 
                 # ── Execute ───────────────────────────────────────
+                # spec v30 Step 5: try the primary engine, then fail over to
+                # any remaining engines on a rate-limit signal.
                 result = self.engine.execute_chunk(chunk, self.repo_path, context)
+                while result.message and "Rate limited" in (result.message or "") and len(self._engines) > 1:
+                    rate_limited = self._engines.pop(0)
+                    next_engine = self._engines[0]
+                    logger.warning(
+                        "%s rate-limited; failing over to %s for the remainder of this spec.",
+                        getattr(rate_limited, "name", "engine"),
+                        getattr(next_engine, "name", "engine"),
+                    )
+                    self.engine = next_engine
+                    result = self.engine.execute_chunk(chunk, self.repo_path, context)
 
                 # ── Verify ────────────────────────────────────────
                 if result.success:
@@ -1420,6 +508,14 @@ def run(
                         mark_chunk_complete(spec, chunk.title)
                         previous_chunks.append(f"{chunk.id}: {chunk.title}")
                         total_chunks_completed += 1
+                        self._ledger_set(
+                            ledger,
+                            chunk.id,
+                            status="merged",
+                            title=chunk.title,
+                            pr_number=pr_number,
+                        )
+                        self._save_ledger(spec, ledger)
                         self._report(
                             spec_run_idx,
                             total_specs_in_run,
@@ -1433,6 +529,8 @@ def run(
                         total_chunks_failed += 1
                         spec_chunks_failed += 1
                         all_chunks_ok = False
+                        self._ledger_set(ledger, chunk.id, status="failed", title=chunk.title, reason="no-diff")
+                        self._save_ledger(spec, ledger)
                         self._report(
                             spec_run_idx,
                             total_specs_in_run,
@@ -1455,6 +553,14 @@ def run(
                     total_chunks_failed += 1
                     spec_chunks_failed += 1
                     all_chunks_ok = False
+                    self._ledger_set(
+                        ledger,
+                        chunk.id,
+                        status="failed",
+                        title=chunk.title,
+                        reason=(result.message or "engine-failure")[:200],
+                    )
+                    self._save_ledger(spec, ledger)
                     self._report(
                         spec_run_idx,
                         total_specs_in_run,
@@ -1497,3 +603,7 @@ def run(
             elapsed_s=elapsed,
             cycles_completed=1,
         )
+
+
+# Backward-compatibility alias — external imports of V2Orchestrator still work.
+V2Orchestrator = Orchestrator
diff --git a/src/codelicious/prompts.py b/src/codelicious/prompts.py
index 5bc832b9..8c6879e3 100644
--- a/src/codelicious/prompts.py
+++ b/src/codelicious/prompts.py
@@ -3,6 +3,15 @@
 All prompt text used in agent mode lives in this module. No other module
 contains agent prompt strings. This separation makes prompts auditable,
 testable, and modifiable without touching orchestration logic.
+
+Live templates (per spec 27 §6.2 and spec v29 Steps 2 and 17):
+
+* ``CHUNK_EXECUTE`` — per-chunk execution prompt for the chunk-based loop.
+* ``CHUNK_VERIFY`` — per-chunk verification prompt.
+* ``CHUNK_FIX`` — per-chunk fix prompt invoked when verification fails.
+
+Multi-phase templates (SCAFFOLD, ANALYZE, REFLECT, AGENT_BUILD_SPEC, etc.)
+called out in spec 27 §6.2 have been removed.
 """
 
 from __future__ import annotations
@@ -11,7 +20,6 @@
 import re
 
 __all__ = [
-    "AGENT_BUILD_SPEC",
     "CHUNK_EXECUTE",
     "CHUNK_FIX",
     "CHUNK_VERIFY",
@@ -22,83 +30,6 @@
     "scan_remaining_tasks_for_spec",
 ]
 
-# ---------------------------------------------------------------------------
-# Active prompts — Claude Code gets full autonomy
-# ---------------------------------------------------------------------------
-
-AGENT_BUILD_SPEC: str = """\
-You are codelicious, an autonomous build agent for {{project_name}}.
-
-## Your mission
-
-Build ALL incomplete tasks from your assigned spec file. You handle
-understanding, implementation, and testing. The orchestrator handles
-all git operations (branching, committing, pushing, PRs) — you MUST NOT.
-
-## Your assigned spec file
-
-{{spec_filter}}
-
-**IMPORTANT:** Only build tasks from the spec file listed above. Do NOT
-look at or build tasks from other spec files. If the spec_filter above
-is empty, find the first incomplete spec file in the repo and build from
-that one only.
-
-## CRITICAL: Do NOT run git or gh commands
-
-The codelicious orchestrator manages all git and GitHub operations
-deterministically. You MUST NOT run any of these commands:
-- git checkout, git branch, git add, git commit, git push
-- gh pr create, gh pr view, gh pr edit
-- Any other git or gh CLI commands
-
-If you run git or gh commands, you will create duplicate branches and
-PRs. The orchestrator will commit and push your work automatically.
-
-### Step 1: Understand the project
-
-- Read your assigned spec file first.
-- Read CLAUDE.md, README, and the project manifest (package.json,
-  pyproject.toml, Cargo.toml, go.mod, etc.). Learn the tech stack.
-- Figure out how to run tests and lint for THIS project.
-- If this is your first time in this repo, write what you learned to
-  CLAUDE.md so future runs are faster.
-
-### Step 2: Find tasks in your assigned spec
-
-- Look through your assigned spec file for unchecked `- [ ]` items.
-- **If ALL tasks in your spec are `- [x]` (nothing left to do):**
-  1. Update .codelicious/STATE.md to reflect completion.
-  2. Write "DONE" to `.codelicious/BUILD_COMPLETE` and stop.
-
-### Step 3: Build each task
-
-For each unchecked `- [ ]` item in your assigned spec, in order:
-
-1. Read existing code before modifying. Match existing patterns.
-2. Implement the task completely.
-3. Run tests and lint. Fix ALL failures:
-   - Run the test suite
-   - If failures, read errors carefully, fix the root cause
-   - Run tests again
-   - Repeat until green (up to 3 attempts)
-4. Mark the task done: change `- [ ]` to `- [x]` in the spec file.
-5. Move on to the next unchecked `- [ ]` item.
-
-### Step 4: When all tasks are done
-
-- Update .codelicious/STATE.md with current status.
-- Write "DONE" to `.codelicious/BUILD_COMPLETE`
-
-## Rules
-
-- **Build ALL unchecked tasks** in your assigned spec before stopping.
-- Every change MUST pass tests. No broken code.
-- Keep docs (README, CLAUDE.md) current if your changes affect them.
-- Do NOT run git or gh commands. The orchestrator handles all git ops.
-"""
-
-
 # ---------------------------------------------------------------------------
 # spec-27 Phase 6.2: Chunk-focused prompt templates
 # ---------------------------------------------------------------------------
diff --git a/src/codelicious/sandbox.py b/src/codelicious/sandbox.py
index b40752ce..7826c877 100644
--- a/src/codelicious/sandbox.py
+++ b/src/codelicious/sandbox.py
@@ -108,6 +108,16 @@ def __init__(
         # Merge extra extensions from CODELICIOUS_EXTRA_EXTENSIONS env var
         self._allowed_extensions: frozenset[str] = self._build_allowed_extensions()
 
+        # spec v29 Step 15: explicit warning when O_NOFOLLOW is unavailable
+        # (Windows). The post-write symlink re-check is silently skipped on
+        # such platforms, downgrading TOCTOU protection to best-effort.
+        if not hasattr(os, "O_NOFOLLOW"):
+            logger.warning(
+                "Sandbox: os.O_NOFOLLOW unavailable on this platform — TOCTOU "
+                "protection on atomic writes is best-effort. Run codelicious "
+                "on a POSIX system for full guarantees."
+            )
+
     @staticmethod
     def _build_allowed_extensions() -> frozenset[str]:
         """Merge CODELICIOUS_EXTRA_EXTENSIONS into the base allowlist."""
diff --git a/src/codelicious/scaffolder.py b/src/codelicious/scaffolder.py
index fb8b89cf..70fdd9b4 100644
--- a/src/codelicious/scaffolder.py
+++ b/src/codelicious/scaffolder.py
@@ -20,6 +20,29 @@
 _SENTINEL_START: str = "<!-- codelicious:start -->"
 _SENTINEL_END: str = "<!-- codelicious:end -->"
 
+# spec v29 Step 18: single source of truth for the forbidden-pattern blurb
+# scaffolded into CLAUDE.md, the verify-all skill, and the security rule
+# file. Three string-literal copies previously drifted independently; any
+# future change must update this tuple, which makes the intent auditable.
+_FORBIDDEN_PATTERNS_DOC: tuple[str, ...] = (
+    "eval()",
+    "exec()",
+    "shell=True",
+    "os.system()",
+    "subprocess.call(..., shell=True)",
+)
+
+
+def _forbidden_patterns_bullets() -> str:
+    """Render ``_FORBIDDEN_PATTERNS_DOC`` as a markdown bullet list."""
+    return "\n".join(f"- {pattern}" for pattern in _FORBIDDEN_PATTERNS_DOC)
+
+
+def _forbidden_patterns_inline() -> str:
+    """Render ``_FORBIDDEN_PATTERNS_DOC`` as a comma-joined inline phrase."""
+    return ", ".join(_FORBIDDEN_PATTERNS_DOC)
+
+
 _MANAGED_BLOCK: str = f"""{_SENTINEL_START}
 
 # codelicious
@@ -145,10 +168,11 @@ def scaffold(project_root: pathlib.Path, dry_run: bool = False) -> None:
 
 QUALITY:
 - Every new function needs tests.
-- No hardcoded secrets, no eval(), no shell=True.
+- No hardcoded secrets. Forbidden patterns: __FORBIDDEN_INLINE__.
 - Handle errors explicitly — no bare except.
 - Follow the project's existing patterns exactly.
 """
+_AGENT_BUILDER = _AGENT_BUILDER.replace("__FORBIDDEN_INLINE__", _forbidden_patterns_inline())
 
 _AGENT_TESTER = """\
 ---
@@ -288,13 +312,14 @@ def scaffold(project_root: pathlib.Path, dry_run: bool = False) -> None:
 2. **Lint**: Run the linter. Fix all violations.
 3. **Format**: Run the formatter. Fix any formatting issues.
 4. **Security**: Grep for common security anti-patterns:
-   - eval(, exec(, shell=True, subprocess.call.*shell
+   - __FORBIDDEN_INLINE__
    - Hardcoded secrets: password\\s*=\\s*["'], api_key\\s*=\\s*["']
    - SQL injection: f"SELECT, f"INSERT, f"UPDATE, f"DELETE
 5. **State**: Update .codelicious/STATE.md with current test count and status.
 
 Report a summary of each check: pass/fail, issues found, fixes applied.
 """
+_SKILL_VERIFY_ALL = _SKILL_VERIFY_ALL.replace("__FORBIDDEN_INLINE__", _forbidden_patterns_inline())
 
 _SKILL_UPDATE_STATE = """\
 ---
@@ -332,9 +357,7 @@ def scaffold(project_root: pathlib.Path, dry_run: bool = False) -> None:
 When writing or modifying code, always follow these security rules:
 
 ## Never Use
-- eval() or exec() with user-controlled input
-- shell=True in subprocess calls
-- os.system() — use subprocess with shell=False
+__FORBIDDEN_BULLETS__
 - String formatting in SQL queries — use parameterized queries
 - pickle.loads() on untrusted data
 - yaml.load() without SafeLoader
@@ -353,6 +376,7 @@ def scaffold(project_root: pathlib.Path, dry_run: bool = False) -> None:
 - Never log secrets — use masking/redaction
 - Never commit .env files
 """
+_RULES_SECURITY = _RULES_SECURITY.replace("__FORBIDDEN_BULLETS__", _forbidden_patterns_bullets())
 
 
 def _detect_conventions(project_root: pathlib.Path) -> str:
diff --git a/src/codelicious/tools/audit_logger.py b/src/codelicious/tools/audit_logger.py
index a2d9d5ab..1a62a999 100644
--- a/src/codelicious/tools/audit_logger.py
+++ b/src/codelicious/tools/audit_logger.py
@@ -1,8 +1,10 @@
 from __future__ import annotations
 
+import contextlib
 import datetime
 import json
 import logging
+import os
 import sys
 import threading
 from enum import Enum
@@ -98,6 +100,18 @@ def __init__(self, repo_path: Path):
         # Lock that serialises all file writes so concurrent threads cannot
         # interleave entries (Finding 51).
         self._write_lock = threading.Lock()
+        # spec v30 Step 11: cross-process append guard. When two `codelicious`
+        # processes share an audit dir (e.g. CODELICIOUS_AUDIT_DIR pointed at a
+        # shared location) we need an OS-level lock so entries don't interleave
+        # across rotation boundaries. We lock a dedicated file because
+        # rotation may move the audit log itself.
+        self._lock_path = self.log_file.parent / ".audit.lock"
+        try:
+            self._lock_fd = os.open(str(self._lock_path), os.O_CREAT | os.O_RDWR, 0o600)
+        except OSError as exc:
+            console_logger.warning("AuditLogger: cannot open audit lock %s: %s", self._lock_path, exc)
+            self._lock_fd = None
+        self._cross_process_lock_warned: bool = False
         # Keep file handles open for the lifetime of the instance to avoid the
         # overhead of open/close on every tool call (Finding 18).
         # buffering=1 enables line-buffered mode so entries are flushed after
@@ -117,6 +131,54 @@ def __init__(self, repo_path: Path):
             console_logger.warning("AuditLogger: cannot open security log %s: %s", self.security_log_file, exc)
             self._security_fh = None
 
+    @contextlib.contextmanager
+    def _cross_process_lock(self):
+        """Acquire an exclusive OS lock on ``.audit.lock`` for the critical
+        section (spec v30 Step 11).
+
+        On Windows, ``fcntl`` is unavailable; falls back to ``msvcrt.locking``
+        when present, otherwise logs a one-shot WARNING and proceeds with
+        intra-process locking only.
+        """
+        if self._lock_fd is None:
+            yield
+            return
+
+        try:
+            import fcntl
+
+            fcntl.flock(self._lock_fd, fcntl.LOCK_EX)
+            try:
+                yield
+            finally:
+                try:
+                    fcntl.flock(self._lock_fd, fcntl.LOCK_UN)
+                except OSError:
+                    pass
+            return
+        except ImportError:
+            pass  # fall through to msvcrt branch
+
+        try:
+            import msvcrt
+
+            msvcrt.locking(self._lock_fd, msvcrt.LK_LOCK, 1)
+            try:
+                yield
+            finally:
+                try:
+                    # Rewind so the unlock targets the same byte we locked.
+                    os.lseek(self._lock_fd, 0, os.SEEK_SET)
+                    msvcrt.locking(self._lock_fd, msvcrt.LK_UNLCK, 1)
+                except OSError:
+                    pass
+            return
+        except ImportError:
+            if not self._cross_process_lock_warned:
+                console_logger.warning("AuditLogger: cross-process audit-log locking unavailable on this platform")
+                self._cross_process_lock_warned = True
+            yield
+
     def close(self) -> None:
         """Close the persistent file handles.
 
@@ -133,6 +195,12 @@ def close(self) -> None:
                 self._security_fh.close()
         except OSError:
             pass
+        try:
+            if getattr(self, "_lock_fd", None) is not None:
+                os.close(self._lock_fd)
+                self._lock_fd = None
+        except OSError:
+            pass
 
     def __del__(self) -> None:
         """Best-effort cleanup of file handles on garbage collection."""
@@ -155,8 +223,12 @@ def set_current_tool(self, tool_name: str) -> None:
     def _write_to_file(self, level: str, tag: str, message: str):
         timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
         try:
-            with self._write_lock:
+            # spec v30 Step 11: hold both intra- and inter-process locks so a
+            # second codelicious process sharing the audit dir cannot interleave
+            # bytes mid-line.
+            with self._write_lock, self._cross_process_lock():
                 self._audit_fh.write(f"[{timestamp}] [{level}] [{tag}] {message}\n")
+                self._audit_fh.flush()
         except Exception as e:
             # Fallback if logging fails, at least print to stdout
             print(f"FATAL: Audit log write failed: {e}")
@@ -187,11 +259,13 @@ def _write_to_security_log(
         full_message = f"{message} ({context})"
         log_line = f"{timestamp} [SECURITY] {event.value}: {full_message}\n"
 
-        # Write to both logs under a single lock to keep entries atomic
+        # spec v30 Step 11: dual-write under both intra- and inter-process locks.
         try:
-            with self._write_lock:
+            with self._write_lock, self._cross_process_lock():
                 self._audit_fh.write(log_line)
+                self._audit_fh.flush()
                 self._security_fh.write(log_line)
+                self._security_fh.flush()
         except Exception as e:
             print(f"FATAL: Security log write failed: {e}")
 
diff --git a/src/codelicious/verifier.py b/src/codelicious/verifier.py
index 03b481e1..5143060d 100644
--- a/src/codelicious/verifier.py
+++ b/src/codelicious/verifier.py
@@ -1170,6 +1170,199 @@ def check_custom_command(
     )
 
 
+# spec v30 Step 8: project-wide coverage floor used by ``verify`` /
+# ``verify_paths`` when no explicit value is supplied. CLI flag overrides
+# pyproject value; pyproject value overrides this default.
+_DEFAULT_MIN_COVERAGE: float = 90.0
+
+
+def _read_min_coverage_from_pyproject(repo: pathlib.Path) -> float | None:
+    """Return ``[tool.codelicious].min_coverage`` from pyproject.toml or None."""
+    pyproject = repo / "pyproject.toml"
+    if not pyproject.is_file():
+        return None
+    try:
+        import tomllib
+    except ImportError:  # Python <3.11 — codelicious requires 3.10+, may lack tomllib
+        try:
+            import tomli as tomllib  # type: ignore[no-redef]
+        except ImportError:
+            return None
+    try:
+        with open(pyproject, "rb") as fh:
+            data = tomllib.load(fh)
+    except (OSError, ValueError):
+        return None
+    section = data.get("tool", {}).get("codelicious", {})
+    val = section.get("min_coverage")
+    if isinstance(val, (int, float)) and val >= 0:
+        return float(val)
+    return None
+
+
+def resolve_min_coverage(
+    repo: pathlib.Path,
+    *,
+    cli_override: float | None = None,
+) -> float:
+    """Resolve the active coverage floor (CLI > pyproject > default)."""
+    if cli_override is not None and cli_override >= 0:
+        return float(cli_override)
+    pyproject_val = _read_min_coverage_from_pyproject(repo)
+    if pyproject_val is not None:
+        return pyproject_val
+    return _DEFAULT_MIN_COVERAGE
+
+
+def _enforce_coverage_floor(result: VerificationResult, min_coverage: float) -> None:
+    """Demote success → failure when measured coverage is below ``min_coverage``.
+
+    Looks for a ``coverage`` :class:`CheckResult` in ``result.checks``; if its
+    message embeds a percentage, compare against the floor. If below, mark the
+    check failed and append the explanatory message.
+    """
+    if min_coverage <= 0:
+        return
+    for check in result.checks:
+        if check.name != "coverage":
+            continue
+        # Combine the short message with details so we catch the percentage no
+        # matter where ``check_coverage`` happened to put it.
+        haystack = " ".join((check.message or "", check.details or ""))
+        match = re.search(r"(\d+(?:\.\d+)?)\s*%", haystack)
+        if not match:
+            return
+        pct = float(match.group(1))
+        if pct < min_coverage:
+            check.passed = False
+            floor_msg = f"coverage {pct:.1f}% below floor {min_coverage:.1f}%"
+            check.message = (check.message + " | " + floor_msg) if check.message else floor_msg
+
+
+def _map_src_to_tests(repo: pathlib.Path, paths: list[pathlib.Path]) -> list[pathlib.Path]:
+    """Map ``src/<module>.py`` paths to ``tests/test_<module>.py`` when present.
+
+    Returns the list of resolved test paths that exist on disk. Used by
+    ``verify_paths`` to scope pytest invocations to the modules a chunk
+    actually touched.
+    """
+    tests_dir = repo / "tests"
+    found: list[pathlib.Path] = []
+    seen: set[pathlib.Path] = set()
+    for raw in paths:
+        p = raw if isinstance(raw, pathlib.Path) else pathlib.Path(str(raw))
+        if p.suffix != ".py":
+            continue
+        candidate = tests_dir / f"test_{p.stem}.py"
+        if candidate.is_file() and candidate not in seen:
+            found.append(candidate)
+            seen.add(candidate)
+    return found
+
+
+def verify_paths(
+    repo: pathlib.Path,
+    paths: list[pathlib.Path],
+    *,
+    full_verify_kwargs: dict | None = None,
+    min_coverage: float | None = None,
+) -> VerificationResult:
+    """Run lint/security/test checks scoped to ``paths`` (spec v29 Step 6).
+
+    * ``ruff check <paths>`` and ``bandit <paths>`` are invoked with the
+      caller-supplied paths so we don't re-scan an entire repo per chunk.
+    * Tests are scoped via ``pytest <inferred-test-paths>`` derived by
+      mapping each ``src/<module>.py`` to ``tests/test_<module>.py`` when
+      that file exists.
+    * When ``paths`` is empty, or no test mapping can be inferred, the
+      function falls back to the project-wide :func:`verify`.
+    """
+    if not paths:
+        kwargs = dict(full_verify_kwargs or {})
+        return verify(repo, **kwargs)
+
+    py_paths = [pathlib.Path(p) for p in paths if str(p).endswith(".py")]
+    test_paths = _map_src_to_tests(repo, py_paths)
+
+    if not test_paths:
+        # Without a test mapping the chunk-scope test run is meaningless;
+        # fall back to the full verifier so we still catch regressions.
+        kwargs = dict(full_verify_kwargs or {})
+        return verify(repo, **kwargs)
+
+    checks: list[CheckResult] = []
+
+    # ── ruff (python only) ────────────────────────────────────────────
+    if shutil.which("ruff"):
+        ruff_targets = [str(p) for p in py_paths if (repo / p).exists() or p.is_absolute()]
+        if ruff_targets:
+            try:
+                proc = _run_with_pgroup_kill(
+                    ["ruff", "check", *ruff_targets],
+                    cwd=str(repo),
+                    capture_output=True,
+                    text=True,
+                    timeout=_LINT_TIMEOUT_S,
+                )
+                checks.append(
+                    CheckResult(
+                        name="lint",
+                        passed=proc.returncode == 0,
+                        message="ruff clean" if proc.returncode == 0 else _truncate(proc.stdout or proc.stderr),
+                    )
+                )
+            except subprocess.TimeoutExpired:
+                checks.append(CheckResult(name="lint", passed=False, message="ruff timed out"))
+
+    # ── bandit (python only) ─────────────────────────────────────────
+    if shutil.which("bandit"):
+        bandit_targets = [str(p) for p in py_paths]
+        if bandit_targets:
+            try:
+                proc = _run_with_pgroup_kill(
+                    ["bandit", "-q", *bandit_targets],
+                    cwd=str(repo),
+                    capture_output=True,
+                    text=True,
+                    timeout=_LINT_TIMEOUT_S,
+                )
+                # bandit returns 1 when issues are found; treat any non-zero
+                # exit as a failure but pass through bandit's own message.
+                checks.append(
+                    CheckResult(
+                        name="security",
+                        passed=proc.returncode == 0,
+                        message="bandit clean" if proc.returncode == 0 else _truncate(proc.stdout or proc.stderr),
+                    )
+                )
+            except subprocess.TimeoutExpired:
+                checks.append(CheckResult(name="security", passed=False, message="bandit timed out"))
+
+    # ── pytest scoped to mapped tests ────────────────────────────────
+    try:
+        proc = _run_with_pgroup_kill(
+            ["python", "-m", "pytest", "-q", "--no-cov", *[str(p) for p in test_paths]],
+            cwd=str(repo),
+            capture_output=True,
+            text=True,
+            timeout=_TEST_TIMEOUT_S,
+        )
+        checks.append(
+            CheckResult(
+                name="tests",
+                passed=proc.returncode == 0,
+                message="tests passed" if proc.returncode == 0 else _truncate(proc.stdout or proc.stderr),
+            )
+        )
+    except subprocess.TimeoutExpired:
+        checks.append(CheckResult(name="tests", passed=False, message="pytest timed out"))
+
+    result = VerificationResult(checks=checks)
+    floor = resolve_min_coverage(repo, cli_override=min_coverage)
+    _enforce_coverage_floor(result, floor)
+    return result
+
+
 def verify(
     project_dir: pathlib.Path,
     timeout: int = 120,
@@ -1281,12 +1474,21 @@ def verify(
             check.message,
         )
 
-    # Log summary
-    passed_count = sum(1 for c in checks if c.passed)
-    failed_count = len(checks) - passed_count
+    result = VerificationResult(checks=checks)
+
+    # spec v30 Step 8: enforce the resolved coverage floor regardless of how
+    # the underlying ``check_coverage`` reported it. ``coverage_threshold==0``
+    # means coverage was not measured this run; skip the gate.
+    if coverage_threshold > 0:
+        floor = resolve_min_coverage(project_dir, cli_override=float(coverage_threshold))
+        _enforce_coverage_floor(result, floor)
+
+    # Log summary (after floor enforcement so counts reflect any demotion)
+    passed_count = sum(1 for c in result.checks if c.passed)
+    failed_count = len(result.checks) - passed_count
     logger.info("Verification complete: %d passed, %d failed", passed_count, failed_count)
 
-    return VerificationResult(checks=checks)
+    return result
 
 
 def _pytest_cov_available() -> bool:
diff --git a/tests/test_agent_runner.py b/tests/test_agent_runner.py
index 9fc3458f..28435c20 100644
--- a/tests/test_agent_runner.py
+++ b/tests/test_agent_runner.py
@@ -11,6 +11,9 @@
 import pytest
 
 from codelicious.agent_runner import (
+    _CLAUDE_RETRY_AFTER_DEFAULT_S,
+    _CLAUDE_RETRY_AFTER_MAX_S,
+    _CLAUDE_RETRY_AFTER_MIN_S,
     _MAX_PROMPT_LENGTH,
     _POLL_INTERVAL_S,
     FORBIDDEN_CLI_FLAGS,
@@ -19,6 +22,7 @@
     _check_agent_errors,
     _enforce_timeout,
     _parse_agent_output,
+    _parse_claude_reset_seconds,
     _process_stream_event,
     _sanitize_prompt,
     _validate_command_flags,
@@ -263,6 +267,53 @@ def test_sanitized_prompt_passed_to_subprocess(
         assert actual_prompt == "-- --dangerous-flag"
 
 
+class TestSpec27Flags:
+    """spec v29 Step 9: chunk-level config can drive --allowedTools / --output-format."""
+
+    def test_allowed_tools_list_renders_csv(self, tmp_path: pathlib.Path) -> None:
+        import types
+
+        config = types.SimpleNamespace(
+            model="",
+            effort="",
+            max_turns=0,
+            allowed_tools=["Edit", "Write", "Bash(pytest:*)"],
+        )
+        cmd = _build_agent_command("test", tmp_path, config, "claude")
+        assert "--allowedTools" in cmd
+        idx = cmd.index("--allowedTools")
+        assert cmd[idx + 1] == "Edit,Write,Bash(pytest:*)"
+
+    def test_output_format_default_is_stream_json(self, tmp_path: pathlib.Path) -> None:
+        import types
+
+        config = types.SimpleNamespace(model="", effort="", max_turns=0)
+        cmd = _build_agent_command("test", tmp_path, config, "claude")
+        idx = cmd.index("--output-format")
+        assert cmd[idx + 1] == "stream-json"
+
+    def test_output_format_override(self, tmp_path: pathlib.Path) -> None:
+        import types
+
+        config = types.SimpleNamespace(model="", effort="", max_turns=0, output_format="text")
+        cmd = _build_agent_command("test", tmp_path, config, "claude")
+        idx = cmd.index("--output-format")
+        assert cmd[idx + 1] == "text"
+
+    def test_default_chunk_config_has_spec27_allow_list(self) -> None:
+        """ClaudeCodeEngine._DEFAULT_ALLOWED_TOOLS matches spec 27 §3.2 verbatim."""
+        from codelicious.engines.claude_engine import _DEFAULT_ALLOWED_TOOLS
+
+        assert "Edit" in _DEFAULT_ALLOWED_TOOLS
+        assert "Write" in _DEFAULT_ALLOWED_TOOLS
+        assert "Bash(git status:*)" in _DEFAULT_ALLOWED_TOOLS
+        assert "Bash(pytest:*)" in _DEFAULT_ALLOWED_TOOLS
+        assert "Bash(ruff:*)" in _DEFAULT_ALLOWED_TOOLS
+        assert "Read" in _DEFAULT_ALLOWED_TOOLS
+        assert "Glob" in _DEFAULT_ALLOWED_TOOLS
+        assert "Grep" in _DEFAULT_ALLOWED_TOOLS
+
+
 class TestDangerousFlagNeverPresent:
     """Tests for S20-P1-3: --dangerously-skip-permissions is permanently removed."""
 
@@ -409,6 +460,50 @@ def test_returncode_zero_never_raises(self) -> None:
         assert _check_agent_errors(0, [], ["auth failed somehow\n"]) is None
 
 
+class TestParseClaudeResetSeconds:
+    """Spec v29 Step 5: parse Claude rate-limit reset windows from CLI output."""
+
+    def test_no_match_returns_none(self) -> None:
+        assert _parse_claude_reset_seconds("rate limit exceeded") is None
+
+    def test_empty_returns_none(self) -> None:
+        assert _parse_claude_reset_seconds("") is None
+
+    def test_resets_in_seconds(self) -> None:
+        assert _parse_claude_reset_seconds("Limit resets in 30 seconds") == 30.0
+
+    def test_resets_in_short_form(self) -> None:
+        assert _parse_claude_reset_seconds("resets in 45s") == 45.0
+
+    def test_try_again_in_minutes(self) -> None:
+        assert _parse_claude_reset_seconds("Please try again in 5 minutes.") == 300.0
+
+    def test_try_again_short_form(self) -> None:
+        assert _parse_claude_reset_seconds("try again in 2m") == 120.0
+
+    def test_retry_after_header(self) -> None:
+        assert _parse_claude_reset_seconds("Retry-After: 90") == 90.0
+
+    def test_value_below_floor_clamped(self) -> None:
+        # 5s is below the 10s floor; should clamp up.
+        assert _parse_claude_reset_seconds("resets in 5 seconds") == _CLAUDE_RETRY_AFTER_MIN_S
+
+    def test_value_above_cap_clamped(self) -> None:
+        assert _parse_claude_reset_seconds("try again in 999 minutes") == _CLAUDE_RETRY_AFTER_MAX_S
+
+    def test_check_agent_errors_uses_parsed_value(self) -> None:
+        """The rate-limit branch should pick up the parsed reset window."""
+        with pytest.raises(ClaudeRateLimitError) as exc_info:
+            _check_agent_errors(1, [], ["rate limit hit; resets in 30 seconds\n"])
+        assert exc_info.value.retry_after_s == 30.0
+
+    def test_check_agent_errors_falls_back_to_default(self) -> None:
+        """When no reset window is parseable, default retry_after_s applies."""
+        with pytest.raises(ClaudeRateLimitError) as exc_info:
+            _check_agent_errors(1, [], ["rate limit exceeded\n"])
+        assert exc_info.value.retry_after_s == _CLAUDE_RETRY_AFTER_DEFAULT_S
+
+
 class TestParseAgentOutput:
     """Unit tests for _parse_agent_output (Finding 46)."""
 
diff --git a/tests/test_audit_logger.py b/tests/test_audit_logger.py
new file mode 100644
index 00000000..fc6c2d36
--- /dev/null
+++ b/tests/test_audit_logger.py
@@ -0,0 +1,202 @@
+"""Tests for ``codelicious.tools.audit_logger`` (spec v29 Step 12).
+
+Covers initialization, intent/outcome routing, security-event dual-write to
+``audit.log`` and ``security.log``, thread-safe appends under contention,
+context-manager lifecycle, and the SecurityEvent enum surface.
+"""
+
+from __future__ import annotations
+
+import threading
+from pathlib import Path
+
+import pytest
+
+from codelicious.tools.audit_logger import AuditLogger, SecurityEvent
+
+
+@pytest.fixture
+def audit(tmp_path: Path) -> AuditLogger:
+    logger = AuditLogger(tmp_path)
+    yield logger
+    logger.close()
+
+
+class TestAuditLoggerInit:
+    def test_creates_log_files(self, tmp_path: Path) -> None:
+        AuditLogger(tmp_path).close()
+        assert (tmp_path / ".codelicious" / "audit.log").exists()
+        assert (tmp_path / ".codelicious" / "security.log").exists()
+
+    def test_init_creates_codelicious_dir(self, tmp_path: Path) -> None:
+        target = tmp_path / "nested" / "repo"
+        target.mkdir(parents=True)
+        AuditLogger(target).close()
+        assert (target / ".codelicious").is_dir()
+
+
+class TestToolDispatchLogging:
+    def test_log_tool_intent_writes_dispatch_line(self, audit: AuditLogger, tmp_path: Path) -> None:
+        audit.log_tool_intent("read_file", {"path": "src/main.py"})
+        body = (tmp_path / ".codelicious" / "audit.log").read_text()
+        assert "TOOL_DISPATCH" in body
+        assert "read_file" in body
+        assert "src/main.py" in body
+
+    def test_log_tool_outcome_success(self, audit: AuditLogger, tmp_path: Path) -> None:
+        audit.log_tool_outcome("read_file", {"success": True, "stdout": "hello world"})
+        body = (tmp_path / ".codelicious" / "audit.log").read_text()
+        assert "TOOL_SUCCESS" in body
+        assert "INFO" in body
+
+    def test_log_tool_outcome_failure(self, audit: AuditLogger, tmp_path: Path) -> None:
+        audit.log_tool_outcome("write_file", {"success": False, "stderr": "permission denied"})
+        body = (tmp_path / ".codelicious" / "audit.log").read_text()
+        assert "TOOL_FAILED" in body
+        assert "ERROR" in body
+        assert "permission denied" in body
+
+
+class TestSecurityEventLogging:
+    def test_writes_to_both_audit_and_security(self, audit: AuditLogger, tmp_path: Path) -> None:
+        audit.log_security_event(SecurityEvent.PATH_TRAVERSAL_BLOCKED, "blocked ../etc/passwd")
+        audit_body = (tmp_path / ".codelicious" / "audit.log").read_text()
+        security_body = (tmp_path / ".codelicious" / "security.log").read_text()
+        assert "PATH_TRAVERSAL_BLOCKED" in audit_body
+        assert "PATH_TRAVERSAL_BLOCKED" in security_body
+        assert "blocked ../etc/passwd" in security_body
+
+    def test_iteration_and_tool_context_in_security_line(self, audit: AuditLogger, tmp_path: Path) -> None:
+        audit.set_iteration(7)
+        audit.set_current_tool("write_file")
+        audit.log_security_event(SecurityEvent.EXTENSION_BLOCKED, "blocked .exe")
+        body = (tmp_path / ".codelicious" / "security.log").read_text()
+        assert "iteration 7" in body
+        assert "write_file" in body
+
+    def test_explicit_overrides_take_precedence(self, audit: AuditLogger, tmp_path: Path) -> None:
+        audit.set_iteration(1)
+        audit.set_current_tool("read_file")
+        audit.log_security_event(
+            SecurityEvent.COMMAND_DENIED,
+            "denied rm -rf /",
+            iteration=99,
+            tool="run_command",
+        )
+        body = (tmp_path / ".codelicious" / "security.log").read_text()
+        assert "iteration 99" in body
+        assert "run_command" in body
+
+    def test_log_sandbox_violation_with_explicit_event_routes_to_security_log(
+        self, audit: AuditLogger, tmp_path: Path
+    ) -> None:
+        audit.log_sandbox_violation("denied symlink escape", event_type=SecurityEvent.SYMLINK_ESCAPE_BLOCKED)
+        body = (tmp_path / ".codelicious" / "security.log").read_text()
+        assert "SYMLINK_ESCAPE_BLOCKED" in body
+
+    def test_log_sandbox_violation_without_event_falls_back_to_audit(self, audit: AuditLogger, tmp_path: Path) -> None:
+        audit.log_sandbox_violation("legacy untyped violation")
+        audit_body = (tmp_path / ".codelicious" / "audit.log").read_text()
+        security_body = (tmp_path / ".codelicious" / "security.log").read_text()
+        assert "SANDBOX TRAP" in audit_body
+        # Untyped violations are not security-event-typed, so they don't
+        # land in security.log.
+        assert "legacy untyped violation" not in security_body
+
+
+class TestSecurityEventEnum:
+    def test_known_members_are_strings(self) -> None:
+        assert SecurityEvent.COMMAND_DENIED.value == "COMMAND_DENIED"
+        assert SecurityEvent.METACHAR_BLOCKED.value == "METACHAR_BLOCKED"
+
+    def test_str_inheritance(self) -> None:
+        # SecurityEvent inherits from str → comparable to plain strings.
+        assert SecurityEvent.PATH_TRAVERSAL_BLOCKED == "PATH_TRAVERSAL_BLOCKED"
+
+
+class TestThreadSafeAppend:
+    def test_concurrent_appends_preserve_all_lines(self, tmp_path: Path) -> None:
+        """10 threads × 100 entries each produce 1000 distinct lines, none truncated."""
+        logger = AuditLogger(tmp_path)
+        try:
+            entries_per_thread = 100
+            thread_count = 10
+            barrier = threading.Barrier(thread_count)
+
+            def writer(worker_id: int) -> None:
+                barrier.wait()
+                for i in range(entries_per_thread):
+                    logger.log_tool_intent("noop", {"worker": worker_id, "i": i})
+
+            threads = [threading.Thread(target=writer, args=(w,)) for w in range(thread_count)]
+            for t in threads:
+                t.start()
+            for t in threads:
+                t.join(timeout=15)
+        finally:
+            logger.close()
+
+        lines = (tmp_path / ".codelicious" / "audit.log").read_text().splitlines()
+        assert len(lines) == thread_count * entries_per_thread
+        # No line should be truncated mid-write — every line ends with '...'
+        # because log_tool_intent appends ellipsis-free Intent: prefix; verify
+        # presence of the dispatch tag instead.
+        for line in lines:
+            assert "TOOL_DISPATCH" in line
+
+
+def _mp_writer(repo_path_str: str, worker_id: int, count: int) -> None:  # pragma: no cover
+    """Multiprocessing target — must be module-level for pickling."""
+    from codelicious.tools.audit_logger import AuditLogger
+
+    logger = AuditLogger(Path(repo_path_str))
+    try:
+        for i in range(count):
+            logger.log_tool_intent("noop", {"worker": worker_id, "i": i})
+    finally:
+        logger.close()
+
+
+class TestCrossProcessAppend:
+    """spec v30 Step 11: ``fcntl.flock`` keeps audit lines from interleaving across processes."""
+
+    def test_concurrent_processes_do_not_interleave(self, tmp_path: Path) -> None:
+        import multiprocessing
+        import sys as _sys
+
+        if _sys.platform == "win32":
+            import pytest as _pytest
+
+            _pytest.skip("POSIX-only test; Windows uses msvcrt fallback or no locking")
+
+        workers = 4
+        per_worker = 50
+        # Use spawn to keep the child interpreter clean of pytest state.
+        ctx = multiprocessing.get_context("spawn")
+        procs = [ctx.Process(target=_mp_writer, args=(str(tmp_path), w, per_worker)) for w in range(workers)]
+        for p in procs:
+            p.start()
+        for p in procs:
+            p.join(timeout=20)
+            assert p.exitcode == 0, f"worker {p.pid} exited with {p.exitcode}"
+
+        lines = (tmp_path / ".codelicious" / "audit.log").read_text().splitlines()
+        assert len(lines) == workers * per_worker
+        for line in lines:
+            assert "TOOL_DISPATCH" in line, f"interleaved line: {line!r}"
+
+
+class TestLifecycle:
+    def test_context_manager_closes_handles(self, tmp_path: Path) -> None:
+        with AuditLogger(tmp_path) as logger:
+            logger.log_tool_intent("noop", {})
+            assert logger._audit_fh is not None
+            assert not logger._audit_fh.closed
+        assert logger._audit_fh.closed
+        assert logger._security_fh.closed
+
+    def test_double_close_is_safe(self, tmp_path: Path) -> None:
+        logger = AuditLogger(tmp_path)
+        logger.close()
+        # Should not raise.
+        logger.close()
diff --git a/tests/test_chunker.py b/tests/test_chunker.py
index 094b1c1e..f6c1c559 100644
--- a/tests/test_chunker.py
+++ b/tests/test_chunker.py
@@ -329,3 +329,157 @@ def test_markdown_code_fence_stripped(self, tmp_path: pathlib.Path) -> None:
         chunks = chunk_spec_with_llm(spec, tmp_path, llm)
         assert len(chunks) == 1
         assert chunks[0].title == "Task"
+
+    def test_too_many_chunks_truncated_with_warning(
+        self, tmp_path: pathlib.Path, caplog: pytest.LogCaptureFixture
+    ) -> None:
+        """LLM producing > _MAX_CHUNKS_PER_SPEC is truncated to the cap with a WARNING (spec v29 Step 7)."""
+        from codelicious.chunker import _MAX_CHUNKS_PER_SPEC, chunk_spec_with_llm
+
+        spec = self._write_spec(tmp_path, "# Spec\n\nMassive.\n")
+        items = ",".join(
+            f'{{"title": "T{i}", "description": "d", "files": [], "depends_on_indices": [], "validation": ""}}'
+            for i in range(_MAX_CHUNKS_PER_SPEC + 1)
+        )
+        llm = self._mock_llm(f"[{items}]")
+
+        with caplog.at_level("WARNING", logger="codelicious.chunker"):
+            chunks = chunk_spec_with_llm(spec, tmp_path, llm)
+        assert len(chunks) == _MAX_CHUNKS_PER_SPEC
+        assert any("truncating" in r.message.lower() for r in caplog.records)
+
+    def test_non_array_json_falls_back(self, tmp_path: pathlib.Path) -> None:
+        """JSON object (not array) at top level falls back to deterministic."""
+        spec = self._write_spec(tmp_path, "# Spec\n\n## P1\n\n- [ ] Single task\n")
+        llm = self._mock_llm('{"title": "wrong shape"}')
+
+        from codelicious.chunker import chunk_spec_with_llm
+
+        chunks = chunk_spec_with_llm(spec, tmp_path, llm)
+        assert len(chunks) == 1
+        assert "Single task" in chunks[0].title
+
+    def test_short_spec_uses_single_window(self, tmp_path: pathlib.Path) -> None:
+        """Short specs make exactly one LLM call (spec v29 Step 7)."""
+        spec = self._write_spec(tmp_path, "# Short\n\n## P1\n\n- [ ] X\n")
+        llm = self._mock_llm(
+            '[{"title": "X", "description": "d", "files": [], "depends_on_indices": [], "validation": ""}]'
+        )
+
+        from codelicious.chunker import chunk_spec_with_llm
+
+        chunk_spec_with_llm(spec, tmp_path, llm)
+        assert llm.chat_completion.call_count == 1
+
+    def test_oversized_spec_makes_multiple_windowed_calls(
+        self, tmp_path: pathlib.Path, caplog: pytest.LogCaptureFixture
+    ) -> None:
+        """A spec well over _LLM_WINDOW_SIZE drives multiple LLM calls (spec v29 Step 7)."""
+        from codelicious.chunker import _LLM_WINDOW_SIZE, chunk_spec_with_llm
+
+        big = "x " * (_LLM_WINDOW_SIZE * 3)
+        spec = self._write_spec(tmp_path, big)
+        llm = self._mock_llm(
+            '[{"title": "T", "description": "d", "files": [], "depends_on_indices": [], "validation": ""}]'
+        )
+
+        with caplog.at_level("WARNING", logger="codelicious.chunker"):
+            chunks = chunk_spec_with_llm(spec, tmp_path, llm)
+        assert llm.chat_completion.call_count >= 2
+        # Title is the same across windows; dedup keeps a single chunk.
+        assert len(chunks) == 1
+        assert any("splitting into" in r.message for r in caplog.records)
+
+    def test_split_spec_helper_caps_window_count(self) -> None:
+        """``_split_spec_for_llm`` never produces more than _LLM_MAX_WINDOWS windows."""
+        from codelicious.chunker import _LLM_MAX_WINDOWS, _LLM_WINDOW_SIZE, _split_spec_for_llm
+
+        huge = "y " * (_LLM_WINDOW_SIZE * 50)
+        windows = _split_spec_for_llm(huge)
+        assert len(windows) == _LLM_MAX_WINDOWS
+
+    def test_self_referential_dep_dropped(self, tmp_path: pathlib.Path) -> None:
+        """A chunk depending on itself is silently dropped (i != idx guard)."""
+        spec = self._write_spec(tmp_path, "# Spec\n\nWork.\n")
+        llm = self._mock_llm(
+            '[{"title": "Solo", "description": "d", "files": [], "depends_on_indices": [0], "validation": ""}]'
+        )
+
+        from codelicious.chunker import chunk_spec_with_llm
+
+        chunks = chunk_spec_with_llm(spec, tmp_path, llm)
+        assert len(chunks) == 1
+        assert chunks[0].depends_on == []
+
+
+# ─────────────────────────────────────────────────────────────────────────
+# spec v30 Step 6: token-budget-aware chunk sizing
+# ─────────────────────────────────────────────────────────────────────────
+
+
+class TestTokenBudget:
+    def _wc(self, files: list[str], chunk_id: str = "spec-1-chunk-01") -> WorkChunk:
+        from codelicious.chunker import WorkChunk
+
+        return WorkChunk(
+            id=chunk_id,
+            spec_path=pathlib.Path("docs/specs/01.md"),
+            title="t",
+            description="d",
+            depends_on=[],
+            estimated_files=files,
+            validation="",
+        )
+
+    def test_under_budget_unchanged(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.chunker import enforce_token_budget
+
+        wc = self._wc([])
+        out = enforce_token_budget([wc], tmp_path, engines=["claude"])
+        assert out == [wc]
+
+    def test_over_budget_chunk_is_split(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.chunker import enforce_token_budget
+
+        # 4 files, each 30k chars → 120k chars / 4 = 30k tokens. HF budget 24k.
+        files = []
+        for i in range(4):
+            p = tmp_path / f"big_{i}.py"
+            p.write_text("x" * 30_000)
+            files.append(p.name)
+        wc = self._wc(files)
+        out = enforce_token_budget([wc], tmp_path, engines=["huggingface"])
+        # Splits at least once → two or more chunks; second one depends on first.
+        assert len(out) >= 2
+        assert out[0].id == wc.id
+        assert wc.id in out[1].depends_on
+
+    def test_recursive_split_caps_at_depth(self, tmp_path: pathlib.Path, caplog: pytest.LogCaptureFixture) -> None:
+        """Even a runaway chunk produces at most 2**depth+1 sub-chunks."""
+        from codelicious.chunker import _MAX_CHUNK_SPLIT_DEPTH, enforce_token_budget
+
+        # One enormous file dominates the budget; can't split below 1 file.
+        big = tmp_path / "huge.py"
+        big.write_text("y" * 200_000)
+        wc = self._wc([big.name])
+        with caplog.at_level("WARNING", logger="codelicious.chunker"):
+            out = enforce_token_budget([wc], tmp_path, engines=["huggingface"])
+        # Single-file chunk can't subdivide further; original is dispatched anyway.
+        assert len(out) == 1
+        assert any("dispatch anyway" in r.message for r in caplog.records)
+        assert _MAX_CHUNK_SPLIT_DEPTH == 3
+
+    def test_split_preserves_total_file_coverage(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.chunker import enforce_token_budget
+
+        files = []
+        for i in range(4):
+            p = tmp_path / f"file_{i}.py"
+            p.write_text("z" * 30_000)
+            files.append(p.name)
+        wc = self._wc(files)
+        out = enforce_token_budget([wc], tmp_path, engines=["huggingface"])
+        covered: list[str] = []
+        for c in out:
+            covered.extend(c.estimated_files)
+        assert sorted(covered) == sorted(files)
diff --git a/tests/test_cli.py b/tests/test_cli.py
index 74f431eb..12e474cf 100644
--- a/tests/test_cli.py
+++ b/tests/test_cli.py
@@ -27,8 +27,8 @@
     main,
     setup_logger,
 )
-from codelicious.engines.base import BuildResult
 from codelicious.git.git_orchestrator import GitManager
+from codelicious.orchestrator import OrchestratorResult
 
 
 @pytest.fixture
@@ -46,12 +46,6 @@ def mock_successful_engine():
 
     engine = mock.MagicMock()
     engine.name = "mock-engine"
-    engine.run_build_cycle.return_value = BuildResult(
-        success=True,
-        message="Build completed successfully",
-        session_id="test-123",
-        elapsed_s=10.5,
-    )
     # v2 orchestrator chunk methods
     engine.execute_chunk.return_value = ChunkResult(success=True, files_modified=["src/foo.py"], message="done")
     engine.verify_chunk.return_value = ChunkResult(success=True, message="passed")
@@ -66,12 +60,6 @@ def mock_failed_engine():
 
     engine = mock.MagicMock()
     engine.name = "mock-engine"
-    engine.run_build_cycle.return_value = BuildResult(
-        success=False,
-        message="Build failed: test error",
-        session_id="test-456",
-        elapsed_s=5.0,
-    )
     # v2 orchestrator chunk methods
     engine.execute_chunk.return_value = ChunkResult(success=False, message="failed")
     engine.verify_chunk.return_value = ChunkResult(success=False, message="failed")
@@ -333,7 +321,7 @@ def _skip_dep_validation(self):
     def test_no_incomplete_specs_exits_zero_without_build(
         self, mock_repo: Path, mock_successful_engine, mock_git_manager
     ):
-        """When discover_incomplete_specs returns [], main() exits 0 without running engine.run_build_cycle."""
+        """When discover_incomplete_specs returns [], main() exits 0 without dispatching any chunk."""
         # Patch both walk_for_specs (for the banner) and discover_incomplete_specs (for the guard)
         # to return empty lists, simulating a fully-complete repo.
         with mock.patch("codelicious.cli.select_engine", return_value=mock_successful_engine):
@@ -346,7 +334,7 @@ def test_no_incomplete_specs_exits_zero_without_build(
                                     main()
 
         assert exc_info.value.code == 0
-        mock_successful_engine.run_build_cycle.assert_not_called()
+        mock_successful_engine.execute_chunk.assert_not_called()
 
 
 class TestPrintBanner:
@@ -416,7 +404,7 @@ class TestPrintResult:
 
     def test_print_result_success(self, tmp_path: Path):
         """_print_result prints BUILD COMPLETE for a successful result."""
-        result = BuildResult(success=True, message="Done.", session_id="s1", elapsed_s=5.0)
+        result = OrchestratorResult(success=True, message="Done.", elapsed_s=5.0)
 
         captured = io.StringIO()
         with mock.patch("codelicious.cli.walk_for_specs", return_value=[]):
@@ -435,7 +423,7 @@ def test_print_result_success(self, tmp_path: Path):
 
     def test_print_result_failure(self, tmp_path: Path):
         """_print_result prints BUILD FINISHED (with issues) for a failed result."""
-        result = BuildResult(success=False, message="Some error.", session_id="s2", elapsed_s=3.0)
+        result = OrchestratorResult(success=False, message="Some error.", elapsed_s=3.0)
 
         captured = io.StringIO()
         with mock.patch("codelicious.cli.walk_for_specs", return_value=[]), mock.patch("sys.stdout", captured):
@@ -453,7 +441,7 @@ def test_print_result_failure(self, tmp_path: Path):
 
     def test_print_result_elapsed_time_formatted(self, tmp_path: Path):
         """_print_result formats elapsed time in minutes and seconds for long runs."""
-        result = BuildResult(success=True, message="", session_id="s3", elapsed_s=90.0)
+        result = OrchestratorResult(success=True, message="", elapsed_s=90.0)
 
         captured = io.StringIO()
         with mock.patch("codelicious.cli.walk_for_specs", return_value=[]), mock.patch("sys.stdout", captured):
@@ -845,6 +833,15 @@ def test_max_commits_per_pr_default_8(self):
             opts = _parse_args(sys.argv)
         assert opts["max_commits_per_pr"] == 8
 
+    def test_max_loc_per_pr_default_matches_constant(self):
+        """max_loc_per_pr default tracks the _DEFAULT_PR_LOC_CAP constant (spec v29 Step 1)."""
+        from codelicious.cli import _DEFAULT_PR_COMMIT_CAP, _DEFAULT_PR_LOC_CAP
+
+        with mock.patch.object(sys, "argv", ["codelicious", "/tmp/repo"]):
+            opts = _parse_args(sys.argv)
+        assert opts["max_loc_per_pr"] == _DEFAULT_PR_LOC_CAP == 250
+        assert opts["max_commits_per_pr"] == _DEFAULT_PR_COMMIT_CAP == 8
+
     def test_max_commits_per_pr_over_100_exits(self):
         """--max-commits-per-pr > 100 exits with code 2."""
         with mock.patch.object(sys, "argv", ["codelicious", "/tmp/repo", "--max-commits-per-pr", "101"]):
@@ -970,9 +967,10 @@ def fake_run(args, **kw):
 
         assert info["transport"] == "ssh"
         assert info["ssh_key_loaded"] is True
+        assert info["ssh_agent_state"] == "keys_loaded"
 
     def test_ssh_repo_with_no_keys(self, tmp_path: Path) -> None:
-        """SSH remote with no agent keys → ssh_key_loaded False."""
+        """SSH remote with no agent keys → ssh_key_loaded False, state=empty."""
         cls = self.__class__
 
         def fake_run(args, **kw):
@@ -989,6 +987,46 @@ def fake_run(args, **kw):
 
         assert info["transport"] == "ssh"
         assert info["ssh_key_loaded"] is False
+        assert info["ssh_agent_state"] == "empty"
+
+    def test_ssh_repo_with_no_agent(self, tmp_path: Path) -> None:
+        """SSH remote, ssh-add exit 2 → ssh_agent_state=no_agent (spec v29 Step 14)."""
+        cls = self.__class__
+
+        def fake_run(args, **kw):
+            if args[:2] == ["git", "-C"] and "remote.origin.url" in args:
+                return cls._result(0, "git@github.com:o/r.git\n")
+            if args[:2] == ["git", "-C"] and "commit.gpgsign" in args:
+                return cls._result(1, "")
+            if args[0] == "ssh-add":
+                return cls._result(2, "", "Could not open a connection to your authentication agent.\n")
+            raise AssertionError(f"unexpected call {args}")
+
+        with mock.patch("subprocess.run", side_effect=fake_run):
+            info = _probe_git_credentials(tmp_path)
+
+        assert info["transport"] == "ssh"
+        assert info["ssh_key_loaded"] is False
+        assert info["ssh_agent_state"] == "no_agent"
+
+    def test_ssh_repo_probe_unknown_on_oserror(self, tmp_path: Path) -> None:
+        """OSError on ssh-add → ssh_agent_state=unknown, conservative not-loaded."""
+        cls = self.__class__
+
+        def fake_run(args, **kw):
+            if args[:2] == ["git", "-C"] and "remote.origin.url" in args:
+                return cls._result(0, "git@github.com:o/r.git\n")
+            if args[:2] == ["git", "-C"] and "commit.gpgsign" in args:
+                return cls._result(1, "")
+            if args[0] == "ssh-add":
+                raise OSError("ssh-add binary missing")
+            raise AssertionError(f"unexpected call {args}")
+
+        with mock.patch("subprocess.run", side_effect=fake_run):
+            info = _probe_git_credentials(tmp_path)
+
+        assert info["ssh_key_loaded"] is False
+        assert info["ssh_agent_state"] == "unknown"
 
     def test_gpgsign_true_with_warm_agent(self, tmp_path: Path) -> None:
         """commit.gpgsign=true + secret keys present → gpg_agent_warm True."""
@@ -1392,3 +1430,267 @@ def test_spec_override_builds_single_spec(self, mock_repo: Path, mock_git_manage
                         main()
 
         mock_successful_engine.execute_chunk.assert_called()
+
+
+# -- spec v30 Step 4: SIGTERM graceful shutdown -----------------------------
+
+
+class TestSigtermIntegration:
+    """spec v30 Step 4: SIGTERM raises SystemExit(143) and releases the run lock."""
+
+    def test_sigterm_handler_raises_system_exit_143(self) -> None:
+        """The handler exits with code 143 (128 + SIGTERM=15) per CLAUDE.md."""
+        from codelicious.cli import _handle_sigterm
+
+        with pytest.raises(SystemExit) as exc:
+            _handle_sigterm(15, None)
+        assert exc.value.code == 143
+
+    def test_subprocess_exits_within_grace_window(self, tmp_path: Path) -> None:
+        """A child process holding the run lock terminates cleanly on SIGTERM."""
+        import signal as _signal
+        import subprocess
+        import sys as _sys
+        import time as _time
+
+        if _sys.platform == "win32":
+            pytest.skip("POSIX-only test")
+
+        repo_root = Path(__file__).resolve().parent.parent
+        script = (
+            "import sys, time, signal;"
+            "sys.path.insert(0, 'src');"
+            "from codelicious.cli import _handle_sigterm, _run_lock;"
+            "import pathlib;"
+            "signal.signal(signal.SIGTERM, _handle_sigterm);"
+            f"cm = _run_lock(pathlib.Path({str(tmp_path)!r}));"
+            "cm.__enter__();"
+            "time.sleep(30)"
+        )
+        proc = subprocess.Popen(
+            [_sys.executable, "-c", script],
+            cwd=repo_root,
+            stdout=subprocess.PIPE,
+            stderr=subprocess.PIPE,
+        )
+
+        # Give the child time to acquire the lock.
+        for _ in range(40):
+            if (tmp_path / ".codelicious" / "run.lock").exists():
+                break
+            _time.sleep(0.05)
+
+        proc.send_signal(_signal.SIGTERM)
+        try:
+            _stdout, stderr = proc.communicate(timeout=8)
+        except subprocess.TimeoutExpired:
+            proc.kill()
+            _stdout, stderr = proc.communicate()
+            pytest.fail(f"Child did not exit within 8 s after SIGTERM. stderr={stderr.decode(errors='replace')[:500]}")
+
+        assert proc.returncode == 143, f"expected 143, got {proc.returncode}: {stderr.decode(errors='replace')[:500]}"
+        # The atexit hook in _run_lock should have removed the lockfile.
+        assert not (tmp_path / ".codelicious" / "run.lock").exists()
+
+
+# -- spec v30 Step 12: postmortem dump on abnormal exit ---------------------
+
+
+class TestPostmortem:
+    """spec v30 Step 12: ``_write_postmortem`` summarises run state on abort."""
+
+    def test_writes_markdown_with_counts_and_resume(self, tmp_path: Path) -> None:
+        import json
+
+        from codelicious.cli import _write_postmortem
+
+        state = tmp_path / ".codelicious" / "state"
+        state.mkdir(parents=True)
+        (state / "spec-27.json").write_text(
+            json.dumps(
+                {
+                    "chunks": {
+                        "spec-27-chunk-01": {"status": "merged", "title": "A"},
+                        "spec-27-chunk-02": {"status": "failed", "title": "B"},
+                        "spec-27-chunk-03": {"status": "failed", "title": "C"},
+                    }
+                }
+            )
+        )
+
+        log = tmp_path / "build.log"
+        log.write_text("\n".join(f"line-{i}" for i in range(75)), encoding="utf-8")
+
+        path = _write_postmortem(tmp_path, abort_reason="rate-limit", log_path=log, spec_arg=".")
+        assert path is not None
+        body = path.read_text()
+        assert "Abort reason:** rate-limit" in body
+        assert "merged: 1" in body
+        assert "failed: 2" in body
+        assert "Failed chunks" in body
+        assert "- B" in body
+        assert "Log tail" in body
+        assert "line-74" in body
+        # Resume hint is present and uses the no-reset form for rate-limit.
+        assert "codelicious build ." in body
+        assert "--reset-ledger" not in body
+
+    def test_handles_missing_state_dir(self, tmp_path: Path) -> None:
+        from codelicious.cli import _write_postmortem
+
+        path = _write_postmortem(tmp_path, abort_reason="exception", log_path=None)
+        assert path is not None
+        body = path.read_text()
+        # Empty state → all counts zero, but the file still writes cleanly.
+        assert "merged: 0" in body
+        assert "failed: 0" in body
+
+    def test_corrupt_ledger_suggests_reset(self, tmp_path: Path) -> None:
+        from codelicious.cli import _write_postmortem
+
+        path = _write_postmortem(tmp_path, abort_reason="ledger-corrupt", log_path=None)
+        body = path.read_text()
+        assert "--reset-ledger" in body
+
+
+# -- spec v30 Step 7: atomic latest.log symlink update ----------------------
+
+
+class TestAtomicSymlinkUpdate:
+    """spec v30 Step 7: ``_atomic_symlink_update`` swaps via tmp + os.replace."""
+
+    def test_creates_link_when_missing(self, tmp_path: Path) -> None:
+        from codelicious.cli import _atomic_symlink_update
+
+        target = tmp_path / "build-1.log"
+        target.write_text("hi")
+        link = tmp_path / "latest.log"
+        _atomic_symlink_update(link, target.name)
+        assert link.is_symlink()
+        assert link.resolve() == target.resolve()
+
+    def test_repeated_update_replaces_target(self, tmp_path: Path) -> None:
+        from codelicious.cli import _atomic_symlink_update
+
+        a = tmp_path / "build-1.log"
+        b = tmp_path / "build-2.log"
+        a.write_text("1")
+        b.write_text("2")
+        link = tmp_path / "latest.log"
+        _atomic_symlink_update(link, a.name)
+        _atomic_symlink_update(link, b.name)
+        assert link.is_symlink()
+        assert link.resolve() == b.resolve()
+
+    def test_no_dangling_tmp_after_update(self, tmp_path: Path) -> None:
+        from codelicious.cli import _atomic_symlink_update
+
+        target = tmp_path / "build-1.log"
+        target.write_text("hi")
+        link = tmp_path / "latest.log"
+        _atomic_symlink_update(link, target.name)
+        # No leftover *.tmp siblings.
+        leftovers = list(tmp_path.glob("latest.log.*.tmp"))
+        assert leftovers == []
+
+
+# -- spec v30 Step 1: per-repo run lockfile ---------------------------------
+
+
+class TestRunLock:
+    """spec v30 Step 1: ``_run_lock`` prevents concurrent codelicious runs on a repo."""
+
+    def test_acquires_and_releases(self, tmp_path: Path) -> None:
+        import os as _os
+
+        from codelicious.cli import _run_lock
+
+        with _run_lock(tmp_path) as lock_path:
+            assert lock_path is not None
+            assert lock_path.exists()
+            assert lock_path.read_text().strip() == str(_os.getpid())
+
+        # After exit the lock file is removed.
+        assert not (tmp_path / ".codelicious" / "run.lock").exists()
+
+    def test_second_acquire_in_subprocess_exits_75(self, tmp_path: Path) -> None:
+        """A second invocation while the first holds the lock exits 75 (EX_TEMPFAIL)."""
+        import subprocess
+        import sys
+
+        from codelicious.cli import _run_lock
+
+        # Hold the lock and try to grab it again from a child Python process.
+        with _run_lock(tmp_path):
+            child = subprocess.run(
+                [
+                    sys.executable,
+                    "-c",
+                    (
+                        "import sys; sys.path.insert(0, 'src'); "
+                        "from codelicious.cli import _run_lock; "
+                        f"cm = _run_lock(__import__('pathlib').Path({str(tmp_path)!r})); cm.__enter__()"
+                    ),
+                ],
+                capture_output=True,
+                text=True,
+                cwd=Path(__file__).resolve().parent.parent,
+                timeout=10,
+            )
+        assert child.returncode == 75
+        assert "another codelicious run is in progress" in child.stderr
+
+
+# -- spec v30 Step 3: --endpoint / LLM_ENDPOINT validation -------------------
+
+
+class TestEndpointValidation:
+    """spec v30 Step 3: insecure or credential-bearing endpoints rejected at CLI layer."""
+
+    def test_empty_url_accepted(self) -> None:
+        from codelicious.cli import _validate_endpoint_url_strict
+
+        # Should not raise / sys.exit
+        _validate_endpoint_url_strict("")
+
+    def test_https_url_accepted(self) -> None:
+        from codelicious.cli import _validate_endpoint_url_strict
+
+        _validate_endpoint_url_strict("https://api.example.com/v1/chat")
+
+    def test_http_url_rejected(self, capsys: pytest.CaptureFixture[str]) -> None:
+        from codelicious.cli import _validate_endpoint_url_strict
+
+        with pytest.raises(SystemExit) as exc:
+            _validate_endpoint_url_strict("http://internal-proxy/v1")
+        assert exc.value.code == 2
+        err = capsys.readouterr().err
+        assert "https://" in err
+        assert "http://internal-proxy/v1" in err
+
+    def test_user_info_rejected_and_masked(self, capsys: pytest.CaptureFixture[str]) -> None:
+        from codelicious.cli import _validate_endpoint_url_strict
+
+        with pytest.raises(SystemExit) as exc:
+            _validate_endpoint_url_strict("https://user:secret@host/v1")
+        assert exc.value.code == 2
+        err = capsys.readouterr().err
+        assert "***@host" in err
+        assert "secret" not in err
+
+    def test_empty_host_rejected(self, capsys: pytest.CaptureFixture[str]) -> None:
+        from codelicious.cli import _validate_endpoint_url_strict
+
+        with pytest.raises(SystemExit) as exc:
+            _validate_endpoint_url_strict("https:///path-only")
+        assert exc.value.code == 2
+        assert "no hostname" in capsys.readouterr().err
+
+    def test_malformed_url_rejected(self, capsys: pytest.CaptureFixture[str]) -> None:
+        from codelicious.cli import _validate_endpoint_url_strict
+
+        with pytest.raises(SystemExit) as exc:
+            # a scheme-less garbage string parses but has no scheme → http rejected branch
+            _validate_endpoint_url_strict("not a url at all")
+        assert exc.value.code == 2
+        assert "https://" in capsys.readouterr().err
diff --git a/tests/test_engine_base.py b/tests/test_engine_base.py
index eb84a528..9b5d0ddf 100644
--- a/tests/test_engine_base.py
+++ b/tests/test_engine_base.py
@@ -1,9 +1,9 @@
-"""Tests for the BuildEngine abstract base class and BuildResult dataclass.
+"""Tests for the BuildEngine abstract base class.
 
 Covers:
 - BuildEngine cannot be directly instantiated (abstract class enforcement)
-- Concrete subclasses must implement all abstract members
-- BuildResult field creation and default values
+- Concrete subclasses must implement all three abstract chunk methods
+- ChunkResult and EngineContext dataclass field creation and defaults
 - select_engine factory function
 """
 
@@ -14,53 +14,10 @@
 
 import pytest
 
-from codelicious.engines.base import BuildEngine, BuildResult, ChunkResult, EngineContext
+from codelicious.engines.base import BuildEngine, ChunkResult, EngineContext
 from codelicious.engines.claude_engine import ClaudeCodeEngine
 from codelicious.engines.huggingface_engine import HuggingFaceEngine
 
-# ---------------------------------------------------------------------------
-# BuildResult tests
-# ---------------------------------------------------------------------------
-
-
-class TestBuildResult:
-    """Tests for the BuildResult dataclass."""
-
-    def test_build_result_creation(self) -> None:
-        """BuildResult stores all provided field values correctly."""
-        result = BuildResult(
-            success=True,
-            message="All specs complete.",
-            session_id="abc-123",
-            elapsed_s=42.5,
-        )
-
-        assert result.success is True
-        assert result.message == "All specs complete."
-        assert result.session_id == "abc-123"
-        assert result.elapsed_s == 42.5
-
-    def test_build_result_defaults(self) -> None:
-        """BuildResult has correct default field values when only success is provided."""
-        result = BuildResult(success=False)
-
-        assert result.success is False
-        assert result.message == ""
-        assert result.session_id == ""
-        assert result.elapsed_s == 0.0
-
-    def test_build_result_success_true(self) -> None:
-        """BuildResult with success=True is truthy for the success field."""
-        result = BuildResult(success=True)
-        assert result.success is True
-
-    def test_build_result_success_false(self) -> None:
-        """BuildResult with success=False reflects a failed build."""
-        result = BuildResult(success=False, message="Exhausted iteration limit.")
-        assert result.success is False
-        assert result.message == "Exhausted iteration limit."
-
-
 # ---------------------------------------------------------------------------
 # BuildEngine abstract class enforcement tests
 # ---------------------------------------------------------------------------
@@ -74,17 +31,55 @@ def test_base_engine_cannot_be_instantiated(self) -> None:
         with pytest.raises(TypeError):
             BuildEngine()  # type: ignore[abstract]
 
-    def test_subclass_must_implement_all_abstract(self) -> None:
-        """A subclass that omits run_build_cycle cannot be instantiated."""
+    def test_subclass_must_implement_execute_chunk(self) -> None:
+        """A subclass that omits execute_chunk cannot be instantiated."""
 
         class PartialEngine(BuildEngine):
-            """Implements name but not run_build_cycle."""
+            """Implements name, verify_chunk, fix_chunk but not execute_chunk."""
+
+            @property
+            def name(self) -> str:
+                return "Partial"
+
+            def verify_chunk(self, chunk, repo_path):
+                return ChunkResult(success=True)
+
+            def fix_chunk(self, chunk, repo_path, failures):
+                return ChunkResult(success=True)
+
+        with pytest.raises(TypeError):
+            PartialEngine()  # type: ignore[abstract]
+
+    def test_subclass_must_implement_verify_chunk(self) -> None:
+        """A subclass that omits verify_chunk cannot be instantiated."""
+
+        class PartialEngine(BuildEngine):
+            @property
+            def name(self) -> str:
+                return "Partial"
+
+            def execute_chunk(self, chunk, repo_path, context):
+                return ChunkResult(success=True)
+
+            def fix_chunk(self, chunk, repo_path, failures):
+                return ChunkResult(success=True)
+
+        with pytest.raises(TypeError):
+            PartialEngine()  # type: ignore[abstract]
+
+    def test_subclass_must_implement_fix_chunk(self) -> None:
+        """A subclass that omits fix_chunk cannot be instantiated."""
 
+        class PartialEngine(BuildEngine):
             @property
             def name(self) -> str:
                 return "Partial"
 
-            # Intentionally omits run_build_cycle
+            def execute_chunk(self, chunk, repo_path, context):
+                return ChunkResult(success=True)
+
+            def verify_chunk(self, chunk, repo_path):
+                return ChunkResult(success=True)
 
         with pytest.raises(TypeError):
             PartialEngine()  # type: ignore[abstract]
@@ -93,12 +88,16 @@ def test_subclass_missing_name_property_cannot_be_instantiated(self) -> None:
         """A subclass that omits the name property cannot be instantiated."""
 
         class NoNameEngine(BuildEngine):
-            """Implements run_build_cycle but not name."""
+            """Implements chunk methods but not name."""
 
-            def run_build_cycle(self, repo_path, git_manager, cache_manager, spec_filter=None, **kwargs):
-                return BuildResult(success=True)
+            def execute_chunk(self, chunk, repo_path, context):
+                return ChunkResult(success=True)
 
-            # Intentionally omits name property
+            def verify_chunk(self, chunk, repo_path):
+                return ChunkResult(success=True)
+
+            def fix_chunk(self, chunk, repo_path, failures):
+                return ChunkResult(success=True)
 
         with pytest.raises(TypeError):
             NoNameEngine()  # type: ignore[abstract]
@@ -122,28 +121,12 @@ def verify_chunk(self, chunk, repo_path):
             def fix_chunk(self, chunk, repo_path, failures):
                 return ChunkResult(success=True, message="fixed")
 
-            def run_build_cycle(
-                self,
-                repo_path: pathlib.Path,
-                git_manager: object,
-                cache_manager: object,
-                spec_filter: str | None = None,
-                **kwargs,
-            ) -> BuildResult:
-                return BuildResult(success=True, message="Done", elapsed_s=1.0)
-
         engine = ConcreteEngine()
         assert engine.name == "Concrete Engine"
 
-        result = engine.run_build_cycle(
-            repo_path=pathlib.Path("/tmp"),
-            git_manager=object(),
-            cache_manager=object(),
-        )
-        assert isinstance(result, BuildResult)
+        result = engine.execute_chunk(None, pathlib.Path("/tmp"), EngineContext())
+        assert isinstance(result, ChunkResult)
         assert result.success is True
-        assert result.message == "Done"
-        assert result.elapsed_s == 1.0
 
     def test_subclass_name_property_is_accessible(self) -> None:
         """The name property on a concrete subclass returns the expected string."""
@@ -162,16 +145,12 @@ def verify_chunk(self, chunk, repo_path):
             def fix_chunk(self, chunk, repo_path, failures):
                 return ChunkResult(success=True)
 
-            def run_build_cycle(self, repo_path, git_manager, cache_manager, spec_filter=None, **kwargs):
-                return BuildResult(success=False)
-
         engine = NamedEngine()
         assert engine.name == "My Engine"
 
 
 # ---------------------------------------------------------------------------
 # Engine contract tests — verify both concrete engines implement the ABC
-# (merged from test_engine_contract.py, spec-18 Phase 11)
 # ---------------------------------------------------------------------------
 
 
@@ -200,26 +179,48 @@ def test_hf_engine_has_name(self) -> None:
         assert isinstance(engine.name, str)
         assert len(engine.name) > 0
 
-    def test_claude_engine_has_run_build_cycle(self) -> None:
-        """ClaudeCodeEngine must expose a callable run_build_cycle method."""
+    def test_claude_engine_has_execute_chunk(self) -> None:
+        """ClaudeCodeEngine must expose a callable execute_chunk method."""
         engine = ClaudeCodeEngine()
-        assert hasattr(engine, "run_build_cycle")
-        assert callable(engine.run_build_cycle)
+        assert hasattr(engine, "execute_chunk")
+        assert callable(engine.execute_chunk)
 
-    def test_hf_engine_has_run_build_cycle(self) -> None:
-        """HuggingFaceEngine must expose a callable run_build_cycle method."""
+    def test_hf_engine_has_execute_chunk(self) -> None:
+        """HuggingFaceEngine must expose a callable execute_chunk method."""
         engine = HuggingFaceEngine()
-        assert hasattr(engine, "run_build_cycle")
-        assert callable(engine.run_build_cycle)
+        assert hasattr(engine, "execute_chunk")
+        assert callable(engine.execute_chunk)
 
+    def test_claude_engine_has_verify_chunk(self) -> None:
+        """ClaudeCodeEngine must expose a callable verify_chunk method."""
+        engine = ClaudeCodeEngine()
+        assert hasattr(engine, "verify_chunk")
+        assert callable(engine.verify_chunk)
+
+    def test_hf_engine_has_verify_chunk(self) -> None:
+        """HuggingFaceEngine must expose a callable verify_chunk method."""
+        engine = HuggingFaceEngine()
+        assert hasattr(engine, "verify_chunk")
+        assert callable(engine.verify_chunk)
+
+    def test_claude_engine_has_fix_chunk(self) -> None:
+        """ClaudeCodeEngine must expose a callable fix_chunk method."""
+        engine = ClaudeCodeEngine()
+        assert hasattr(engine, "fix_chunk")
+        assert callable(engine.fix_chunk)
 
-class TestBuildResultMessageType:
-    """Targeted type-check for BuildResult.message (not covered by TestBuildResult)."""
+    def test_hf_engine_has_fix_chunk(self) -> None:
+        """HuggingFaceEngine must expose a callable fix_chunk method."""
+        engine = HuggingFaceEngine()
+        assert hasattr(engine, "fix_chunk")
+        assert callable(engine.fix_chunk)
 
-    def test_build_result_message_is_str(self) -> None:
-        """BuildResult.message must be an instance of str."""
-        result = BuildResult(success=False, message="failed")
-        assert isinstance(result.message, str)
+    def test_engines_do_not_have_run_build_cycle(self) -> None:
+        """Neither engine should expose run_build_cycle (removed in spec v29 Step 8)."""
+        for cls in (ClaudeCodeEngine, HuggingFaceEngine):
+            assert not hasattr(cls, "run_build_cycle"), (
+                f"{cls.__name__} still has run_build_cycle — legacy method was not removed"
+            )
 
 
 # ---------------------------------------------------------------------------
@@ -298,9 +299,6 @@ def verify_chunk(self, chunk, repo_path):
             def fix_chunk(self, chunk, repo_path, failures):
                 return ChunkResult(success=True)
 
-            def run_build_cycle(self, repo_path, git_manager, cache_manager, spec_filter=None, **kwargs):
-                return BuildResult(success=True)
-
         with pytest.raises(TypeError):
             NoChunkEngine()  # type: ignore[abstract]
 
diff --git a/tests/test_engine_claude_chunk.py b/tests/test_engine_claude_chunk.py
index d31c47b4..03a83e67 100644
--- a/tests/test_engine_claude_chunk.py
+++ b/tests/test_engine_claude_chunk.py
@@ -251,47 +251,3 @@ def test_fix_chunk_git_diff_exception(self, tmp_path: pathlib.Path) -> None:
         assert result.success is True
         assert result.files_modified == []
         assert result.retries_used == 1
-
-
-class TestClaudeRunBuildCycle:
-    """Coverage for run_build_cycle legacy interface (lines 273-293)."""
-
-    def test_run_build_cycle_no_specs(self, tmp_path: pathlib.Path) -> None:
-        """Lines 280-282: discover_incomplete_specs returns [] → immediate success."""
-        engine = ClaudeCodeEngine()
-        git_manager = mock.MagicMock()
-        cache_manager = mock.MagicMock()
-
-        with mock.patch("codelicious.spec_discovery.discover_incomplete_specs", return_value=[]):
-            result = engine.run_build_cycle(tmp_path, git_manager, cache_manager)
-
-        assert result.success is True
-        assert "no incomplete" in result.message.lower()
-
-    def test_run_build_cycle_with_specs(self, tmp_path: pathlib.Path) -> None:
-        """Lines 284-293: discover_incomplete_specs returns specs, delegates to V2Orchestrator.run."""
-        engine = ClaudeCodeEngine()
-        git_manager = mock.MagicMock()
-        cache_manager = mock.MagicMock()
-
-        fake_spec = tmp_path / "docs" / "specs" / "01_feature.md"
-        fake_spec.parent.mkdir(parents=True)
-        fake_spec.write_text("# Spec\n- [ ] task\n")
-
-        from codelicious.orchestrator import OrchestratorResult
-
-        orch_result = OrchestratorResult(success=True, message="all done", elapsed_s=1.5)
-        mock_orch = mock.MagicMock()
-        mock_orch.run.return_value = orch_result
-
-        with mock.patch("codelicious.spec_discovery.discover_incomplete_specs", return_value=[fake_spec]):
-            with mock.patch("codelicious.orchestrator.V2Orchestrator", return_value=mock_orch) as mock_orch_cls:
-                result = engine.run_build_cycle(tmp_path, git_manager, cache_manager, push_pr=True)
-
-        assert result.success is True
-        assert result.message == "all done"
-        mock_orch_cls.assert_called_once()
-        mock_orch.run.assert_called_once()
-        # push_pr kwarg should have been forwarded
-        call_kwargs = mock_orch.run.call_args.kwargs
-        assert call_kwargs.get("push_pr") is True
diff --git a/tests/test_engine_huggingface_chunk.py b/tests/test_engine_huggingface_chunk.py
index 6e014307..5a871ecc 100644
--- a/tests/test_engine_huggingface_chunk.py
+++ b/tests/test_engine_huggingface_chunk.py
@@ -92,6 +92,82 @@ def test_iteration_limit_returns_incomplete(self, tmp_path: pathlib.Path) -> Non
         assert "incomplete" in result.message.lower()
 
 
+class TestHFReflectionGate:
+    """spec v29 Step 11: post-reflection verify must pass before HF returns success=True."""
+
+    def _setup_complete_signal(self, mock_llm: mock.MagicMock) -> None:
+        mock_llm.chat_completion.return_value = {
+            "choices": [{"message": {"role": "assistant", "content": "done. CHUNK_COMPLETE"}}]
+        }
+        mock_llm.parse_tool_calls.return_value = []
+        mock_llm.parse_content.return_value = "done. CHUNK_COMPLETE"
+        mock_llm.planner_model = "test"
+        mock_llm.coder_model = "test"
+        mock_llm.endpoint_url = "https://test"
+
+    def test_verify_fails_then_passes_after_one_fix(self, tmp_path: pathlib.Path) -> None:
+        """verify fails once, fix is invoked, second verify passes → success."""
+        engine = HuggingFaceEngine()
+        chunk = FakeChunk()
+        context = EngineContext(spec_content="# Spec", deadline=0.0)
+
+        mock_llm = mock.MagicMock()
+        self._setup_complete_signal(mock_llm)
+        mock_registry = mock.MagicMock()
+        mock_registry.generate_schema.return_value = []
+        diff_mock = mock.MagicMock(returncode=0, stdout="src/a.py\n")
+
+        verify_results = [
+            ChunkResult(success=False, message="lint failed"),
+            ChunkResult(success=True, message="ok"),
+        ]
+        fix_result = ChunkResult(success=True, message="patched")
+
+        with mock.patch("codelicious.llm_client.LLMClient", return_value=mock_llm):
+            with mock.patch("codelicious.tools.registry.ToolRegistry", return_value=mock_registry):
+                with mock.patch("codelicious.config.load_project_config", return_value={}):
+                    with mock.patch("subprocess.run", return_value=diff_mock):
+                        with mock.patch.object(engine, "verify_chunk", side_effect=verify_results) as m_verify:
+                            with mock.patch.object(engine, "fix_chunk", return_value=fix_result) as m_fix:
+                                result = engine.execute_chunk(chunk, tmp_path, context)
+
+        assert result.success is True
+        assert m_fix.call_count == 1
+        assert m_verify.call_count == 2
+
+    def test_verify_always_fails_returns_failure_after_two_attempts(self, tmp_path: pathlib.Path) -> None:
+        """When verify never passes, success=False after exactly 2 fix attempts."""
+        engine = HuggingFaceEngine()
+        chunk = FakeChunk()
+        context = EngineContext(spec_content="# Spec", deadline=0.0)
+
+        mock_llm = mock.MagicMock()
+        self._setup_complete_signal(mock_llm)
+        mock_registry = mock.MagicMock()
+        mock_registry.generate_schema.return_value = []
+        diff_mock = mock.MagicMock(returncode=0, stdout="")
+
+        with mock.patch("codelicious.llm_client.LLMClient", return_value=mock_llm):
+            with mock.patch("codelicious.tools.registry.ToolRegistry", return_value=mock_registry):
+                with mock.patch("codelicious.config.load_project_config", return_value={}):
+                    with mock.patch("subprocess.run", return_value=diff_mock):
+                        with mock.patch.object(
+                            engine,
+                            "verify_chunk",
+                            return_value=ChunkResult(success=False, message="still failing"),
+                        ) as m_verify:
+                            with mock.patch.object(
+                                engine, "fix_chunk", return_value=ChunkResult(success=True, message="x")
+                            ) as m_fix:
+                                result = engine.execute_chunk(chunk, tmp_path, context)
+
+        assert result.success is False
+        assert "fix attempts" in result.message
+        # Initial verify + 2 retries = 3 verify calls; 2 fix attempts.
+        assert m_verify.call_count == 3
+        assert m_fix.call_count == 2
+
+
 class TestHFVerifyChunk:
     """HuggingFaceEngine.verify_chunk runs the verifier."""
 
@@ -555,60 +631,3 @@ def test_verify_import_error_treated_as_pass(self, tmp_path: pathlib.Path) -> No
 
         assert result.success is True
         assert "skipped" in result.message.lower() or "not available" in result.message.lower()
-
-
-# ---------------------------------------------------------------------------
-# run_build_cycle
-# ---------------------------------------------------------------------------
-
-
-class TestRunBuildCycle:
-    """run_build_cycle delegates to V2Orchestrator or returns early when no specs."""
-
-    def test_run_build_cycle_no_specs(self, tmp_path: pathlib.Path) -> None:
-        """When discover_incomplete_specs returns [], run_build_cycle returns success=True."""
-        from codelicious.engines.base import BuildResult
-
-        engine = HuggingFaceEngine()
-
-        with mock.patch("codelicious.spec_discovery.discover_incomplete_specs", return_value=[]):
-            result = engine.run_build_cycle(
-                repo_path=tmp_path,
-                git_manager=mock.MagicMock(),
-                cache_manager=mock.MagicMock(),
-            )
-
-        assert isinstance(result, BuildResult)
-        assert result.success is True
-        assert "No incomplete specs" in result.message
-
-    def test_run_build_cycle_delegates_to_v2(self, tmp_path: pathlib.Path) -> None:
-        """When specs are found, run_build_cycle instantiates V2Orchestrator and calls run()."""
-        from codelicious.engines.base import BuildResult
-
-        engine = HuggingFaceEngine()
-
-        fake_spec = mock.MagicMock()
-        fake_orch_result = mock.MagicMock()
-        fake_orch_result.success = True
-        fake_orch_result.message = "all done"
-        fake_orch_result.elapsed_s = 1.5
-
-        mock_orch_instance = mock.MagicMock()
-        mock_orch_instance.run.return_value = fake_orch_result
-
-        with mock.patch("codelicious.spec_discovery.discover_incomplete_specs", return_value=[fake_spec]):
-            with mock.patch(
-                "codelicious.orchestrator.V2Orchestrator", return_value=mock_orch_instance
-            ) as mock_orch_cls:
-                result = engine.run_build_cycle(
-                    repo_path=tmp_path,
-                    git_manager=mock.MagicMock(),
-                    cache_manager=mock.MagicMock(),
-                )
-
-        assert isinstance(result, BuildResult)
-        assert result.success is True
-        assert result.message == "all done"
-        mock_orch_cls.assert_called_once()
-        mock_orch_instance.run.assert_called_once()
diff --git a/tests/test_git_orchestrator.py b/tests/test_git_orchestrator.py
index 2397b32f..3ae3806c 100644
--- a/tests/test_git_orchestrator.py
+++ b/tests/test_git_orchestrator.py
@@ -2807,6 +2807,53 @@ def test_creates_new_branch(self, tmp_path: Path) -> None:
         mock_cmd.assert_any_call(["git", "checkout", "-b", "codelicious/spec-27-part-2"])
 
 
+class TestBranchDisambiguation:
+    """spec v30 Step 10: ``_disambiguate_branch`` resolves local/remote collisions."""
+
+    def _manager(self, tmp_path: Path) -> GitManager:
+        (tmp_path / ".git").mkdir()
+        return GitManager(tmp_path)
+
+    def test_unique_name_unchanged(self, tmp_path: Path) -> None:
+        m = self._manager(tmp_path)
+        with (
+            mock.patch.object(m, "_branch_exists_locally", return_value=False),
+            mock.patch.object(m, "_branch_exists_remotely", return_value=False),
+        ):
+            assert m._disambiguate_branch("codelicious/spec-27-part-2") == "codelicious/spec-27-part-2"
+
+    def test_local_collision_appends_suffix(self, tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None:
+        m = self._manager(tmp_path)
+        with (
+            mock.patch.object(m, "_branch_exists_locally", side_effect=[True, False]),
+            mock.patch.object(m, "_branch_exists_remotely", return_value=False),
+            caplog.at_level("INFO", logger="codelicious"),
+        ):
+            disambiguated = m._disambiguate_branch("codelicious/spec-27-part-2", suffix_hint="abc123")
+        assert disambiguated == "codelicious/spec-27-part-2-abc123"
+        assert any("exists; using" in r.message for r in caplog.records)
+
+    def test_remote_only_collision_disambiguated(self, tmp_path: Path) -> None:
+        m = self._manager(tmp_path)
+        with (
+            mock.patch.object(m, "_branch_exists_locally", return_value=False),
+            mock.patch.object(m, "_branch_exists_remotely", side_effect=[True, False]),
+        ):
+            assert m._disambiguate_branch("c", suffix_hint="zz") == "c-zz"
+
+    def test_double_collision_falls_back_to_timestamp(self, tmp_path: Path) -> None:
+        m = self._manager(tmp_path)
+        with (
+            mock.patch.object(m, "_branch_exists_locally", side_effect=[True, True]),
+            mock.patch.object(m, "_branch_exists_remotely", return_value=False),
+        ):
+            disambiguated = m._disambiguate_branch("c", suffix_hint="x")
+        # Falls back to "c-<unix_ts>" — assert the prefix and that the suffix is digits.
+        assert disambiguated.startswith("c-")
+        assert disambiguated != "c-x"
+        assert disambiguated.split("-")[-1].isdigit()
+
+
 # ---------------------------------------------------------------------------
 # New coverage: push_to_origin() — all retries exhausted (transient)
 # ---------------------------------------------------------------------------
@@ -3227,6 +3274,54 @@ def test_chunk_summaries_capped_at_50(self, tmp_path: Path) -> None:
         assert "chunk-50" not in body
 
 
+class TestPrBodyChunkMetadata:
+    """spec v30 Step 9: ``chunk_metadata`` adds Chunk Context / Verifier Summary / Audit Log sections."""
+
+    def test_full_metadata_renders_all_sections(self, tmp_path: Path) -> None:
+        manager = GitManager(tmp_path)
+        meta = {
+            "spec_path": "docs/specs/27.md",
+            "chunk_id": "spec-27-chunk-04",
+            "chunk_title": "Wire orchestrator to engines",
+            "depends_on": ["spec-27-chunk-03"],
+            "dependents": ["spec-27-chunk-05"],
+            "verifier": {
+                "tests_passed": 1908,
+                "lint_warnings": 0,
+                "coverage_pct": 92.5,
+                "coverage_delta_pp": 0.4,
+            },
+            "audit_log_path": ".codelicious/logs/build-20260504.log",
+        }
+        body = manager._build_pr_body(spec_id="27", chunk_summaries=None, prev_pr_url="", chunk_metadata=meta)
+
+        assert "## Chunk Context" in body
+        assert "spec-27-chunk-04" in body
+        assert "spec-27-chunk-03" in body
+        assert "spec-27-chunk-05" in body
+        assert "## Verifier Summary" in body
+        assert "tests passed: 1908" in body
+        assert "92.5%" in body
+        assert "+0.4 pp" in body
+        assert "## Audit Log" in body
+        assert "build-20260504.log" in body
+
+    def test_missing_fields_render_as_n_a(self, tmp_path: Path) -> None:
+        manager = GitManager(tmp_path)
+        body = manager._build_pr_body(spec_id="27", chunk_summaries=None, prev_pr_url="", chunk_metadata={})
+        assert "## Chunk Context" in body
+        # Every renderable field falls back to "n/a" rather than raising.
+        assert body.count("n/a") >= 4
+
+    def test_no_metadata_omits_chunk_sections(self, tmp_path: Path) -> None:
+        """When chunk_metadata is None the new sections must NOT appear (back-compat)."""
+        manager = GitManager(tmp_path)
+        body = manager._build_pr_body(spec_id="27", chunk_summaries=None, prev_pr_url="", chunk_metadata=None)
+        assert "Chunk Context" not in body
+        assert "Verifier Summary" not in body
+        assert "Audit Log" not in body
+
+
 # ---------------------------------------------------------------------------
 # New coverage: _find_existing_pr_by_branch — timeout and JSON error
 # ---------------------------------------------------------------------------
diff --git a/tests/test_llm_client.py b/tests/test_llm_client.py
index 82d6bc5b..7b42e089 100644
--- a/tests/test_llm_client.py
+++ b/tests/test_llm_client.py
@@ -6,7 +6,7 @@
 import ssl
 import urllib.error
 from datetime import datetime
-from unittest.mock import call, patch
+from unittest.mock import patch
 
 import pytest
 
@@ -457,15 +457,18 @@ def test_os_error_retries_and_raises(self, client):
             assert "LLM Connection Error" in str(exc_info.value)
 
     def test_network_error_exponential_backoff_intervals(self, client):
-        """Sleep durations should follow exponential backoff: 1s, 2s, 4s."""
+        """Sleep durations follow jittered exponential backoff in [0.5x, 1.5x] of 2**i (spec v29 Step 3)."""
         with patch("urllib.request.urlopen") as mock_urlopen, patch("time.sleep") as mock_sleep:
             mock_urlopen.side_effect = urllib.error.URLError("timeout")
 
             with pytest.raises(RuntimeError):
                 client.chat_completion([{"role": "user", "content": "test"}])
 
-            expected_sleeps = [call(client._BACKOFF_BASE_S * (2**i)) for i in range(client._MAX_RETRIES)]
-            assert mock_sleep.call_args_list == expected_sleeps
+            assert mock_sleep.call_count == client._MAX_RETRIES
+            for i, call_obj in enumerate(mock_sleep.call_args_list):
+                base = client._BACKOFF_BASE_S * (2**i)
+                actual = call_obj.args[0]
+                assert 0.5 * base <= actual <= 1.5 * base, f"attempt {i}: {actual} not in [{0.5 * base}, {1.5 * base}]"
 
     def test_network_error_succeeds_on_retry(self, client):
         """A transient network error should succeed once the connection recovers."""
diff --git a/tests/test_orchestrator.py b/tests/test_orchestrator.py
deleted file mode 100644
index 5e57bf2f..00000000
--- a/tests/test_orchestrator.py
+++ /dev/null
@@ -1,1306 +0,0 @@
-"""Tests for the Orchestrator build loop, finding triage, and fix prompt."""
-
-from __future__ import annotations
-
-import json
-import logging
-import pathlib
-import subprocess
-import threading
-from unittest import mock
-
-import pytest
-
-from codelicious.git.git_orchestrator import GitManager
-from codelicious.orchestrator import (
-    REVIEWER_PROMPTS,
-    Finding,
-    Orchestrator,
-    OrchestratorResult,
-    ReviewRole,
-    _abort_merge,
-    _collect_review_findings,
-    _commit_worktree_changes,
-    _create_worktree,
-    _merge_worktree_branch,
-    _render_fix_prompt,
-    _triage_findings,
-)
-
-# ---------------------------------------------------------------------------
-# Finding triage
-# ---------------------------------------------------------------------------
-
-
-class TestTriageFindings:
-    """Tests for severity-based sorting and deduplication."""
-
-    def test_sorts_by_severity(self):
-        findings = [
-            Finding(role="qa", severity="P3", file="a.py", line=1, title="minor", description="", fix=""),
-            Finding(role="sec", severity="P1", file="b.py", line=2, title="critical", description="", fix=""),
-            Finding(role="perf", severity="P2", file="c.py", line=3, title="medium", description="", fix=""),
-        ]
-        result = _triage_findings(findings)
-        assert [f.severity for f in result] == ["P1", "P2", "P3"]
-
-    def test_deduplicates_by_file_line(self):
-        findings = [
-            Finding(role="qa", severity="P2", file="a.py", line=10, title="first", description="", fix=""),
-            Finding(role="sec", severity="P1", file="a.py", line=10, title="second", description="", fix=""),
-        ]
-        result = _triage_findings(findings)
-        # P1 sorts first, so the P1 finding wins the dedup
-        assert len(result) == 1
-        assert result[0].severity == "P1"
-
-    def test_empty_list(self):
-        assert _triage_findings([]) == []
-
-    def test_unknown_severity_sorts_after_p3(self):
-        """Findings with non-standard severity values (P0, UNKNOWN) sort after P3 and are preserved."""
-        findings = [
-            Finding(role="qa", severity="P3", file="a.py", line=1, title="minor", description="", fix=""),
-            Finding(role="sec", severity="UNKNOWN", file="b.py", line=2, title="mystery", description="", fix=""),
-            Finding(role="perf", severity="P0", file="c.py", line=3, title="critical-plus", description="", fix=""),
-        ]
-        result = _triage_findings(findings)
-        # All three findings must be preserved (distinct file+line keys)
-        assert len(result) == 3
-        # P3 sorts before both unknowns (which fall into the default bucket, order=9)
-        assert result[0].severity == "P3"
-        # Both non-standard severities appear after P3
-        unknown_severities = {f.severity for f in result[1:]}
-        assert unknown_severities == {"UNKNOWN", "P0"}
-
-
-# ---------------------------------------------------------------------------
-# Review findings collection
-# ---------------------------------------------------------------------------
-
-
-class TestCollectReviewFindings:
-    """Tests for parsing JSON review files."""
-
-    def test_reads_valid_json(self, tmp_path: pathlib.Path):
-        review_file = tmp_path / ".codelicious" / "review_security.json"
-        review_file.parent.mkdir(parents=True)
-        review_file.write_text(
-            json.dumps(
-                [
-                    {
-                        "severity": "P1",
-                        "file": "x.py",
-                        "line": 5,
-                        "title": "issue",
-                        "description": "desc",
-                        "fix": "fix",
-                    },
-                ]
-            )
-        )
-        findings = _collect_review_findings(tmp_path, "security")
-        assert len(findings) == 1
-        assert findings[0].severity == "P1"
-        assert findings[0].role == "security"
-
-    def test_missing_file_returns_empty(self, tmp_path: pathlib.Path):
-        assert _collect_review_findings(tmp_path, "nonexistent") == []
-
-    def test_malformed_json_returns_empty(self, tmp_path: pathlib.Path):
-        review_file = tmp_path / ".codelicious" / "review_qa.json"
-        review_file.parent.mkdir(parents=True)
-        review_file.write_text("not json")
-        assert _collect_review_findings(tmp_path, "qa") == []
-
-    def test_non_array_json_returns_empty(self, tmp_path: pathlib.Path):
-        review_file = tmp_path / ".codelicious" / "review_qa.json"
-        review_file.parent.mkdir(parents=True)
-        review_file.write_text(json.dumps({"not": "an array"}))
-        assert _collect_review_findings(tmp_path, "qa") == []
-
-    def test_permission_error_returns_empty(self, tmp_path: pathlib.Path):
-        """When read_text raises PermissionError (OSError subclass), return []."""
-        review_file = tmp_path / ".codelicious" / "review_security.json"
-        review_file.parent.mkdir(parents=True)
-        review_file.write_text("[]")
-
-        with mock.patch.object(pathlib.Path, "read_text", side_effect=PermissionError("access denied")):
-            result = _collect_review_findings(tmp_path, "security")
-
-        assert result == []
-
-
-# ---------------------------------------------------------------------------
-# Fix prompt rendering
-# ---------------------------------------------------------------------------
-
-
-class TestRenderFixPrompt:
-    """Tests for the fix prompt template."""
-
-    def test_includes_no_git_warning(self):
-        prompt = _render_fix_prompt("myproject", [])
-        assert "Do NOT run git" in prompt
-        assert "git add" in prompt and "git commit" in prompt
-
-    def test_includes_findings(self):
-        findings = [
-            Finding(role="sec", severity="P1", file="a.py", line=10, title="XSS", description="bad", fix="escape"),
-        ]
-        prompt = _render_fix_prompt("myproject", findings)
-        assert "XSS" in prompt
-        assert "a.py:10" in prompt
-        assert "P1" in prompt
-
-    def test_no_findings(self):
-        prompt = _render_fix_prompt("myproject", [])
-        assert "No findings to fix" in prompt
-
-
-# ---------------------------------------------------------------------------
-# Orchestrator build loop
-# ---------------------------------------------------------------------------
-
-
-class TestOrchestratorRun:
-    """Tests for the orchestrator's build→merge→review→fix loop."""
-
-    @pytest.fixture
-    def mock_git_manager(self):
-        mgr = mock.MagicMock(spec=GitManager)
-        mgr.commit_verified_changes.return_value = None
-        mgr.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
-        mgr.ensure_draft_pr_exists.return_value = None
-        return mgr
-
-    @pytest.fixture
-    def mock_config(self):
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return C()
-
-    def test_all_specs_already_complete(self, tmp_path: pathlib.Path, mock_git_manager, mock_config):
-        """When all specs are already complete, the loop exits immediately with no builds."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [x] done\n- [x] also done\n")
-
-        orch = Orchestrator(tmp_path, mock_git_manager, mock_config)
-
-        # Mock _phase_build, _phase_review and _phase_fix to avoid running actual agents
-        with mock.patch.object(orch, "_phase_build", return_value=[]) as mock_build:
-            with mock.patch.object(orch, "_phase_review", return_value=[]):
-                with mock.patch.object(orch, "_phase_fix", return_value=True):
-                    result = orch.run(specs=[spec], reviewers=[], max_build_cycles=5)
-
-        assert result.success is True
-        assert result.cycles_completed == 0
-        mock_build.assert_not_called()
-
-    def test_consecutive_failures_abort(self, tmp_path: pathlib.Path, mock_git_manager, mock_config):
-        """3 consecutive build failures cause the loop to abort."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [ ] never built\n")
-
-        orch = Orchestrator(tmp_path, mock_git_manager, mock_config)
-
-        # Mock _phase_build to always fail
-        with mock.patch.object(orch, "_phase_build", return_value=[("branch", False)]):
-            with mock.patch.object(orch, "_phase_review", return_value=[]):
-                with mock.patch.object(orch, "_phase_fix", return_value=True):
-                    with mock.patch(
-                        "codelicious.prompts.scan_remaining_tasks_for_spec",
-                        return_value=1,
-                    ):
-                        result = orch.run(specs=[spec], reviewers=[], max_build_cycles=10)
-
-        assert result.success is False
-        assert result.cycles_completed == 3  # aborted after 3
-
-    def test_empty_specs_list(self, tmp_path: pathlib.Path, mock_git_manager, mock_config):
-        """Empty specs list should complete immediately."""
-        orch = Orchestrator(tmp_path, mock_git_manager, mock_config)
-
-        with mock.patch.object(orch, "_phase_review", return_value=[]):
-            with mock.patch.object(orch, "_phase_fix", return_value=True):
-                result = orch.run(specs=[], reviewers=[], max_build_cycles=5)
-
-        assert result.success is True
-        assert result.cycles_completed == 0
-
-    def test_build_without_build_complete_reports_failure(self, tmp_path: pathlib.Path, mock_git_manager, mock_config):
-        """Agent exits cleanly but doesn't write BUILD_COMPLETE → build fails."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [ ] not built\n")
-
-        orch = Orchestrator(tmp_path, mock_git_manager, mock_config)
-
-        # Mock _build_spec_in_worktree to simulate agent that exits ok but
-        # never writes BUILD_COMPLETE (the old bug behavior).
-        # Instead, test the actual logic by mocking _run_agent and worktree ops.
-        mock_result = mock.MagicMock(success=True)  # process exited ok
-
-        with mock.patch.object(orch, "_run_agent", return_value=mock_result):
-            with mock.patch("codelicious.orchestrator._create_worktree", return_value=tmp_path / "wt"):
-                with mock.patch("codelicious.orchestrator._remove_worktree"):
-                    with mock.patch("codelicious.orchestrator._commit_worktree_changes", return_value=True):
-                        # Create worktree dir but NO BUILD_COMPLETE file
-                        wt = tmp_path / "wt"
-                        wt.mkdir()
-                        (wt / ".codelicious").mkdir()
-                        # Copy spec into worktree
-                        (wt / "spec.md").write_text("- [ ] not built\n")
-
-                        _, success = orch._build_spec_in_worktree(spec)
-
-        # Agent exited ok, but no BUILD_COMPLETE → should be False
-        assert success is False
-
-    def test_spec_becomes_complete_after_build(self, tmp_path: pathlib.Path, mock_git_manager, mock_config):
-        """Build loop exits when the spec becomes complete after a build cycle."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [ ] build me\n")
-
-        orch = Orchestrator(tmp_path, mock_git_manager, mock_config)
-
-        def fake_build(specs, workers):
-            # Simulate the agent checking off the task
-            spec.write_text("- [x] build me\n")
-            return [("codelicious/spec", True)]
-
-        with mock.patch.object(orch, "_phase_build", side_effect=fake_build):
-            with mock.patch.object(orch, "_phase_merge", return_value=1):
-                with mock.patch.object(orch, "_phase_review", return_value=[]):
-                    with mock.patch.object(orch, "_phase_fix", return_value=True):
-                        result = orch.run(specs=[spec], reviewers=[], max_build_cycles=10)
-
-        assert result.success is True
-        # Build ran once, then the loop detected completion on the next iteration
-        assert result.cycles_completed == 1
-        # Verify push was called (mid-cycle + final = at least 2 calls)
-        assert mock_git_manager.push_to_origin.call_count >= 2
-
-
-class TestPhaseBuildConcurrentCounter:
-    """Tests that _phase_build's completed_count is updated correctly under concurrency."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path):
-        git_manager = mock.MagicMock(spec=GitManager)
-        git_manager.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_all_successes_counted(self, tmp_path: pathlib.Path, orch: Orchestrator):
-        """All successful futures must be counted exactly once each."""
-        specs = [tmp_path / f"spec_{i}.md" for i in range(8)]
-        for s in specs:
-            s.write_text("")
-
-        # Mock _build_spec_in_worktree to return (branch, True) after a brief pause
-        # so futures resolve with genuine overlap.
-        barrier = threading.Barrier(len(specs))
-
-        def fake_build(spec: pathlib.Path):
-            barrier.wait(timeout=5)
-            return (f"codelicious/build-{spec.stem}", True)
-
-        with mock.patch.object(orch, "_build_spec_in_worktree", side_effect=fake_build):
-            results = orch._phase_build(specs, max_workers=len(specs))
-
-        assert len(results) == len(specs)
-        assert all(ok for _, ok in results)
-
-    def test_exception_futures_counted(self, tmp_path: pathlib.Path, orch: Orchestrator):
-        """Futures that raise must be counted, not silently dropped."""
-        specs = [tmp_path / f"spec_{i}.md" for i in range(4)]
-        for s in specs:
-            s.write_text("")
-
-        barrier = threading.Barrier(len(specs))
-
-        def fake_build_raises(spec: pathlib.Path):
-            barrier.wait(timeout=5)
-            raise RuntimeError("worker exploded")
-
-        with mock.patch.object(orch, "_build_spec_in_worktree", side_effect=fake_build_raises):
-            results = orch._phase_build(specs, max_workers=len(specs))
-
-        # All specs should still produce a result entry (failed)
-        assert len(results) == len(specs)
-        assert all(not ok for _, ok in results)
-
-    def test_mixed_success_and_failure_counted(self, tmp_path: pathlib.Path, orch: Orchestrator):
-        """A mix of successes and exceptions must all be counted exactly once."""
-        specs = [tmp_path / f"spec_{i}.md" for i in range(6)]
-        for s in specs:
-            s.write_text("")
-
-        barrier = threading.Barrier(len(specs))
-
-        def fake_build_mixed(spec: pathlib.Path):
-            barrier.wait(timeout=5)
-            idx = int(spec.stem.rsplit("_", 1)[-1])
-            if idx % 2 == 0:
-                raise RuntimeError("even spec fails")
-            return (f"codelicious/build-{spec.stem}", True)
-
-        with mock.patch.object(orch, "_build_spec_in_worktree", side_effect=fake_build_mixed):
-            results = orch._phase_build(specs, max_workers=len(specs))
-
-        assert len(results) == len(specs)
-        successes = [ok for _, ok in results if ok]
-        failures = [ok for _, ok in results if not ok]
-        assert len(successes) == 3
-        assert len(failures) == 3
-
-
-# ---------------------------------------------------------------------------
-# Finding 7 — _commit_worktree_changes
-# ---------------------------------------------------------------------------
-
-
-class TestCommitWorktreeChanges:
-    """Tests for _commit_worktree_changes error paths."""
-
-    def test_staging_timeout_returns_false(self, tmp_path: pathlib.Path):
-        """A timeout while staging all files returns False."""
-        with mock.patch(
-            "codelicious.orchestrator.subprocess.run",
-            side_effect=subprocess.TimeoutExpired(cmd="git add", timeout=120),
-        ):
-            result = _commit_worktree_changes(tmp_path, "spec.md")
-        assert result is False
-
-    def test_diff_check_timeout_returns_false(self, tmp_path: pathlib.Path):
-        """A timeout on the diff --cached check returns False."""
-        add_ok = mock.MagicMock(returncode=0)
-
-        def _fake_run(cmd, **kwargs):
-            if "add" in cmd:
-                return add_ok
-            # diff --cached
-            raise subprocess.TimeoutExpired(cmd=cmd, timeout=120)
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=_fake_run):
-            result = _commit_worktree_changes(tmp_path, "spec.md")
-        assert result is False
-
-    def test_clean_worktree_returns_false(self, tmp_path: pathlib.Path):
-        """When diff --cached exits 0 (nothing staged), returns False without committing."""
-        add_ok = mock.MagicMock(returncode=0)
-        diff_clean = mock.MagicMock(returncode=0)  # 0 = no staged changes
-
-        def _fake_run(cmd, **kwargs):
-            if "add" in cmd:
-                return add_ok
-            return diff_clean
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=_fake_run):
-            result = _commit_worktree_changes(tmp_path, "spec.md")
-        assert result is False
-
-    def test_gpg_failure_falls_back_to_no_gpg_sign(self, tmp_path: pathlib.Path):
-        """A GPG-related commit failure triggers a --no-gpg-sign retry."""
-        add_ok = mock.MagicMock(returncode=0)
-        diff_dirty = mock.MagicMock(returncode=1)  # 1 = staged changes exist
-        gpg_fail = mock.MagicMock(returncode=1, stderr="error: gpg failed to sign the data")
-        unsigned_ok = mock.MagicMock(returncode=0)
-
-        calls = iter([add_ok, diff_dirty, gpg_fail, unsigned_ok])
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=lambda *a, **kw: next(calls)):
-            result = _commit_worktree_changes(tmp_path, "spec.md")
-        assert result is True
-
-    def test_unsigned_commit_timeout_returns_false(self, tmp_path: pathlib.Path):
-        """Timeout on the --no-gpg-sign fallback commit returns False."""
-        add_ok = mock.MagicMock(returncode=0)
-        diff_dirty = mock.MagicMock(returncode=1)
-        gpg_fail = mock.MagicMock(returncode=1, stderr="gpg signing failed: secret key not available")
-
-        def _fake_run(cmd, **kwargs):
-            if "add" in cmd:
-                return add_ok
-            if "diff" in cmd:
-                return diff_dirty
-            if "--no-gpg-sign" in cmd:
-                raise subprocess.TimeoutExpired(cmd=cmd, timeout=120)
-            # First commit attempt (no --no-gpg-sign yet)
-            return gpg_fail
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=_fake_run):
-            result = _commit_worktree_changes(tmp_path, "spec.md")
-        assert result is False
-
-    def test_gpg_fallback_succeeds_returns_true(self, tmp_path: pathlib.Path):
-        """When the first commit fails with a GPG error and the --no-gpg-sign retry succeeds, returns True."""
-        add_ok = mock.MagicMock(returncode=0)
-        diff_dirty = mock.MagicMock(returncode=1)  # 1 = staged changes exist
-        gpg_fail = mock.MagicMock(returncode=1, stderr="gpg: signing failed: secret key not available")
-        unsigned_ok = mock.MagicMock(returncode=0)
-
-        calls = iter([add_ok, diff_dirty, gpg_fail, unsigned_ok])
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=lambda *a, **kw: next(calls)):
-            result = _commit_worktree_changes(tmp_path, "spec.md")
-        assert result is True
-
-    def test_gpg_fallback_also_fails_returns_false(self, tmp_path: pathlib.Path):
-        """When both the initial commit and the --no-gpg-sign fallback fail, returns False."""
-        add_ok = mock.MagicMock(returncode=0)
-        diff_dirty = mock.MagicMock(returncode=1)  # 1 = staged changes exist
-        gpg_fail = mock.MagicMock(returncode=1, stderr="gpg: signing failed: secret key not available")
-        unsigned_fail = mock.MagicMock(returncode=1, stderr="error: commit failed")
-
-        calls = iter([add_ok, diff_dirty, gpg_fail, unsigned_fail])
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=lambda *a, **kw: next(calls)):
-            result = _commit_worktree_changes(tmp_path, "spec.md")
-        assert result is False
-
-
-# ---------------------------------------------------------------------------
-# Finding 8 — data-loss guard: commit fails after successful build
-# ---------------------------------------------------------------------------
-
-
-class TestDataLossGuard:
-    """When commit fails after a successful build the worktree must be preserved."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path):
-        git_manager = mock.MagicMock(spec=GitManager)
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_commit_failure_after_success_returns_false(self, tmp_path: pathlib.Path, orch: Orchestrator):
-        """If _commit_worktree_changes returns False after a successful build, success is False."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [ ] a task\n")
-
-        worktree = tmp_path / "wt"
-        worktree.mkdir()
-        (worktree / ".codelicious").mkdir()
-        (worktree / "spec.md").write_text("- [x] a task\n")
-        # Write BUILD_COMPLETE so agent_done is True
-        (worktree / ".codelicious" / "BUILD_COMPLETE").write_text("DONE")
-
-        mock_result = mock.MagicMock(success=True)
-
-        remove_worktree = mock.MagicMock()
-
-        with mock.patch.object(orch, "_run_agent", return_value=mock_result):
-            with mock.patch("codelicious.orchestrator._create_worktree", return_value=worktree):
-                with mock.patch("codelicious.orchestrator._remove_worktree", remove_worktree):
-                    with mock.patch("codelicious.orchestrator._commit_worktree_changes", return_value=False):
-                        _, success = orch._build_spec_in_worktree(spec)
-
-        assert success is False
-        # Worktree must be preserved (not removed) to prevent data loss
-        remove_worktree.assert_not_called()
-
-
-# ---------------------------------------------------------------------------
-# Finding 9 — _create_worktree
-# ---------------------------------------------------------------------------
-
-
-class TestCreateWorktree:
-    """Tests for _create_worktree error and fallback paths."""
-
-    def test_branch_exists_uses_fallback(self, tmp_path: pathlib.Path):
-        """When the first worktree add fails (branch exists), the fallback without -b succeeds."""
-        fail = mock.MagicMock(returncode=1, stderr="already exists")
-        success = mock.MagicMock(returncode=0)
-
-        # Call order: optional stale-remove (skipped — dir doesn't exist yet),
-        # first add (-b), fallback add (no -b)
-        responses = iter([fail, success])
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=lambda *a, **kw: next(responses)):
-            result = _create_worktree(tmp_path, "codelicious/test-branch")
-        # Should return the expected worktree path without raising
-        assert result == tmp_path / ".codelicious" / "worktrees" / "codelicious/test-branch"
-
-    def test_first_add_timeout_raises_runtime_error(self, tmp_path: pathlib.Path):
-        """A timeout on the primary worktree add raises RuntimeError."""
-
-        def _fake_run(cmd, **kwargs):
-            if "add" in cmd and "-b" in cmd:
-                raise subprocess.TimeoutExpired(cmd=cmd, timeout=120)
-            return mock.MagicMock(returncode=0)
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=_fake_run):
-            with pytest.raises(RuntimeError, match="Timed out creating worktree"):
-                _create_worktree(tmp_path, "codelicious/test-branch")
-
-
-# ---------------------------------------------------------------------------
-# Finding 10 — _abort_merge
-# ---------------------------------------------------------------------------
-
-
-class TestAbortMerge:
-    """Tests for _abort_merge error and timeout paths."""
-
-    def test_non_zero_abort_logs_critical(self, tmp_path: pathlib.Path, caplog):
-        """When git merge --abort returns non-zero, a CRITICAL message is logged."""
-        fail = mock.MagicMock(returncode=1, stderr="nothing to abort")
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", return_value=fail):
-            with caplog.at_level("CRITICAL", logger="codelicious.orchestrator"):
-                _abort_merge(tmp_path)
-
-        assert any("abort failed" in r.message.lower() for r in caplog.records)
-
-    def test_timeout_logs_critical_dirty_state(self, tmp_path: pathlib.Path, caplog):
-        """A timeout on git merge --abort logs a CRITICAL warning about dirty state."""
-        with (
-            mock.patch(
-                "codelicious.orchestrator.subprocess.run",
-                side_effect=subprocess.TimeoutExpired(cmd="git merge", timeout=30),
-            ),
-            caplog.at_level("CRITICAL", logger="codelicious.orchestrator"),
-        ):
-            _abort_merge(tmp_path)
-
-        assert any("dirty state" in r.message.lower() for r in caplog.records)
-
-
-# ---------------------------------------------------------------------------
-# Finding 11 — _merge_worktree_branch
-# ---------------------------------------------------------------------------
-
-
-class TestMergeWorktreeBranch:
-    """Tests for _merge_worktree_branch success, conflict and timeout paths."""
-
-    def test_successful_merge_returns_true(self, tmp_path: pathlib.Path):
-        """A zero-returncode merge returns True."""
-        ok = mock.MagicMock(returncode=0)
-        with mock.patch("codelicious.orchestrator.subprocess.run", return_value=ok):
-            result = _merge_worktree_branch(tmp_path, "codelicious/feat")
-        assert result is True
-
-    def test_merge_conflict_calls_abort_and_returns_false(self, tmp_path: pathlib.Path):
-        """A non-zero merge result calls _abort_merge and returns False."""
-        conflict = mock.MagicMock(returncode=1, stderr="CONFLICT")
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", return_value=conflict):
-            with mock.patch("codelicious.orchestrator._abort_merge") as mock_abort:
-                result = _merge_worktree_branch(tmp_path, "codelicious/feat")
-
-        assert result is False
-        mock_abort.assert_called_once_with(tmp_path)
-
-    def test_timeout_calls_abort_and_returns_false(self, tmp_path: pathlib.Path):
-        """A timeout on git merge calls _abort_merge and returns False."""
-        with (
-            mock.patch(
-                "codelicious.orchestrator.subprocess.run",
-                side_effect=subprocess.TimeoutExpired(cmd="git merge", timeout=120),
-            ),
-            mock.patch("codelicious.orchestrator._abort_merge") as mock_abort,
-        ):
-            result = _merge_worktree_branch(tmp_path, "codelicious/feat")
-
-        assert result is False
-        mock_abort.assert_called_once_with(tmp_path)
-
-
-# ---------------------------------------------------------------------------
-# Finding 12 — Orchestrator.run() loop edge cases
-# ---------------------------------------------------------------------------
-
-
-class TestOrchestratorRunLoop:
-    """Tests for loop-abort logic and commit-failure tolerance in run()."""
-
-    @pytest.fixture
-    def mock_git_manager(self):
-        mgr = mock.MagicMock(spec=GitManager)
-        mgr.commit_verified_changes.return_value = None
-        mgr.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
-        mgr.ensure_draft_pr_exists.return_value = None
-        return mgr
-
-    @pytest.fixture
-    def mock_config(self):
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return C()
-
-    def test_zero_progress_for_three_cycles_aborts(self, tmp_path: pathlib.Path, mock_git_manager, mock_config):
-        """_phase_build returning all failures for 3 consecutive cycles aborts the loop."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [ ] never built\n")
-
-        orch = Orchestrator(tmp_path, mock_git_manager, mock_config)
-
-        with mock.patch.object(orch, "_phase_build", return_value=[("codelicious/spec", False)]):
-            with mock.patch.object(orch, "_phase_review", return_value=[]):
-                with mock.patch.object(orch, "_phase_fix", return_value=True):
-                    with mock.patch(
-                        "codelicious.prompts.scan_remaining_tasks_for_spec",
-                        return_value=1,
-                    ):
-                        result = orch.run(specs=[spec], reviewers=[], max_build_cycles=10)
-
-        assert result.success is False
-        assert result.cycles_completed == 3
-
-    def test_commit_raises_does_not_crash_run(self, tmp_path: pathlib.Path, mock_git_manager, mock_config):
-        """An exception from commit_verified_changes must not propagate out of run()."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [x] already done\n")
-
-        mock_git_manager.commit_verified_changes.side_effect = RuntimeError("disk full")
-
-        orch = Orchestrator(tmp_path, mock_git_manager, mock_config)
-
-        with mock.patch.object(orch, "_phase_review", return_value=[]):
-            with mock.patch.object(orch, "_phase_fix", return_value=True):
-                result = orch.run(specs=[spec], reviewers=[], max_build_cycles=5)
-
-        # run() must still return a valid OrchestratorResult regardless of commit errors
-        assert isinstance(result, OrchestratorResult)
-
-
-# ---------------------------------------------------------------------------
-# Finding 13 — spec-not-in-worktree fallback
-# ---------------------------------------------------------------------------
-
-
-class TestSpecNotInWorktreeFallback:
-    """Tests for the fallback prompt when a spec path is outside the worktree."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path):
-        git_manager = mock.MagicMock(spec=GitManager)
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_spec_outside_repo_logs_warning_and_uses_fallback(self, tmp_path: pathlib.Path, orch: Orchestrator, caplog):
-        """A spec path not under repo_path logs a warning and uses the filename as fallback."""
-        # Create a spec outside the repo (different tmp directory)
-        outside_dir = tmp_path / "outside"
-        outside_dir.mkdir()
-        spec = outside_dir / "myspec.md"
-        spec.write_text("- [ ] build task\n")
-
-        # Use a completely different repo_path so spec is definitely outside it
-        other_repo = tmp_path / "repo"
-        other_repo.mkdir()
-        orch.repo_path = other_repo
-
-        worktree = tmp_path / "wt"
-        worktree.mkdir()
-        (worktree / ".codelicious").mkdir()
-
-        captured_prompts: list[str] = []
-
-        def fake_run_agent(prompt, project_root, session_id=""):
-            captured_prompts.append(prompt)
-            return mock.MagicMock(success=False)
-
-        with mock.patch.object(orch, "_run_agent", side_effect=fake_run_agent):
-            with mock.patch("codelicious.orchestrator._create_worktree", return_value=worktree):
-                with mock.patch("codelicious.orchestrator._remove_worktree"):
-                    with mock.patch("codelicious.orchestrator._commit_worktree_changes", return_value=False):
-                        with caplog.at_level("WARNING", logger="codelicious.orchestrator"):
-                            orch._build_spec_in_worktree(spec)
-
-        warning_messages = [r.message for r in caplog.records if r.levelname == "WARNING"]
-        assert any("not under repo" in m for m in warning_messages)
-        # The agent should have been called with the filename-based fallback path
-        assert len(captured_prompts) == 1
-        assert "myspec.md" in captured_prompts[0]
-
-    def test_spec_missing_in_worktree_uses_fallback_prompt(self, tmp_path: pathlib.Path, orch: Orchestrator, caplog):
-        """When the resolved spec path doesn't exist in the worktree, the agent gets a fallback prompt."""
-        spec = tmp_path / "docs" / "spec_missing.md"
-        spec.parent.mkdir(parents=True, exist_ok=True)
-        spec.write_text("- [ ] build task\n")
-
-        worktree = tmp_path / "wt"
-        worktree.mkdir()
-        (worktree / ".codelicious").mkdir()
-        # Intentionally do NOT create worktree / "docs" / "spec_missing.md"
-
-        captured_prompts: list[str] = []
-
-        def fake_run_agent(prompt, project_root, session_id=""):
-            captured_prompts.append(prompt)
-            return mock.MagicMock(success=False)
-
-        with mock.patch.object(orch, "_run_agent", side_effect=fake_run_agent):
-            with mock.patch("codelicious.orchestrator._create_worktree", return_value=worktree):
-                with mock.patch("codelicious.orchestrator._remove_worktree"):
-                    with mock.patch("codelicious.orchestrator._commit_worktree_changes", return_value=False):
-                        with caplog.at_level("WARNING", logger="codelicious.orchestrator"):
-                            orch._build_spec_in_worktree(spec)
-
-        warning_messages = [r.message for r in caplog.records if r.levelname == "WARNING"]
-        assert any("not found in worktree" in m for m in warning_messages)
-        assert len(captured_prompts) == 1
-        assert "spec_missing.md" in captured_prompts[0]
-
-
-# ---------------------------------------------------------------------------
-# Finding 68 — _phase_build parallel error path
-# ---------------------------------------------------------------------------
-
-
-class TestPhaseBuildParallelErrorPath:
-    """Tests that _phase_build catches exceptions from one worker while
-    allowing the rest to succeed, and that the failed spec produces a
-    (branch, False) result."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path) -> Orchestrator:
-        git_manager = mock.MagicMock()
-        git_manager.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_one_worker_raises_caught_logged_and_false_returned(
-        self, tmp_path: pathlib.Path, orch: Orchestrator, caplog
-    ):
-        """When one future raises, the exception must be caught, an error
-        logged, and (branch, False) returned for that spec while the other
-        spec's success result is preserved."""
-        spec_ok = tmp_path / "spec_ok.md"
-        spec_fail = tmp_path / "spec_fail.md"
-        spec_ok.write_text("")
-        spec_fail.write_text("")
-
-        barrier = threading.Barrier(2)
-
-        def fake_build(spec: pathlib.Path) -> tuple[str, bool]:
-            barrier.wait(timeout=5)
-            if spec == spec_fail:
-                raise RuntimeError("worker exploded")
-            return (f"codelicious/build-{spec.stem}", True)
-
-        with mock.patch.object(orch, "_build_spec_in_worktree", side_effect=fake_build):
-            with caplog.at_level("ERROR", logger="codelicious.orchestrator"):
-                results = orch._phase_build([spec_ok, spec_fail], max_workers=2)
-
-        assert len(results) == 2
-        # The failing spec must produce (branch, False)
-        fail_results = [(b, ok) for b, ok in results if not ok]
-        assert len(fail_results) == 1
-        # The failing branch name is derived from spec.stem
-        assert "spec_fail" in fail_results[0][0]
-        # The success spec must still produce (branch, True)
-        ok_results = [(b, ok) for b, ok in results if ok]
-        assert len(ok_results) == 1
-        # An error must have been logged for the exception
-        error_msgs = [r.message for r in caplog.records if r.levelno >= logging.ERROR]
-        assert any("worker exploded" in m or "spec_fail" in m for m in error_msgs)
-
-
-# ---------------------------------------------------------------------------
-# Finding 69 — _phase_merge
-# ---------------------------------------------------------------------------
-
-
-class TestPhaseMerge:
-    """Tests for _phase_merge success, conflict, and all-failures paths."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path) -> Orchestrator:
-        # Finding 82: use spec=GitManager so attribute access is validated
-        git_manager = mock.MagicMock(spec=GitManager)
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_all_failures_returns_zero_merged(self, orch: Orchestrator):
-        """When every build result is False, _phase_merge returns 0."""
-        build_results = [("codelicious/spec-a", False), ("codelicious/spec-b", False)]
-        result = orch._phase_merge(build_results)
-        assert result == 0
-
-    def test_successful_merge_deletes_branch(self, tmp_path: pathlib.Path, orch: Orchestrator):
-        """A branch that merges successfully must have _delete_branch called on it."""
-        build_results = [("codelicious/spec-ok", True)]
-
-        with mock.patch("codelicious.orchestrator._merge_worktree_branch", return_value=True) as mock_merge:
-            with mock.patch("codelicious.orchestrator._delete_branch") as mock_del:
-                merged = orch._phase_merge(build_results)
-
-        assert merged == 1
-        mock_merge.assert_called_once_with(orch.repo_path, "codelicious/spec-ok")
-        mock_del.assert_called_once_with(orch.repo_path, "codelicious/spec-ok")
-
-    def test_merge_conflict_logs_warning_and_skips_delete(self, tmp_path: pathlib.Path, orch: Orchestrator, caplog):
-        """A merge conflict must log a warning and not call _delete_branch."""
-        build_results = [("codelicious/spec-conflict", True)]
-
-        with mock.patch("codelicious.orchestrator._merge_worktree_branch", return_value=False):
-            with mock.patch("codelicious.orchestrator._delete_branch") as mock_del:
-                with caplog.at_level("WARNING", logger="codelicious.orchestrator"):
-                    merged = orch._phase_merge(build_results)
-
-        assert merged == 0
-        mock_del.assert_not_called()
-        warning_msgs = [r.message for r in caplog.records if r.levelname == "WARNING"]
-        assert any("conflict" in m.lower() or "merge" in m.lower() for m in warning_msgs)
-
-
-# ---------------------------------------------------------------------------
-# Finding 70 — _phase_review parallel path
-# ---------------------------------------------------------------------------
-
-
-class TestPhaseReviewParallelPath:
-    """Tests for _phase_review error handling in the parallel path."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path) -> Orchestrator:
-        git_manager = mock.MagicMock()
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_one_reviewer_raises_caught_remaining_findings_collected(
-        self, tmp_path: pathlib.Path, orch: Orchestrator, caplog
-    ):
-        """When one reviewer future raises, the exception is caught, an error
-        logged, and the findings from the remaining reviewer are still returned."""
-        good_finding = Finding(
-            role="qa",
-            severity="P2",
-            file="src/foo.py",
-            line=10,
-            title="missing test",
-            description="untested path",
-            fix="add test",
-        )
-
-        def fake_reviewer(role: str) -> list[Finding]:
-            if role == "security":
-                raise RuntimeError("security agent crashed")
-            return [good_finding]
-
-        with mock.patch.object(orch, "_run_reviewer", side_effect=fake_reviewer):
-            with caplog.at_level("ERROR", logger="codelicious.orchestrator"):
-                results = orch._phase_review(["security", "qa"], max_workers=2)
-
-        # The QA finding must still be present
-        assert any(f.role == "qa" for f in results)
-        # An error must have been logged for the exception
-        error_msgs = [r.message for r in caplog.records if r.levelno >= logging.ERROR]
-        assert any("security" in m.lower() or "crashed" in m.lower() for m in error_msgs)
-
-
-# ---------------------------------------------------------------------------
-# Finding 6 — push_pr=True code path
-# ---------------------------------------------------------------------------
-
-
-class TestPushPrPath:
-    """Tests for the push_pr=True branch in Orchestrator.run()."""
-
-    @pytest.fixture
-    def mock_config(self):
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return C()
-
-    def test_push_pr_true_calls_ensure_draft_pr_exists(self, tmp_path: pathlib.Path, mock_config):
-        """When push_pr=True, ensure_draft_pr_exists() is called with spec_id and spec_summary."""
-        from codelicious.git.git_orchestrator import GitManager
-
-        git_manager = mock.MagicMock(spec=GitManager)
-        git_manager.commit_verified_changes.return_value = None
-        git_manager.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
-        git_manager.ensure_draft_pr_exists.return_value = None
-
-        spec = tmp_path / "16_test_spec.md"
-        spec.write_text("- [x] already done\n")
-
-        orch = Orchestrator(tmp_path, git_manager, mock_config)
-
-        with mock.patch.object(orch, "_phase_review", return_value=[]):
-            with mock.patch.object(orch, "_phase_fix", return_value=True):
-                orch.run(specs=[spec], reviewers=[], max_build_cycles=5, push_pr=True)
-
-        git_manager.ensure_draft_pr_exists.assert_called_once()
-        call_kwargs = git_manager.ensure_draft_pr_exists.call_args.kwargs
-        assert call_kwargs["spec_id"] == "16"
-        assert "spec_summary" in call_kwargs
-
-    def test_push_pr_true_exception_logs_warning_and_run_returns(self, tmp_path: pathlib.Path, mock_config, caplog):
-        """When ensure_draft_pr_exists() raises, a warning is logged and run() does not crash."""
-        from codelicious.git.git_orchestrator import GitManager
-
-        git_manager = mock.MagicMock(spec=GitManager)
-        git_manager.commit_verified_changes.return_value = None
-        git_manager.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
-        git_manager.ensure_draft_pr_exists.side_effect = RuntimeError("gh CLI not found")
-
-        spec = tmp_path / "22_test_spec.md"
-        spec.write_text("- [x] already done\n")
-
-        orch = Orchestrator(tmp_path, git_manager, mock_config)
-
-        with mock.patch.object(orch, "_phase_review", return_value=[]):
-            with mock.patch.object(orch, "_phase_fix", return_value=True):
-                with caplog.at_level("WARNING", logger="codelicious.orchestrator"):
-                    result = orch.run(specs=[spec], reviewers=[], max_build_cycles=5, push_pr=True)
-
-        # run() must return a valid result despite the PR creation failure
-        assert isinstance(result, OrchestratorResult)
-        warning_msgs = [r.message for r in caplog.records if r.levelname == "WARNING"]
-        assert any("PR creation" in m or "gh CLI not found" in m for m in warning_msgs)
-
-
-# ---------------------------------------------------------------------------
-# Finding 71 — _phase_fix
-# ---------------------------------------------------------------------------
-
-
-class TestPhaseFix:
-    """Tests for _phase_fix short-circuit and agent-failure paths."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path) -> Orchestrator:
-        # Finding 82: use spec=GitManager so attribute access is validated
-        git_manager = mock.MagicMock(spec=GitManager)
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_only_p3_findings_returns_true_without_calling_agent(self, orch: Orchestrator):
-        """When all findings are P3, _phase_fix must return True immediately
-        without invoking the fix agent."""
-        p3_findings = [
-            Finding(role="qa", severity="P3", file="a.py", line=1, title="minor", description="", fix=""),
-            Finding(role="qa", severity="P3", file="b.py", line=2, title="also minor", description="", fix=""),
-        ]
-
-        with mock.patch.object(orch, "_run_agent") as mock_agent:
-            result = orch._phase_fix(p3_findings)
-
-        assert result is True
-        mock_agent.assert_not_called()
-
-    def test_p1_finding_agent_raises_returns_false(self, tmp_path: pathlib.Path, orch: Orchestrator):
-        """When the fix agent raises an exception, _phase_fix must return False."""
-        p1_finding = Finding(
-            role="security",
-            severity="P1",
-            file="src/bar.py",
-            line=42,
-            title="critical issue",
-            description="dangerous code",
-            fix="remove it",
-        )
-
-        with mock.patch.object(orch, "_run_agent", side_effect=RuntimeError("agent timed out")):
-            with mock.patch("codelicious.prompts.check_build_complete", return_value=False):
-                with mock.patch("codelicious.prompts.clear_build_complete"):
-                    result = orch._phase_fix([p1_finding])
-
-        assert result is False
-
-
-# ---------------------------------------------------------------------------
-# Finding 63 — _create_worktree failure/fallback paths
-# ---------------------------------------------------------------------------
-
-
-class TestCreateWorktreeFailurePaths:
-    """Finding 63: _create_worktree fallback and double-failure paths."""
-
-    def test_fallback_non_zero_raises_runtime_error(self, tmp_path: pathlib.Path):
-        """When both primary add (-b) and fallback add (no -b) return non-zero,
-        RuntimeError is raised."""
-        primary_fail = mock.MagicMock(returncode=1, stderr="fatal: cannot create branch")
-        fallback_fail = mock.MagicMock(returncode=1, stderr="fatal: worktree already exists")
-
-        responses = iter([primary_fail, fallback_fail])
-
-        with (
-            mock.patch(
-                "codelicious.orchestrator.subprocess.run",
-                side_effect=lambda *a, **kw: next(responses),
-            ),
-            pytest.raises(RuntimeError, match="Failed to create worktree"),
-        ):
-            _create_worktree(tmp_path, "codelicious/my-branch")
-
-    def test_fallback_timeout_raises_runtime_error(self, tmp_path: pathlib.Path):
-        """When the fallback (no -b) worktree add times out, RuntimeError is raised."""
-        primary_fail = mock.MagicMock(returncode=1, stderr="already exists")
-
-        def _fake_run(cmd, **kwargs):
-            if "-b" in cmd:
-                return primary_fail
-            # fallback — no -b
-            raise subprocess.TimeoutExpired(cmd=cmd, timeout=120)
-
-        with mock.patch("codelicious.orchestrator.subprocess.run", side_effect=_fake_run):
-            with pytest.raises(RuntimeError, match="Timed out creating worktree"):
-                _create_worktree(tmp_path, "codelicious/my-branch")
-
-    def test_primary_fails_fallback_succeeds_returns_path(self, tmp_path: pathlib.Path):
-        """When the primary (-b) add fails and the fallback succeeds, the worktree path
-        is returned without raising."""
-        primary_fail = mock.MagicMock(returncode=1, stderr="already exists")
-        fallback_ok = mock.MagicMock(returncode=0)
-
-        responses = iter([primary_fail, fallback_ok])
-
-        with mock.patch(
-            "codelicious.orchestrator.subprocess.run",
-            side_effect=lambda *a, **kw: next(responses),
-        ):
-            result = _create_worktree(tmp_path, "codelicious/my-branch")
-
-        expected = tmp_path / ".codelicious" / "worktrees" / "codelicious/my-branch"
-        assert result == expected
-
-
-# ---------------------------------------------------------------------------
-# Finding 64 — _delete_branch
-# ---------------------------------------------------------------------------
-
-
-class TestDeleteBranch:
-    """Finding 64: _delete_branch happy path, timeout, and non-zero exit."""
-
-    def test_happy_path_succeeds_silently(self, tmp_path: pathlib.Path):
-        """When git branch -d returns zero, _delete_branch returns without raising."""
-        ok = mock.MagicMock(returncode=0)
-        with mock.patch("codelicious.orchestrator.subprocess.run", return_value=ok):
-            # Should not raise
-            from codelicious.orchestrator import _delete_branch
-
-            _delete_branch(tmp_path, "codelicious/my-branch")
-
-    def test_timeout_logs_warning_and_returns(self, tmp_path: pathlib.Path, caplog):
-        """A timeout on git branch -d logs a warning and does not raise."""
-        from codelicious.orchestrator import _delete_branch
-
-        with (
-            mock.patch(
-                "codelicious.orchestrator.subprocess.run",
-                side_effect=subprocess.TimeoutExpired(cmd=["git", "branch", "-d"], timeout=120),
-            ),
-            caplog.at_level("WARNING", logger="codelicious.orchestrator"),
-        ):
-            _delete_branch(tmp_path, "codelicious/timed-out-branch")
-
-        warning_msgs = [r.message for r in caplog.records if r.levelname == "WARNING"]
-        assert any("timed out" in m.lower() or "timeout" in m.lower() for m in warning_msgs)
-
-    def test_non_zero_exit_logs_warning_and_returns(self, tmp_path: pathlib.Path, caplog):
-        """A non-zero exit from git branch -d logs a warning and does not raise."""
-        from codelicious.orchestrator import _delete_branch
-
-        fail = mock.MagicMock(returncode=1, stderr="error: branch not fully merged")
-        with mock.patch("codelicious.orchestrator.subprocess.run", return_value=fail):
-            with caplog.at_level("WARNING", logger="codelicious.orchestrator"):
-                _delete_branch(tmp_path, "codelicious/un-merged-branch")
-
-        warning_msgs = [r.message for r in caplog.records if r.levelname == "WARNING"]
-        assert any("failed to delete" in m.lower() or "un-merged-branch" in m.lower() for m in warning_msgs)
-
-
-# ---------------------------------------------------------------------------
-# Finding 65 — _phase_build KeyboardInterrupt handling
-# ---------------------------------------------------------------------------
-
-
-class TestPhaseBuildKeyboardInterrupt:
-    """Finding 65: KeyboardInterrupt inside _phase_build shuts the pool down and re-raises."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path) -> Orchestrator:
-        git_manager = mock.MagicMock()
-        git_manager.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_keyboard_interrupt_re_raises_and_pool_is_shut_down(self, tmp_path: pathlib.Path, orch: Orchestrator):
-        """When concurrent.futures.as_completed raises KeyboardInterrupt,
-        the exception is re-raised after cancelling pending futures."""
-        spec_a = tmp_path / "spec_a.md"
-        spec_b = tmp_path / "spec_b.md"
-        spec_a.write_text("")
-        spec_b.write_text("")
-
-        with (
-            mock.patch(
-                "concurrent.futures.as_completed",
-                side_effect=KeyboardInterrupt,
-            ),
-            mock.patch.object(orch, "_build_spec_in_worktree", return_value=("branch", True)),
-        ):
-            with pytest.raises(KeyboardInterrupt):
-                orch._phase_build([spec_a, spec_b], max_workers=2)
-
-
-# ---------------------------------------------------------------------------
-# Finding 66 — commit_ok=False, success=True data-loss prevention path
-# ---------------------------------------------------------------------------
-
-
-class TestCommitFailureAfterSuccessPreservesWorktree:
-    """Finding 66: when _commit_worktree_changes returns False and the build
-    was successful, the worktree is preserved and success becomes False."""
-
-    @pytest.fixture
-    def orch(self, tmp_path: pathlib.Path) -> Orchestrator:
-        git_manager = mock.MagicMock()
-
-        class C:
-            model = ""
-            effort = ""
-            max_turns = 0
-            agent_timeout_s = 30
-            dry_run = True
-
-        return Orchestrator(tmp_path, git_manager, C())
-
-    def test_commit_false_success_true_sets_success_false_and_preserves_worktree(
-        self, tmp_path: pathlib.Path, orch: Orchestrator, caplog
-    ):
-        """If _commit_worktree_changes returns False after a successful build,
-        success is flipped to False and the worktree is NOT removed."""
-        spec = tmp_path / "spec.md"
-        spec.write_text("- [ ] some task\n")
-
-        worktree = tmp_path / "wt"
-        worktree.mkdir()
-        (worktree / ".codelicious").mkdir()
-        (worktree / ".codelicious" / "BUILD_COMPLETE").write_text("DONE")
-        (worktree / "spec.md").write_text("- [x] some task\n")
-
-        remove_worktree = mock.MagicMock()
-
-        with mock.patch.object(orch, "_run_agent", return_value=mock.MagicMock(success=True)):
-            with mock.patch("codelicious.orchestrator._create_worktree", return_value=worktree):
-                with mock.patch("codelicious.orchestrator._remove_worktree", remove_worktree):
-                    with mock.patch("codelicious.orchestrator._commit_worktree_changes", return_value=False):
-                        with caplog.at_level("ERROR", logger="codelicious.orchestrator"):
-                            _, success = orch._build_spec_in_worktree(spec)
-
-        assert success is False
-        remove_worktree.assert_not_called()
-        error_msgs = [r.message for r in caplog.records if r.levelno >= logging.ERROR]
-        assert any("data loss" in m.lower() or "commit" in m.lower() for m in error_msgs)
-
-
-# ---------------------------------------------------------------------------
-# spec-21 Phase 14: REVIEWER_PROMPTS and ReviewRole coverage
-# ---------------------------------------------------------------------------
-
-
-class TestReviewerPromptsStructure:
-    """Tests for REVIEWER_PROMPTS dict structure (spec-21 Phase 14)."""
-
-    def test_reviewer_prompts_is_dict_with_string_values(self) -> None:
-        """REVIEWER_PROMPTS must be a dict[str, str]."""
-        assert isinstance(REVIEWER_PROMPTS, dict)
-        assert len(REVIEWER_PROMPTS) > 0
-        for key, value in REVIEWER_PROMPTS.items():
-            assert isinstance(key, str), f"Key {key!r} is not a string"
-            assert isinstance(value, str), f"Value for {key!r} is not a string"
-
-    def test_reviewer_prompts_has_security_role(self) -> None:
-        """REVIEWER_PROMPTS must include a 'security' role."""
-        assert "security" in REVIEWER_PROMPTS
-
-    def test_reviewer_prompts_contain_template_vars(self) -> None:
-        """Each prompt must contain the {{project_name}} template variable."""
-        for key, prompt in REVIEWER_PROMPTS.items():
-            assert "{{project_name}}" in prompt, f"Role {key!r} missing {{{{project_name}}}}"
-
-
-class TestReviewRoleDataclass:
-    """Tests for ReviewRole dataclass (spec-21 Phase 14)."""
-
-    def test_review_role_fields(self) -> None:
-        """ReviewRole must have name and prompt fields."""
-        role = ReviewRole(name="security", prompt="Review for vulnerabilities.")
-        assert role.name == "security"
-        assert role.prompt == "Review for vulnerabilities."
-
-    def test_review_role_is_frozen(self) -> None:
-        """ReviewRole must be frozen (immutable)."""
-        role = ReviewRole(name="test", prompt="test prompt")
-        with pytest.raises(AttributeError):
-            role.name = "modified"
diff --git a/tests/test_prompts.py b/tests/test_prompts.py
index ac41c63b..71b88efd 100644
--- a/tests/test_prompts.py
+++ b/tests/test_prompts.py
@@ -8,7 +8,9 @@
 import pytest
 
 from codelicious.prompts import (
-    AGENT_BUILD_SPEC,
+    CHUNK_EXECUTE,
+    CHUNK_FIX,
+    CHUNK_VERIFY,
     check_build_complete,
     clear_build_complete,
     render,
@@ -124,17 +126,6 @@ def test_unused_kwargs_ignored(self):
         result = render("Hello {{name}}!", name="world", extra="ignored")
         assert result == "Hello world!"
 
-    def test_spec_filter_substitution(self):
-        """The critical fix: spec_filter is actually substituted into the prompt."""
-        result = render(
-            AGENT_BUILD_SPEC,
-            project_name="myproject",
-            spec_filter="/path/to/spec.md",
-        )
-        assert "/path/to/spec.md" in result
-        assert "{{spec_filter}}" not in result
-        assert "{{project_name}}" not in result
-
     def test_partial_kwargs_leaves_unreplaced_tokens_verbatim(self):
         """render() with only some kwargs replaces provided tokens and leaves others intact."""
         template = "Hello {{name}}, your task is {{task}}!"
@@ -236,16 +227,8 @@ def test_all_prompt_constants_are_strings(self) -> None:
                 if isinstance(val, str):
                     assert len(val) > 0, f"Prompt constant {name} is empty"
 
-    def test_agent_build_spec_contains_template_vars(self) -> None:
-        """AGENT_BUILD_SPEC must contain {{project_name}} and {{spec_filter}}."""
-        from codelicious.prompts import AGENT_BUILD_SPEC
-
-        assert "{{project_name}}" in AGENT_BUILD_SPEC
-        assert "{{spec_filter}}" in AGENT_BUILD_SPEC
-
     def test_chunk_execute_contains_template_vars(self) -> None:
         """CHUNK_EXECUTE must contain expected template variables."""
-        from codelicious.prompts import CHUNK_EXECUTE
 
         assert "{{repo_path}}" in CHUNK_EXECUTE
         assert "{{chunk_id}}" in CHUNK_EXECUTE
@@ -256,14 +239,12 @@ def test_chunk_execute_contains_template_vars(self) -> None:
 
     def test_chunk_verify_contains_template_vars(self) -> None:
         """CHUNK_VERIFY must contain expected template variables."""
-        from codelicious.prompts import CHUNK_VERIFY
 
         assert "{{repo_path}}" in CHUNK_VERIFY
         assert "{{chunk_id}}" in CHUNK_VERIFY
 
     def test_chunk_fix_contains_template_vars(self) -> None:
         """CHUNK_FIX must contain expected template variables."""
-        from codelicious.prompts import CHUNK_FIX
 
         assert "{{repo_path}}" in CHUNK_FIX
         assert "{{chunk_id}}" in CHUNK_FIX
@@ -271,7 +252,7 @@ def test_chunk_fix_contains_template_vars(self) -> None:
 
     def test_chunk_templates_renderable(self) -> None:
         """All chunk templates can be rendered with render()."""
-        from codelicious.prompts import CHUNK_EXECUTE, CHUNK_FIX, CHUNK_VERIFY, render
+        from codelicious.prompts import render
 
         rendered = render(
             CHUNK_EXECUTE,
diff --git a/tests/test_sandbox.py b/tests/test_sandbox.py
index 9397f43f..2578e8ca 100644
--- a/tests/test_sandbox.py
+++ b/tests/test_sandbox.py
@@ -879,3 +879,35 @@ def batch_writer(thread_id: int) -> None:
         failed = sum(1 for _, ok in results if not ok)
         assert succeeded == 10, f"Expected 10 successes, got {succeeded}"
         assert failed == 10, f"Expected 10 failures, got {failed}"
+
+
+# -- spec v29 Step 15: O_NOFOLLOW unavailable warning --------------------------
+
+
+class TestONoFollowWarning:
+    """spec v29 Step 15: Sandbox emits a single WARNING when O_NOFOLLOW is unavailable."""
+
+    def test_warning_emitted_when_o_nofollow_missing(
+        self, tmp_path: pathlib.Path, caplog: pytest.LogCaptureFixture
+    ) -> None:
+        """A platform without os.O_NOFOLLOW (e.g. Windows) gets a clear warning at init."""
+        with unittest.mock.patch("codelicious.sandbox.os") as mock_os:
+            # Re-expose only the attributes the constructor touches; deliberately
+            # omit O_NOFOLLOW so hasattr() returns False.
+            mock_os.path = os.path
+            mock_os.environ = os.environ
+            del mock_os.O_NOFOLLOW
+            with caplog.at_level(logging.WARNING, logger="codelicious.sandbox"):
+                Sandbox(tmp_path)
+
+        warnings = [r for r in caplog.records if "O_NOFOLLOW unavailable" in r.message]
+        assert len(warnings) == 1, f"expected exactly one O_NOFOLLOW warning, got {len(warnings)}"
+
+    def test_no_warning_on_posix(self, tmp_path: pathlib.Path, caplog: pytest.LogCaptureFixture) -> None:
+        """On POSIX (where os.O_NOFOLLOW exists) no warning fires."""
+        if not hasattr(os, "O_NOFOLLOW"):
+            pytest.skip("Test only meaningful on POSIX")
+        with caplog.at_level(logging.WARNING, logger="codelicious.sandbox"):
+            Sandbox(tmp_path)
+        warnings = [r for r in caplog.records if "O_NOFOLLOW unavailable" in r.message]
+        assert warnings == []
diff --git a/tests/test_scaffolder.py b/tests/test_scaffolder.py
index 376a7de8..4b7449e9 100644
--- a/tests/test_scaffolder.py
+++ b/tests/test_scaffolder.py
@@ -204,3 +204,37 @@ def test_scaffold_propagates_oserror_from_atomic_write(tmp_path: pathlib.Path) -
     with patch("codelicious.scaffolder.atomic_write_text", side_effect=OSError("disk full")):
         with pytest.raises(OSError, match="disk full"):
             scaffold(tmp_path)
+
+
+# -- spec v29 Step 18: forbidden-pattern tuple is single source of truth ---
+
+
+class TestForbiddenPatternsSingleSource:
+    def test_all_patterns_appear_in_security_rules(self) -> None:
+        from codelicious.scaffolder import _FORBIDDEN_PATTERNS_DOC, _RULES_SECURITY
+
+        for pattern in _FORBIDDEN_PATTERNS_DOC:
+            assert pattern in _RULES_SECURITY, f"missing {pattern!r} in _RULES_SECURITY"
+
+    def test_all_patterns_appear_in_verify_all_skill(self) -> None:
+        from codelicious.scaffolder import _FORBIDDEN_PATTERNS_DOC, _SKILL_VERIFY_ALL
+
+        for pattern in _FORBIDDEN_PATTERNS_DOC:
+            assert pattern in _SKILL_VERIFY_ALL
+
+    def test_all_patterns_appear_in_builder_agent(self) -> None:
+        from codelicious.scaffolder import _AGENT_BUILDER, _FORBIDDEN_PATTERNS_DOC
+
+        for pattern in _FORBIDDEN_PATTERNS_DOC:
+            assert pattern in _AGENT_BUILDER
+
+    def test_no_unsubstituted_sentinels_remain(self) -> None:
+        from codelicious.scaffolder import (
+            _AGENT_BUILDER,
+            _RULES_SECURITY,
+            _SKILL_VERIFY_ALL,
+        )
+
+        for blob in (_AGENT_BUILDER, _RULES_SECURITY, _SKILL_VERIFY_ALL):
+            assert "__FORBIDDEN_INLINE__" not in blob
+            assert "__FORBIDDEN_BULLETS__" not in blob
diff --git a/tests/test_v2_orchestrator.py b/tests/test_v2_orchestrator.py
index 09f37c47..7e2313f3 100644
--- a/tests/test_v2_orchestrator.py
+++ b/tests/test_v2_orchestrator.py
@@ -184,6 +184,19 @@ def test_deadline_stops_execution(self, tmp_path: pathlib.Path) -> None:
         # Chunks should not have been executed
         engine.execute_chunk.assert_not_called()
 
+    def test_future_deadline_runs_all_chunks(self, tmp_path: pathlib.Path) -> None:
+        """A deadline well in the future does not interrupt chunk execution (spec v29 Step 10)."""
+        import time
+
+        spec = self._make_spec(tmp_path, "# Feature\n\n## Phase 1\n\n- [ ] Task A\n- [ ] Task B\n")
+        engine = self._mock_engine(success=True)
+        git = self._mock_git()
+
+        orch = V2Orchestrator(tmp_path, git, engine)
+        orch.run(specs=[spec], deadline=time.monotonic() + 3600, push_pr=False)
+
+        assert engine.execute_chunk.call_count == 2
+
     def test_empty_spec_no_chunks(self, tmp_path: pathlib.Path) -> None:
         """A spec with no checkboxes and no body produces no chunks."""
         spec = self._make_spec(tmp_path, "# Empty Spec\n")
@@ -321,3 +334,194 @@ def test_commit_cap_takes_precedence_over_loc_cap(self, tmp_path: pathlib.Path)
         git.create_continuation_branch.assert_called_once()
         # LOC check should not have been polled on the chunk that split via commits
         assert git.get_pr_diff_loc.call_count == 0
+
+
+class TestIdempotentResume:
+    """spec v30 Step 2: persistent chunk-status ledger drives skip-on-resume."""
+
+    def _make_spec(self, tmp_path: pathlib.Path, content: str) -> pathlib.Path:
+        spec_dir = tmp_path / "docs" / "specs"
+        spec_dir.mkdir(parents=True, exist_ok=True)
+        spec = spec_dir / "01_feature.md"
+        spec.write_text(content, encoding="utf-8")
+        return spec
+
+    def _mock_engine(self):
+        engine = mock.MagicMock()
+        engine.name = "mock-engine"
+        engine.execute_chunk.return_value = ChunkResult(
+            success=True, files_modified=[pathlib.Path("src/a.py")], message="done"
+        )
+        engine.verify_chunk.return_value = ChunkResult(success=True, message="passed")
+        engine.fix_chunk.return_value = ChunkResult(success=True, message="fixed")
+        return engine
+
+    def _mock_git(self):
+        git = mock.MagicMock()
+        git.assert_safe_branch = mock.MagicMock()
+        git.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
+        git.commit_chunk.return_value = mock.MagicMock(success=True, sha="abc1234", message="ok")
+        git.get_pr_commit_count.return_value = 0
+        git.get_pr_diff_loc.return_value = 0
+        git.ensure_draft_pr_exists.return_value = 42
+        git.revert_chunk_changes.return_value = True
+        git.repo_path = pathlib.Path("/tmp/repo")
+        return git
+
+    def test_merged_chunk_skipped(self, tmp_path: pathlib.Path) -> None:
+        """A chunk previously marked 'merged' in the ledger is not re-executed."""
+        import json
+
+        spec = self._make_spec(tmp_path, "# Feature\n\n## Phase 1\n\n- [ ] Task A\n- [ ] Task B\n")
+        engine = self._mock_engine()
+        git = self._mock_git()
+        orch = V2Orchestrator(tmp_path, git, engine)
+        ledger_path = orch._ledger_path(spec)
+        ledger_path.parent.mkdir(parents=True, exist_ok=True)
+        ledger_path.write_text(
+            json.dumps(
+                {
+                    "chunks": {
+                        "spec-01-chunk-01": {
+                            "status": "merged",
+                            "title": "Task A",
+                            "updated_at": "2026-05-04T00:00:00Z",
+                        }
+                    }
+                }
+            )
+        )
+
+        orch.run(specs=[spec], push_pr=False)
+        # Only chunk 02 should have been executed.
+        assert engine.execute_chunk.call_count == 1
+
+    def test_full_ledger_executes_no_chunks(self, tmp_path: pathlib.Path) -> None:
+        import json
+
+        spec = self._make_spec(tmp_path, "# Feature\n\n## Phase 1\n\n- [ ] Task A\n")
+        engine = self._mock_engine()
+        git = self._mock_git()
+        orch = V2Orchestrator(tmp_path, git, engine)
+        ledger_path = orch._ledger_path(spec)
+        ledger_path.parent.mkdir(parents=True, exist_ok=True)
+        ledger_path.write_text(
+            json.dumps(
+                {
+                    "chunks": {
+                        "spec-01-chunk-01": {"status": "merged", "title": "Task A"},
+                    }
+                }
+            )
+        )
+
+        orch.run(specs=[spec], push_pr=False)
+        engine.execute_chunk.assert_not_called()
+
+    def test_no_resume_ignores_ledger(self, tmp_path: pathlib.Path) -> None:
+        import json
+
+        spec = self._make_spec(tmp_path, "# Feature\n\n## Phase 1\n\n- [ ] Task A\n")
+        engine = self._mock_engine()
+        git = self._mock_git()
+        orch = V2Orchestrator(tmp_path, git, engine, no_resume=True)
+        ledger_path = orch._ledger_path(spec)
+        ledger_path.parent.mkdir(parents=True, exist_ok=True)
+        ledger_path.write_text(
+            json.dumps(
+                {
+                    "chunks": {
+                        "spec-01-chunk-01": {"status": "merged", "title": "Task A"},
+                    }
+                }
+            )
+        )
+
+        orch.run(specs=[spec], push_pr=False)
+        engine.execute_chunk.assert_called_once()
+
+    def test_successful_run_writes_merged_status(self, tmp_path: pathlib.Path) -> None:
+        import json
+
+        spec = self._make_spec(tmp_path, "# Feature\n\n## Phase 1\n\n- [ ] Task A\n")
+        engine = self._mock_engine()
+        git = self._mock_git()
+        orch = V2Orchestrator(tmp_path, git, engine)
+
+        orch.run(specs=[spec], push_pr=False)
+        ledger = json.loads(orch._ledger_path(spec).read_text())
+        merged = [entry for entry in ledger["chunks"].values() if entry["status"] == "merged"]
+        assert len(merged) == 1
+
+
+class TestEngineFallback:
+    """spec v30 Step 5: Claude rate-limit fails over to the next engine in the list."""
+
+    def _make_spec(self, tmp_path: pathlib.Path, content: str) -> pathlib.Path:
+        spec_dir = tmp_path / "docs" / "specs"
+        spec_dir.mkdir(parents=True, exist_ok=True)
+        spec = spec_dir / "01_feature.md"
+        spec.write_text(content, encoding="utf-8")
+        return spec
+
+    def _mock_git(self):
+        git = mock.MagicMock()
+        git.assert_safe_branch = mock.MagicMock()
+        git.push_to_origin.return_value = mock.MagicMock(success=True, error_type=None, message="")
+        git.commit_chunk.return_value = mock.MagicMock(success=True, sha="abc1234", message="ok")
+        git.get_pr_commit_count.return_value = 0
+        git.get_pr_diff_loc.return_value = 0
+        git.ensure_draft_pr_exists.return_value = 42
+        git.revert_chunk_changes.return_value = True
+        git.repo_path = pathlib.Path("/tmp/repo")
+        return git
+
+    def _engine(self, name: str, *, success: bool = True, rate_limit: bool = False) -> mock.MagicMock:
+        eng = mock.MagicMock()
+        eng.name = name
+        if rate_limit:
+            eng.execute_chunk.return_value = ChunkResult(success=False, files_modified=[], message="Rate limited")
+        else:
+            eng.execute_chunk.return_value = ChunkResult(
+                success=success,
+                files_modified=[pathlib.Path("src/a.py")] if success else [],
+                message="ok" if success else "err",
+            )
+        eng.verify_chunk.return_value = ChunkResult(success=True, message="passed")
+        eng.fix_chunk.return_value = ChunkResult(success=True, message="fixed")
+        return eng
+
+    def test_primary_rate_limit_fails_over(self, tmp_path: pathlib.Path) -> None:
+        spec = self._make_spec(tmp_path, "# Spec\n\n## P1\n\n- [ ] Task A\n")
+        primary = self._engine("claude", rate_limit=True)
+        secondary = self._engine("huggingface", success=True)
+        git = self._mock_git()
+
+        orch = V2Orchestrator(tmp_path, git, primary, engines=[primary, secondary])
+        result = orch.run(specs=[spec], push_pr=False)
+
+        assert primary.execute_chunk.call_count == 1
+        assert secondary.execute_chunk.call_count == 1
+        assert result.success is True
+
+    def test_both_rate_limit_aborts(self, tmp_path: pathlib.Path) -> None:
+        spec = self._make_spec(tmp_path, "# Spec\n\n## P1\n\n- [ ] Task A\n")
+        primary = self._engine("claude", rate_limit=True)
+        secondary = self._engine("huggingface", rate_limit=True)
+        git = self._mock_git()
+
+        orch = V2Orchestrator(tmp_path, git, primary, engines=[primary, secondary])
+        result = orch.run(specs=[spec], push_pr=False)
+
+        assert primary.execute_chunk.call_count == 1
+        assert secondary.execute_chunk.call_count == 1
+        assert result.success is False
+
+    def test_no_engines_arg_keeps_legacy_single_engine_behavior(self, tmp_path: pathlib.Path) -> None:
+        spec = self._make_spec(tmp_path, "# Spec\n\n## P1\n\n- [ ] Task A\n")
+        eng = self._engine("claude", success=True)
+        git = self._mock_git()
+
+        orch = V2Orchestrator(tmp_path, git, eng)  # no engines kwarg
+        orch.run(specs=[spec], push_pr=False)
+        assert eng.execute_chunk.call_count == 1
diff --git a/tests/test_verifier.py b/tests/test_verifier.py
index 089252ee..e9e1a1c6 100644
--- a/tests/test_verifier.py
+++ b/tests/test_verifier.py
@@ -1771,3 +1771,108 @@ def test_coverage_timeout_used_in_subprocess(self, tmp_path: pathlib.Path) -> No
         assert mock_run.called
         call_kwargs = mock_run.call_args
         assert call_kwargs.kwargs.get("timeout") == 42 or call_kwargs[1].get("timeout") == 42
+
+
+# ─────────────────────────────────────────────────────────────────────────
+# spec v29 Step 6: chunk-scoped verify_paths
+# ─────────────────────────────────────────────────────────────────────────
+
+
+class TestVerifyPaths:
+    def test_empty_paths_falls_back_to_full_verify(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.verifier import verify_paths
+
+        with patch("codelicious.verifier.verify") as mock_verify:
+            mock_verify.return_value = "fallback"
+            assert verify_paths(tmp_path, []) == "fallback"
+            mock_verify.assert_called_once_with(tmp_path)
+
+    def test_no_test_mapping_falls_back(self, tmp_path: pathlib.Path) -> None:
+        """Source file with no matching test_<module>.py falls back to full verify."""
+        from codelicious.verifier import verify_paths
+
+        (tmp_path / "src").mkdir()
+        src = tmp_path / "src" / "lonely.py"
+        src.write_text("x = 1\n")
+
+        with patch("codelicious.verifier.verify") as mock_verify:
+            mock_verify.return_value = "fallback"
+            assert verify_paths(tmp_path, [src]) == "fallback"
+            mock_verify.assert_called_once()
+
+    def test_mapped_test_file_resolved(self, tmp_path: pathlib.Path) -> None:
+        """src/<module>.py with a matching tests/test_<module>.py is mapped."""
+        from codelicious.verifier import _map_src_to_tests
+
+        (tmp_path / "src").mkdir()
+        (tmp_path / "tests").mkdir()
+        src = tmp_path / "src" / "math_utils.py"
+        src.write_text("def add(a, b):\n    return a + b\n")
+        test = tmp_path / "tests" / "test_math_utils.py"
+        test.write_text("def test_x():\n    assert True\n")
+
+        mapped = _map_src_to_tests(tmp_path, [src])
+        assert mapped == [test]
+
+    def test_map_skips_non_python_paths(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.verifier import _map_src_to_tests
+
+        (tmp_path / "tests").mkdir()
+        assert _map_src_to_tests(tmp_path, [pathlib.Path("README.md")]) == []
+
+
+# ─────────────────────────────────────────────────────────────────────────
+# spec v30 Step 8: coverage-floor enforcement
+# ─────────────────────────────────────────────────────────────────────────
+
+
+class TestCoverageGate:
+    def test_default_floor_is_90(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.verifier import resolve_min_coverage
+
+        assert resolve_min_coverage(tmp_path) == 90.0
+
+    def test_pyproject_value_used_when_present(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.verifier import resolve_min_coverage
+
+        (tmp_path / "pyproject.toml").write_text(
+            "[tool.codelicious]\nmin_coverage = 75\n",
+            encoding="utf-8",
+        )
+        assert resolve_min_coverage(tmp_path) == 75.0
+
+    def test_cli_override_wins(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.verifier import resolve_min_coverage
+
+        (tmp_path / "pyproject.toml").write_text(
+            "[tool.codelicious]\nmin_coverage = 75\n",
+            encoding="utf-8",
+        )
+        assert resolve_min_coverage(tmp_path, cli_override=99.0) == 99.0
+
+    def test_below_floor_demotes_to_failure(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.verifier import (
+            CheckResult,
+            VerificationResult,
+            _enforce_coverage_floor,
+        )
+
+        cov = CheckResult(name="coverage", passed=True, message="Coverage 71.5%", details="")
+        result = VerificationResult(checks=[cov])
+        _enforce_coverage_floor(result, 90.0)
+        assert cov.passed is False
+        assert "71.5%" in cov.message
+        assert "90.0%" in cov.message
+
+    def test_above_floor_unchanged(self, tmp_path: pathlib.Path) -> None:
+        from codelicious.verifier import (
+            CheckResult,
+            VerificationResult,
+            _enforce_coverage_floor,
+        )
+
+        cov = CheckResult(name="coverage", passed=True, message="Coverage 95%", details="")
+        result = VerificationResult(checks=[cov])
+        _enforce_coverage_floor(result, 90.0)
+        assert cov.passed is True
+        assert cov.message == "Coverage 95%"