spec v29 + v30: gap closure and operational resilience#11

Merged
clay-good merged 1 commit into main from codelicious/spec-v29-and-v30-closure
May 5, 2026

Conversation

@clay-good
Owner

Summary

  • Closes spec v29 (18 steps) — gap closure: legacy 4-phase Orchestrator removed, jitter on all backoff sites, chunk-scoped verifier, prompt-window splitting, HF reflection gate, etc.
  • Closes spec v30 (12 steps) — operational resilience: per-repo run lock (fcntl.flock), idempotent resume ledger, CLI-layer endpoint validation, atomic latest.log swap, token-budget-aware chunking, engine fallback on rate-limit, branch-name disambiguation, cross-process audit log, coverage gate, PR description metadata, postmortem on abort.
  • 35 files changed (+3844 / −2910). The big deletion is the legacy 1306-line tests/test_orchestrator.py; chunk-based coverage already lives in tests/test_v2_orchestrator.py + tests/test_full_workflow.py.
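
The prompt-window splitting mentioned above can be pictured with a small sketch. It uses the figures given in the commit body (5000-character windows, 500-character overlap, at most 10 windows). The function name `window_spec` and the behavior past the cap (remaining text is simply not windowed) are assumptions made for illustration, not the project's actual `chunk_spec_with_llm` code.

```python
def window_spec(
    text: str,
    size: int = 5000,
    overlap: int = 500,
    max_windows: int = 10,
) -> list[str]:
    """Split text into overlapping windows (hypothetical helper)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    step = size - overlap  # each window starts `step` chars after the previous one
    windows: list[str] = []
    for start in range(0, len(text), step):
        if len(windows) == max_windows:
            # Assumption: text past the cap is left out of this sketch.
            break
        windows.append(text[start:start + size])
        if start + size >= len(text):
            # This window already reaches the end of the text.
            break
    return windows
```

With these defaults, consecutive windows share their boundary 500 characters, so a task description that straddles a window edge appears whole in at least one window as long as it is shorter than the overlap.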

Scope notes

The two specs share files (cli.py, orchestrator.py, verifier.py, etc.), so this lands as a single PR rather than two stacked ones. See the commit body for the per-step breakdown.

Quality

  • pytest -q --no-cov → 1928 passed
  • ruff check src/ tests/ → clean
  • ruff format --check src/ tests/ → clean
  • bandit -r src/ → 0 medium, 0 high (low findings only)

Test plan

  • CI matrix is green on Python 3.10 / 3.11 / 3.12 / 3.13 / 3.14-dev
  • Manual: codelicious . against a sandbox repo with a multi-task spec, confirm PRs ≤ 250 LOC and split into part-2 / part-3 as expected (closes one of the two unchecked items in spec 28's acceptance criteria).
  • Manual: codelicious . --continuous runs to completion without intervention (closes the other spec 28 acceptance item).
  • Manual: send SIGTERM mid-build and confirm .codelicious/run.lock is removed and a postmortem-*.md is written.
  • Manual: with both ANTHROPIC_API_KEY (or Claude CLI auth) and HF_TOKEN set, force a Claude rate-limit and confirm the build continues on HuggingFace.

🤖 Generated with Claude Code

Closes spec-v29 (18 steps) and spec-v30 (12 steps).

## v29 — Gap Closure
1. Reconciled spec 28 with the 250 LOC default; introduced `_DEFAULT_PR_LOC_CAP`.
2. Removed the legacy 4-phase Orchestrator (BUILD/MERGE/REVIEW/FIX), shrinking
   `orchestrator.py` from 1499 → 501 LOC. `V2Orchestrator` renamed to
   `Orchestrator` with a back-compat alias.
3. Added jitter (`secrets.SystemRandom`) to all exponential-backoff sites in
   `llm_client` and `loop_controller`.
4. `LLMClient` now honors HTTP `Retry-After` (seconds or HTTP-date), capped
   at 120 s, with jitter still applied.
5. Claude rate-limit branch parses provider reset windows
   (`_parse_claude_reset_seconds`) and clamps to [10, 3600] s.
6. `verify_paths` runs ruff/bandit/pytest scoped to chunk-modified files; both
   engines call it when chunk metadata is present.
7. `chunk_spec_with_llm` now windows oversized specs (5000-char windows,
   500-char overlap, capped at 10) instead of silently truncating.
8. Engine ABC trimmed to `execute_chunk` / `verify_chunk` / `fix_chunk` —
   `run_build_cycle` and `BuildResult` removed.
9. Surfaced spec 27 §3.2 Claude CLI flags (`--allowedTools`, `--output-format
   stream-json`) at the chunk call-site via `_DEFAULT_ALLOWED_TOOLS`.
10. Per-chunk deadline gate (`>=`) before `engine.execute_chunk`.
11. HF reflection step now gates on verification with up to 2 fix-cycle
    attempts (`_HF_MAX_FIX_ATTEMPTS`).
12. New dedicated `tests/test_audit_logger.py` (15 cases).
13. Added `chunk_spec_with_llm` test coverage (10 cases total).
14. `_probe_git_credentials` distinguishes ssh-add exit codes (no_agent,
    empty, keys_loaded, unknown) and tailors prompt accordingly.
15. Sandbox emits a one-shot WARNING on platforms without `os.O_NOFOLLOW`.
16. Confirmed CI matrix matches `pyproject.toml` Python classifiers.
17. `prompts.py` docstring documents the live template set.
18. Single-source-of-truth `_FORBIDDEN_PATTERNS_DOC` tuple in scaffolder.

## v30 — Operational Resilience & Idempotency
1. Per-repo advisory lockfile via `_run_lock` (`fcntl.LOCK_EX | LOCK_NB`);
   second concurrent invocation exits 75 (EX_TEMPFAIL).
2. Persistent chunk-status ledger at `.codelicious/state/<spec>.json` — runs
   resume by skipping already-merged chunks; new `--no-resume` and
   `--reset-ledger` CLI flags.
3. CLI-layer endpoint validation (`_validate_endpoint_url_strict`) rejects
   non-HTTPS and credentials-in-URL before banner.
4. SIGTERM integration test spawns a real Python child holding the run-lock,
   asserts exit 143 within 8 s and lockfile cleanup.
5. Engine fallback list — Claude rate-limit fails over to HuggingFace for
   the remainder of the run when both credentials are configured.
6. Token-budget-aware chunk sizing — `enforce_token_budget` recursively
   halves over-budget chunks up to depth 3, preserving file coverage.
7. Atomic `latest.log` symlink update via `_atomic_symlink_update` (tmp +
   `os.replace`); Windows fallback writes `<link>.txt`.
8. Coverage-floor enforcement (`resolve_min_coverage`,
   `_enforce_coverage_floor`); CLI `--min-coverage`,
   `[tool.codelicious].min_coverage`, default 90.
9. Enriched PR descriptions with Chunk Context, Verifier Summary, and
   Audit Log sections via `chunk_metadata` arg to `_build_pr_body`.
10. Branch-name disambiguation (`_disambiguate_branch`) probes local + remote
    and appends a hint or unix timestamp on collision.
11. Cross-process `fcntl.flock` on `.audit.lock` keeps audit lines from
    interleaving across `codelicious` processes.
12. `_write_postmortem` aggregates ledger counts + log tail + resume hint
    on abnormal exit, written to `.codelicious/postmortem-<ts>.md`.
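
The per-repo run lock from item 1 can be sketched roughly like this. It is an illustration under assumptions rather than the real `_run_lock`: the `run_lock` context manager, the `RunLockHeld` exception, and `main` are hypothetical names. The `.codelicious/run.lock` path, the `fcntl.LOCK_EX | LOCK_NB` flags, and exit code 75 are taken from the PR text. Like the original, it is Unix-only because it relies on `fcntl`.

```python
import fcntl
import os
from contextlib import contextmanager
from pathlib import Path

EX_TEMPFAIL = 75  # exit code named in item 1


class RunLockHeld(RuntimeError):
    """Raised when another process already holds the run lock (hypothetical)."""


@contextmanager
def run_lock(repo_root: Path):
    lock_path = repo_root / ".codelicious" / "run.lock"
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Non-blocking exclusive lock: fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError as exc:
        os.close(fd)
        raise RunLockHeld(str(lock_path)) from exc
    try:
        yield lock_path
    finally:
        # Remove the file while still holding the lock, then release.
        # Note: unlinking a flock'd file leaves a small race window in which
        # a waiting process may have opened the old inode.
        lock_path.unlink(missing_ok=True)
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def main() -> int:
    try:
        with run_lock(Path.cwd()):
            pass  # the build would run here
    except RunLockHeld:
        return EX_TEMPFAIL
    return 0
```

Because `flock` locks belong to the open file description, the kernel drops the lock if the process dies without running the `finally` block; only the leftover file would then need cleanup on the next run.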

## Quality

- `pytest` 1928 passed (up from 1901: a net gain of 27 after the ~70-test
  removal that came with deleting `tests/test_orchestrator.py`).
- `ruff check src/ tests/` clean.
- `ruff format src/ tests/` clean.
- `bandit -r src/` 0 medium / 0 high (low findings only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@clay-good clay-good merged commit 4c1e6a0 into main May 5, 2026
2 of 6 checks passed
clay-good added a commit that referenced this pull request May 5, 2026
post-merge bugfixes: 7 real issues found by deep review of PR #11