spec v29 + v30: gap closure and operational resilience by clay-good · Pull Request #11 · clay-good/codelicious

clay-good · 2026-05-05T14:12:03Z

Summary

Closes spec v29 (18 steps) — gap closure: legacy 4-phase Orchestrator removed, jitter on all backoff sites, chunk-scoped verifier, prompt-window splitting, HF reflection gate, etc.
Closes spec v30 (12 steps) — operational resilience: per-repo run lock (fcntl.flock), idempotent resume ledger, CLI-layer endpoint validation, atomic latest.log swap, token-budget-aware chunking, engine fallback on rate-limit, branch-name disambiguation, cross-process audit log, coverage gate, PR description metadata, postmortem on abort.
35 files changed (+3844 / −2910). The big deletion is the legacy 1306-line tests/test_orchestrator.py; chunk-based coverage already lives in tests/test_v2_orchestrator.py + tests/test_full_workflow.py.
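
The per-repo run lock named in the v30 summary can be pictured with a minimal sketch. It assumes a Unix host (`fcntl` is not available on Windows); `run_lock`, `RunLockHeld`, and `main` are hypothetical names used for illustration, not the PR's actual `_run_lock` implementation. Only the `.codelicious/run.lock` path, the `fcntl.LOCK_EX | LOCK_NB` flags, and exit code 75 come from the PR text.

```python
import fcntl
import os
import sys
from contextlib import contextmanager

EX_TEMPFAIL = 75  # sysexits.h "temporary failure"; the exit code the PR cites


class RunLockHeld(RuntimeError):
    """Hypothetical error: another process already holds the run lock."""


@contextmanager
def run_lock(lock_path):
    """Hold an exclusive, non-blocking advisory lock while the block runs."""
    os.makedirs(os.path.dirname(lock_path) or ".", exist_ok=True)
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            raise RunLockHeld(lock_path) from exc
        try:
            yield fd
        finally:
            # Remove the lockfile while still holding the lock, then release.
            try:
                os.unlink(lock_path)
            except FileNotFoundError:
                pass
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def main():
    """Illustrative entry point: a second concurrent run exits with 75."""
    try:
        with run_lock(os.path.join(".codelicious", "run.lock")):
            pass  # the build would run here
    except RunLockHeld:
        sys.exit(EX_TEMPFAIL)
    return 0
```

Because `flock` locks belong to the open file description, the kernel releases them when the process dies, so a crashed run cannot leave the lock held even if the file itself is left behind.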

Scope notes

The two specs share files (cli.py, orchestrator.py, verifier.py, etc.), so this lands as a single PR rather than two stacked ones. See the commit body for the per-step breakdown.

Quality

pytest -q --no-cov → 1928 passed
ruff check src/ tests/ → clean
ruff format --check src/ tests/ → clean
bandit -r src/ → 0 medium, 0 high (low findings only)

Test plan

CI matrix is green on Python 3.10 / 3.11 / 3.12 / 3.13 / 3.14-dev
Manual: run codelicious . against a sandbox repo with a multi-task spec and confirm that PRs stay ≤ 250 LOC and split into part-2 / part-3 as expected (closes one of the two unchecked items in spec 28's acceptance criteria).
Manual: codelicious . --continuous runs to completion without intervention (closes the other spec 28 acceptance item).
Manual: send SIGTERM mid-build and confirm .codelicious/run.lock is removed and a postmortem-*.md is written.
Manual: with both ANTHROPIC_API_KEY (or Claude CLI auth) and HF_TOKEN set, force a Claude rate-limit and confirm the build continues on HuggingFace.
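
The rate-limit failover exercised by the last manual check could look roughly like the sketch below. It is an illustration under assumptions, not the PR's code: `RateLimited` and `FallbackRunner` are hypothetical names, and engines are modelled as plain callables. The one behaviour taken from the PR is that the switch is sticky, so the fallback engine handles the remainder of the run.

```python
class RateLimited(Exception):
    """Hypothetical signal that an engine hit a provider rate limit."""


class FallbackRunner:
    """Run chunks on the first engine; fail over permanently on rate limit."""

    def __init__(self, engines):
        # engines: ordered list of (name, callable) pairs, preferred first.
        if not engines:
            raise ValueError("at least one engine is required")
        self._engines = list(engines)
        self._active = 0

    @property
    def active_name(self):
        return self._engines[self._active][0]

    def execute(self, chunk):
        while True:
            _name, run = self._engines[self._active]
            try:
                return run(chunk)
            except RateLimited:
                if self._active + 1 >= len(self._engines):
                    raise  # nothing left to fall back to
                self._active += 1  # sticky: never switch back this run
```

With `[("claude", run_claude), ("huggingface", run_hf)]` as the engine list, a rate limit on any chunk moves that chunk and every later one to the second engine.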

🤖 Generated with Claude Code

Closes spec-v29 (18 steps) and spec-v30 (12 steps).

## v29: Gap Closure

1. Reconciled spec 28 with the 250 LOC default; introduced `_DEFAULT_PR_LOC_CAP`.
2. Removed the legacy 4-phase Orchestrator (BUILD/MERGE/REVIEW/FIX), shrinking `orchestrator.py` from 1499 → 501 LOC. `V2Orchestrator` renamed to `Orchestrator` with a back-compat alias.
3. Added jitter (`secrets.SystemRandom`) to all exponential-backoff sites in `llm_client` and `loop_controller`.
4. `LLMClient` now honors HTTP `Retry-After` (seconds or HTTP-date), capped at 120 s, with jitter still applied.
5. Claude rate-limit branch parses provider reset windows (`_parse_claude_reset_seconds`) and clamps to [10, 3600] s.
6. `verify_paths` runs ruff/bandit/pytest scoped to chunk-modified files; both engines call it when chunk metadata is present.
7. `chunk_spec_with_llm` now windows oversized specs (5000-char windows, 500-char overlap, capped at 10) instead of silently truncating.
8. Engine ABC trimmed to `execute_chunk` / `verify_chunk` / `fix_chunk`; `run_build_cycle` and `BuildResult` removed.
9. Surfaced spec 27 §3.2 Claude CLI flags (`--allowedTools`, `--output-format stream-json`) at the chunk call-site via `_DEFAULT_ALLOWED_TOOLS`.
10. Per-chunk deadline gate (`>=`) before `engine.execute_chunk`.
11. HF reflection step now gates on verification with up to 2 fix-cycle attempts (`_HF_MAX_FIX_ATTEMPTS`).
12. New dedicated `tests/test_audit_logger.py` (15 cases).
13. Added `chunk_spec_with_llm` test coverage (10 cases total).
14. `_probe_git_credentials` distinguishes ssh-add exit codes (no_agent, empty, keys_loaded, unknown) and tailors prompt accordingly.
15. Sandbox emits a one-shot WARNING on platforms without `os.O_NOFOLLOW`.
16. Confirmed CI matrix matches `pyproject.toml` Python classifiers.
17. `prompts.py` docstring documents the live template set.
18. Single-source-of-truth `_FORBIDDEN_PATTERNS_DOC` tuple in scaffolder.

## v30: Operational Resilience & Idempotency

1. Per-repo advisory lockfile via `_run_lock` (`fcntl.LOCK_EX | LOCK_NB`); second concurrent invocation exits 75 (EX_TEMPFAIL).
2. Persistent chunk-status ledger at `.codelicious/state/<spec>.json`: runs resume by skipping already-merged chunks; new `--no-resume` and `--reset-ledger` CLI flags.
3. CLI-layer endpoint validation (`_validate_endpoint_url_strict`) rejects non-HTTPS and credentials-in-URL before banner.
4. SIGTERM integration test spawns a real Python child holding the run-lock, asserts exit 143 within 8 s and lockfile cleanup.
5. Engine fallback list: Claude rate-limit fails over to HuggingFace for the remainder of the run when both credentials are configured.
6. Token-budget-aware chunk sizing: `enforce_token_budget` recursively halves over-budget chunks up to depth 3, preserving file coverage.
7. Atomic `latest.log` symlink update via `_atomic_symlink_update` (tmp + `os.replace`); Windows fallback writes `<link>.txt`.
8. Coverage-floor enforcement (`resolve_min_coverage`, `_enforce_coverage_floor`); CLI `--min-coverage`, `[tool.codelicious].min_coverage`, default 90.
9. Enriched PR descriptions with Chunk Context, Verifier Summary, and Audit Log sections via `chunk_metadata` arg to `_build_pr_body`.
10. Branch-name disambiguation (`_disambiguate_branch`) probes local + remote and appends a hint or unix timestamp on collision.
11. Cross-process `fcntl.flock` on `.audit.lock` keeps audit lines from interleaving across `codelicious` processes.
12. `_write_postmortem` aggregates ledger counts + log tail + resume hint on abnormal exit, written to `.codelicious/postmortem-<ts>.md`.

## Quality

- `pytest` 1928 passed (was 1901 before, 27 net new, accounting for the ~70-test removal that came with deleting `tests/test_orchestrator.py`).
- `ruff check src/ tests/` clean.
- `ruff format src/ tests/` clean.
- `bandit -r src/` 0 medium / 0 high (low findings only).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
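
Items 3 and 4 of the v29 list (jittered backoff that honors `Retry-After`) can be sketched as follows. This is a minimal illustration, not the `LLMClient` code: `parse_retry_after` and `backoff_delay` are hypothetical helpers, and the additive jitter of up to 25% is an assumption, since the commit body does not say how jitter is shaped. The 120 s cap and the use of `secrets.SystemRandom` are taken from the commit body.

```python
import secrets
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

_RNG = secrets.SystemRandom()  # same RNG source the commit body names
RETRY_AFTER_CAP_S = 120.0      # cap stated in v29 item 4


def parse_retry_after(value, now=None):
    """Return a delay in seconds from a Retry-After header, or None."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        # delta-seconds form, e.g. "30"
        return min(float(value), RETRY_AFTER_CAP_S)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return min(max((when - now).total_seconds(), 0.0), RETRY_AFTER_CAP_S)


def backoff_delay(attempt, base=1.0, cap=60.0, retry_after=None):
    """Exponential backoff (or the server's hint) plus additive jitter.

    The 25% jitter fraction is an illustrative assumption.
    """
    if retry_after is not None:
        floor = retry_after
    else:
        floor = min(cap, base * 2 ** attempt)
    return floor + _RNG.uniform(0.0, floor * 0.25)
```

Adding jitter on top of the server's hint, rather than replacing it, keeps the client from retrying earlier than the provider asked while still spreading out simultaneous retries.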

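
The spec windowing described in v29 item 7 of the commit body can be sketched in a few lines. `window_spec` is a hypothetical stand-in for whatever `chunk_spec_with_llm` does internally; the defaults (5000-character windows, 500-character overlap, at most 10 windows) come from the commit body, while stopping silently once the cap is reached is an assumption of this sketch.

```python
def window_spec(text, window=5000, overlap=500, max_windows=10):
    """Split text into overlapping character windows, capped in count."""
    if window <= overlap:
        raise ValueError("window must be larger than overlap")
    if len(text) <= window:
        return [text]  # small specs pass through untouched
    step = window - overlap  # each window starts `step` chars after the last
    windows = []
    start = 0
    while start < len(text) and len(windows) < max_windows:
        windows.append(text[start:start + window])
        if start + window >= len(text):
            break  # this window reached the end of the text
        start += step
    return windows
```

With the defaults, a 12,000-character spec yields three windows starting at offsets 0, 4500, and 9000, and each window repeats the last 500 characters of the previous one so that a requirement straddling a boundary appears whole in at least one window.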
clay-good merged commit 4c1e6a0 into main May 5, 2026
2 of 6 checks passed

clay-good mentioned this pull request May 5, 2026

post-merge bugfixes: 7 real issues found by deep review of PR #11 #12

Merged

4 tasks

clay-good added a commit that referenced this pull request May 5, 2026

Merge pull request #12 from clay-good/codelicious/post-merge-fixes

e487c3f

post-merge bugfixes: 7 real issues found by deep review of PR #11

clay-good merged 1 commit into main from codelicious/spec-v29-and-v30-closure
