post-merge bugfixes: 7 real issues found by deep review of PR #11 by clay-good · Pull Request #12 · clay-good/codelicious

clay-good · 2026-05-05T15:55:31Z

Summary

After PR #11 (v29 + v30 closure) merged, a deep reviewer pass surfaced seven real bugs that the test suite missed because either the affected code path wasn't exercised end-to-end or the tests mocked the broken helpers directly. All seven are fixed here.

Fixes

_branch_exists_locally / _branch_exists_remotely always returned False — they called getattr(result, "stdout", "") on a value that was already a string (_run_cmd returns stripped stdout, not a CompletedProcess). Disambiguation never triggered in production. Tests passed because they mocked the predicates.
_run_lock was misused as a non-context-manager in main(), so on any uncaught exception between __enter__ and the natural exit, the lock was never released. Re-wired through contextlib.ExitStack. _release is now also idempotent — the prior atexit + finally double-fire could close a reused fd.
Engine-fallover left self.engine stale when the rate-limit loop exited. The verify/fix path then ran on the wrong engine. Now self.engine is re-bound from self._engines[0] after every pop.
Postmortem markdown was vulnerable to fence injection. Log lines (which include LLM-controlled tee'd output) and ledger-supplied chunk titles were embedded raw. Backticks are now split with a zero-width joiner; titles are stripped of newlines/backticks.
enforce_token_budget used list.pop(0) (O(n)) and inserted at index 1 in a way that scrambled order under recursion. Replaced with collections.deque for O(1) front-ops and correct sub-chunk ordering.
AuditLogger's cross-process lock used blocking LOCK_EX, so a peer process doing a slow rotation could hang the orchestrator's main loop indefinitely. Now uses LOCK_EX | LOCK_NB with 3 × 10 ms retry and falls back to intra-process locking on timeout.
verify_paths ran pytest as bare python -m pytest, which picks up whatever python is first on PATH. It now uses sys.executable, matching check_tests.
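
To make the first fix concrete, here is a minimal sketch of why the branch predicates always returned False. It is not the project's code: `fake_run_cmd` is a hypothetical stand-in for `_run_cmd`, and the `git branch --list` arguments are an assumption about what the real helpers run.

```python
def fake_run_cmd(args: list[str]) -> str:
    """Hypothetical stand-in for _run_cmd: returns stripped stdout as a str."""
    # Pretend the (assumed) git command printed one matching branch name.
    return "feature/x"


def branch_exists_buggy(name: str) -> bool:
    result = fake_run_cmd(["git", "branch", "--list", name])
    # Bug: a str has no .stdout attribute, so getattr falls back to "".
    return bool(getattr(result, "stdout", "").strip())


def branch_exists_fixed(name: str) -> bool:
    # Fix: the helper already returned stdout, so test the string itself.
    return bool(fake_run_cmd(["git", "branch", "--list", name]).strip())
```

A test that patches the predicate itself never executes the `getattr` line, which is consistent with the suite staying green while production always saw False.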

Quality

pytest -q --no-cov → 1928 passed (no regressions, no new tests required for these fixes — they are corrections to plumbing the existing tests don't reach end-to-end)
ruff check / ruff format clean

Test plan

CI matrix green
Manual: trigger a continuation branch creation when an old branch already exists locally — confirm the disambiguator now appends a suffix instead of silently checking out the stale branch
Manual: run with both Claude and HF creds, force a Claude rate-limit mid-spec — confirm the verify step that follows uses the HF engine, not Claude
Manual: send SIGKILL during codelicious build, then codelicious build again — confirm .codelicious/run.lock is gone
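
For reviewers checking the run-lock change, the pattern described under Fixes can be sketched roughly as follows. This is an illustration under assumptions, not the actual implementation: `run_lock`, `release`, and the `released` list are hypothetical names, and the real code additionally registers `stack.close` with `atexit`.

```python
import contextlib

released = []  # records each real release so the sketch can be inspected


@contextlib.contextmanager
def run_lock():
    """Hypothetical stand-in for _run_lock: acquire, yield, release once."""
    done = False

    def release():
        nonlocal done
        if done:  # idempotent: a second call is a no-op
            return
        done = True
        released.append("released")

    try:
        yield release
    finally:
        release()


def main(fail: bool) -> None:
    # ExitStack guarantees the context manager's exit runs on every path.
    with contextlib.ExitStack() as stack:
        release = stack.enter_context(run_lock())
        if fail:
            raise RuntimeError("uncaught error inside main")
        release()  # explicit early release; the finally block will not repeat it
```

Calling `__enter__()` by hand, as the old `main()` did, skips the `finally` block entirely unless `__exit__` is also called; routing the manager through `ExitStack` removes that failure mode.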

🤖 Generated with Claude Code

Found via deep review of the v29/v30 merge. Seven real issues fixed:

1. **Branch-exists checks always returned False.** `_run_cmd` returns the stripped stdout *string*, but `_branch_exists_locally` / `_branch_exists_remotely` were calling `getattr(result, "stdout", "")` on that string, which always yields `""`. The disambiguation tests passed only because they mocked the predicates directly. Fixed both helpers.
2. **`_run_lock` was misused as a non-context-manager.** `main()` called `cm.__enter__()` without ever calling `__exit__`, so the generator's `try: yield ... finally: _release()` block never fired on the exception path. Re-wired through `contextlib.ExitStack` and registered `stack.close` with `atexit`. The `_release` helper is now also idempotent (guarded by a flag) so the atexit + finally double-fire can't close a reused fd.
3. **Engine-fallover stale reference.** The rate-limit `while` loop updated `self.engine` only inside the loop body, leaving the verify/fix path on the stale engine when the loop exited. Now `self.engine` is re-bound from `self._engines[0]` before each `execute_chunk` and after each pop.
4. **Postmortem markdown allowed log-fence injection.** Log lines and ledger-supplied chunk titles were embedded raw into the markdown body. Now backticks in the log tail are split with a zero-width joiner and chunk titles are stripped of newlines / backticks before rendering.
5. **`enforce_token_budget` was O(n²) and ordered sub-chunks wrong.** `list.pop(0)` is O(n); a recursion that splits 50 chunks could spend most of its time shifting the queue. Switched to `collections.deque` with `popleft` / `appendleft` and clarified the comment about dependency ordering.
6. **AuditLogger could hang on a peer process's slow rotation.** The cross-process flock used the blocking `LOCK_EX` mode. Replaced with a non-blocking acquire + bounded retry (3 attempts × 10 ms); on timeout we log a one-shot warning and proceed with intra-process locking only.
7. **`verify_paths` ran pytest as bare `python -m pytest`.** That picks up whatever `python` is first on PATH, potentially Python 2 or a different venv. Now uses `sys.executable`, matching the existing `check_tests` convention.

Quality:
- pytest 1928 passed
- ruff check / format clean
- No public API changes; no test-fixture rewrites needed

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
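
As a rough illustration of item 5, the sketch below splits oversized chunks with a `collections.deque` and keeps sub-chunks in their original order. It makes no claim to match `enforce_token_budget`: the function name `split_to_budget` is hypothetical, and character count stands in for token count as a simplifying assumption.

```python
from collections import deque


def split_to_budget(chunks: list[str], budget: int) -> list[str]:
    """Split any chunk longer than `budget` characters, preserving order."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    queue = deque(chunks)
    out: list[str] = []
    while queue:
        chunk = queue.popleft()  # O(1), unlike list.pop(0)
        if len(chunk) <= budget:
            out.append(chunk)
            continue
        mid = len(chunk) // 2
        # Push the second half first so the first half is processed next.
        queue.appendleft(chunk[mid:])
        queue.appendleft(chunk[:mid])
    return out
```

Pushing the halves back onto the front in reverse order means a split chunk's pieces are emitted before any later chunk, so concatenating the output reproduces the input.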

clay-good merged commit e487c3f into main May 5, 2026
2 of 6 checks passed

clay-good merged 1 commit into main from codelicious/post-merge-fixes
