xsovad06 · xsovad06 · Jul 3, 2026 · Jul 2, 2026 · Jul 2, 2026 · Jul 2, 2026
diff --git a/.claude/commands/ingest-review.md b/.claude/commands/ingest-review.md
@@ -11,33 +11,70 @@ outputs:
 
 # Ingest PR Review Feedback
 
-Process review comments from a merged PR and update agent memory.
+Parse structured review findings from TaskRun records and update agent memory.
 
 ## Instructions
 
 1. Get the PR number from `$ARGUMENTS`. If empty, ask the user.
 
-2. Fetch PR data:
+2. Query the database for reviewer TaskRun records linked to this PR:
    ```bash
-   gh pr view <PR_NUMBER> --json comments,reviews,body,title
-   gh api repos/<OWNER>/<REPO>/pulls/<PR_NUMBER>/comments
+   # Find the reviewer run's handoff data
+   python3 -c "
+   import asyncio, json, os
+   os.environ.setdefault('SOVA_DATABASE_URL', 'sqlite+aiosqlite://.claude/sova.db')
+   from sova.db.session import init_db, get_session
+   from sova.db.models import TaskRun
+   from sqlalchemy import select
+
+   async def main():
+       await init_db(run_migrations=False)
+       async with get_session() as session:
+           stmt = select(TaskRun).where(
+               TaskRun.pr_number == <PR_NUMBER>,
+               TaskRun.role.in_(['reviewer', 'command:review-pr']),
+               TaskRun.status == 'done',
+           ).order_by(TaskRun.id.desc()).limit(1)
+           run = (await session.execute(stmt)).scalar_one_or_none()
+           if not run or not run.handoff_json:
+               print('NO_FINDINGS')
+               return
+           print(json.dumps(run.handoff_json, indent=2))
+
+   asyncio.run(main())
+   "
+   ```
+
+3. If the output is `NO_FINDINGS`, report "No reviewer findings found for PR #N" and stop.
+
+4. Parse the `pending_findings` array from the handoff JSON. Each finding has:
+   - `file`: file path
+   - `line`: line number
+   - `severity`: 1-10 score
+   - `category`: type of issue (bug, style, performance, etc.)
+   - `description`: what the issue is
+   - `suggestion`: how to fix it
+
+5. Also fetch external review comments (CodeRabbit, human reviewers):
+   ```bash
+   gh pr view <PR_NUMBER> --json reviews,comments --jq '.reviews[] | {author: .author.login, state: .state, body: .body}'
    ```
 
-3. Analyze review comments and extract lessons:
-   - **Patterns to always follow** -- things reviewers praised or requested
-   - **Mistakes to avoid** -- bugs caught, missing edge cases, style violations
-   - **Style preferences** -- formatting, naming, structural preferences
-   - **Test coverage gaps** -- missing assertions, untested scenarios
+6. Classify findings into memory categories:
+   - **Severity >= 7**: likely a "common_mistake" -- check `.claude/agent-memory/cookbook.md` for existing entries
+   - **Severity 4-6 with "style" or "naming" category**: "style preference"
+   - **Repeated patterns across findings**: "review pattern" worth codifying
+   - **Test-related findings**: "test coverage gap"
 
-4. Read existing memory file:
+7. Read existing memory file:
    - `.claude/agent-memory/cookbook.md`
 
-5. Update `.claude/agent-memory/cookbook.md`:
+8. Update `.claude/agent-memory/cookbook.md`:
    - Append new findings under the matching domain section (no duplicates)
-   - If a mistake has appeared before, add it to the "Common Mistakes" section with `[Nx]` count
-   - If a finding is high-impact, add it to `MEMORY.md`
+   - If a mistake already exists, increment its `[Nx]` count
+   - If a finding is high-impact (severity >= 8), also add it to `.claude/agent-memory/MEMORY.md`
 
-6. Report what was learned and which files were updated.
+9. Report what was learned and which files were updated.
 
 ## Cross-References
 
@@ -47,4 +84,5 @@ Process review comments from a merged PR and update agent memory.
 ## Rules
 
 - Only record actionable, specific lessons -- not generic advice
+- Parse structured data from DB records; do NOT use LLM summarization
 - NEVER use emojis in any output
diff --git a/.claude/rules/architecture.md b/.claude/rules/architecture.md
@@ -14,7 +14,7 @@ SOVA has four main components:
 - `core/state.py` -- 17-state TaskStatus StrEnum with transition validation
 - `core/context.py` -- ExecutionContext dataclass threading state through steps
 - `core/output.py` -- OutputWriter for per-run DB-backed output persistence, read_lines, retention cleanup
-- `core/steps/` -- 29 BaseStep implementations with execute/validate_output/can_skip. Four pipeline variants:
+- `core/steps/` -- 26 BaseStep implementations with execute/validate_output/can_skip. Four pipeline variants:
   - **Developer pipeline** (16 steps): sync -> assess -> create_worktree -> capture_baseline -> develop -> simplify -> self_review -> commit -> validate -> push -> create_pr -> wait_for_external_reviews -> address_external_findings -> monitor_ci -> extract_memory -> handoff_to_reviewer
   - **Address-review pipeline** (9 steps): rebase -> address_review -> commit -> validate -> push -> monitor_ci -> resolve_external_reviews -> extract_memory -> handoff_to_user
   - **Researcher pipeline** (4 steps): fetch_task -> research -> spec -> extract_memory
@@ -115,12 +115,12 @@ The project's full name is **SOVA** (Software Orchestration Via Agents).
 - **Auto-handoff must clear handoff file before spawning next agent**: `_process_auto_handoff()` in `agent_handoff.py` calls `handoff_service.clear_handoff()` before spawning the next agent via `start_agent()` or `start_command()`. Without this, the newly spawned agent or a concurrent dashboard poll may re-process the same handoff, leading to duplicate agent spawns. The clear must happen synchronously before the spawn call, not after.
 - **Seed cross-agent data before clearing the handoff file**: when spawning an agent that needs data from the previous agent's handoff (e.g., review findings for address-review), extract the data before `clear_handoff()` and write it to the new TaskRun's `handoff_json` via `start_agent(handoff_findings=...)`. Without this, the address-review step can't find findings when: (1) the handoff file was cleared, (2) `resume_run_id` points to a non-reviewer run, and (3) the issue-level DB fallback fails. File: `sova/dashboard/services/agent_db.py:_seed_handoff_findings()`.
 - **Adapter ABC contract**: `TaskAdapter` in `sova/adapters/base.py` defines 18 abstract methods: `list_tasks`, `get_task`, `get_state`, `transition_state`, `assign`, `add_label`, `remove_label`, `_do_post_comment`, `_do_post_pr_comment`, `_do_post_pr_review`, `_do_edit_body`, `link_pr`, `get_pr_reviews`, `_do_create_issue`, `get_available_transitions`, `list_milestones`, `create_milestone`, `set_milestone`, plus `github_user` field (constructor param). Public posting methods (`post_comment`, `post_pr_comment`, `post_pr_review`, `edit_body`, `create_issue`) are concrete Template Methods that run the egress filter before delegating to `_do_*` implementations. When adding new agent capabilities that interact with the tracker, add the method to the ABC first, then implement in `GitHubAdapter`. Factory: `create_adapter(type, repo, github_user, project_number)`.
-- **LLM for user-facing outputs with structured fallback**: workflow steps producing user-facing content (PR descriptions, review comments) use a focused LLM call (Sonnet, ~$0.01) with structured context (commit log, diff stats, issue body). Fallback MUST preserve available data -- build a structured body from already-fetched data rather than discarding to a bare stub. Pattern: `_generate_pr_body()` + `_build_fallback_body()` in `create_pr.py`, `_run_review()` in `reviewer.py`.
+- **Deterministic PR body from structured data**: `CreatePRStep._build_pr_body()` builds PR descriptions deterministically from commits, diff stats, issue body excerpt (truncated to 500 chars), and Closes link -- no LLM call. The reviewer role's `_run_review()` is the only user-facing LLM call in the pipeline (structured output at the right abstraction level).
 - **Rebase with LLM conflict resolution**: `rebase_with_conflict_resolution()` in `sova/git/rebase.py` rebases onto the latest base branch and uses the LLM to resolve merge conflicts. Loop: detect conflicted files, invoke LLM to fix markers + `git add`, `git rebase --continue`. Aborts on LLM failure or after `max_attempts` (default 3) so the worktree is never left in a broken rebase state. `RebaseStep` in the address-review pipeline runs this before `AddressReviewStep`. `PushStep` uses `--force-with-lease` when `ctx.pr_number` is set (post-rebase history rewrite).
 - **Pydantic request models must be at module scope in `app.py`**: `from __future__ import annotations` makes all type annotations lazy strings. FastAPI resolves annotation strings by searching the module's global namespace. Pydantic `BaseModel` subclasses defined inside nested functions (e.g., `_setup_multi_project()`) are not in module globals, so FastAPI falls back to treating the parameter as a query param, producing 422 errors. Always define request/response models at module level. File: `sova/dashboard/app.py`.
 - **Thin re-export wrappers during module splits**: when splitting a large module into submodules (e.g., `control_service.py` -> `agent_lifecycle.py` + `agent_output.py` + `agent_recovery.py`), convert the original module to a thin re-export facade (`from agent_lifecycle import X`) rather than deleting it. This preserves backward compatibility for all existing imports in routers, tests, and commands. Delete the old module only after all imports are migrated.
 - **Never non-visible overflow on containers with popout children**: the sidebar `<nav>` and its scrollable child div contain absolutely-positioned children (notification panel, CSS hover tooltips) that extend beyond their boundary at `left: 100%`. ANY non-visible overflow value (`hidden`, `auto`, `scroll`) on ANY ancestor creates a clipping boundary for positioned descendants. CSS spec: when one axis is non-visible, the other is computed to `auto` even if explicitly set to `visible` -- so `overflow-y: auto; overflow-x: visible` does NOT work (x becomes `auto` too). Use `max-width` and opacity transitions on individual child elements instead of overflow on the container. This applies to any fixed-position sidebar or panel that contains popout menus or tooltips.
-- **Automatic memory extraction**: `ExtractMemoryStep` runs before every handoff step in both pipelines. `ReviewerRole.execute()` calls `extract_memories()` directly (reviewer doesn't use WorkflowEngine). A single Haiku LLM call (~$0.005-0.01) extracts 0-5 reusable learnings from run context (role, task, files changed, step summaries, review findings). Results are stored to the Memory DB table via `memory.store()` with deduplication (title similarity check against existing memories in same category). Confirmation counters (`[confirmed: N]` in content field) track reuse; memories auto-promote to "shared" tier at N=3. Extraction is fully non-fatal: failures are logged but never block the pipeline. Module: `sova/knowledge/extraction.py`.
+- **Memory extraction is a no-op**: `ExtractMemoryStep` still runs before every handoff step in both pipelines (step slot preserved for future rule-based extraction), but `extract_memories()` returns immediately without calling the LLM. `ReviewerRole._extract_review_memories()` is also a no-op. Use the human-reviewed `/extract-knowledge` command for knowledge capture. Infrastructure (`_build_extraction_prompt`, `_parse_extraction_response`, `_deduplicate_and_store`) is retained in `sova/knowledge/extraction.py` for future use.
 - **Issue Lifecycle Control**: `IssueLifecycle` is the "spine" connecting all `TaskRun` records for a single issue into a unified journey with 6 phases (`LifecyclePhase` enum: development, post_pr, review, address_review, integrate, post_merge). `LifecyclePhaseRecord` tracks each phase execution (status, cost, attempt counter, linked TaskRun). Phase transitions are advisory (warnings, not strict enforcement) -- matches `--force` philosophy. The `post_pr` phase is passive (no TaskRun; inferred from PR creation). `lifecycle_service.py` provides CRUD, phase transitions, and backward-compatible reconstruction from pre-existing TaskRuns via `build_lifecycle_view()`. Integration hooks in `agent_lifecycle.py` (`_link_run_to_lifecycle`, `_finalize_lifecycle_phase`) are non-fatal side effects. Dashboard UI at `/lifecycle/{issue_number}` shows a phase rail with live polling.
 - **Doc counts drift after refactors**: step count, service count, adapter method count, CLI subcommand list, and test count go stale after module splits, service extractions, or ABC changes. After any structural refactor, run actual counts (`find`, `grep -c`, `wc -l`) and update AGENTS.md, architecture.md, and CLAUDE.md. The `/review` command checks doc freshness automatically.
 - **Stale references persist after file/feature renames**: when a file is deleted, renamed, or a concept changes (e.g., "PAK" -> "SOVA"), references survive in commands, rules, issue bodies, and vision docs. After any rename or deletion, run `grep -rn "OLD_NAME" . --include="*.md"` to find stale references across all markdown files. Also check GitHub Issues and VISION.md for outdated naming.

diff --git a/AGENTS.md b/AGENTS.md
@@ -60,7 +60,7 @@ sova/
     KNOWLEDGE.md                   # 4-tier knowledge management system
   templates/                       # Project scaffolding templates
   deploy/                          # systemd + launchd service files
-  tests/                           # pytest suite (2800+ tests)
+  tests/                           # pytest suite (2900+ tests)
   docs/
     VISION.md                      # Product vision and roadmap
     ARCHITECTURE.md                # Architecture overview (points to .claude/rules/)

diff --git a/docs/error-handling-guidelines.md b/docs/error-handling-guidelines.md
@@ -180,13 +180,15 @@ Status updates are conditional (once terminal, never overwrite); cost updates ar
 
 When LLM calls fail for user-facing outputs, always provide a structured fallback from available data. **Never discard to a bare stub.**
 
+PR body generation uses a deterministic template (`_build_pr_body()`) -- no LLM call. For steps that still use conditional LLM calls (validate, monitor_ci, rebase), the pattern is:
+
 ```python
 try:
-    result = await invoke(prompt, model="sonnet", cwd=ctx.working_dir, timeout=120)
+    result = await invoke(prompt, model=model, cwd=ctx.working_dir, timeout=120)
     return result.text
 except RuntimeError:
-    log.warning("step.create_pr.body_generation_failed", fallback="structured")
-    return self._build_fallback_body(ctx, task_title, commit_log, diff_stat)
+    log.warning("step.operation_failed", fallback="structured")
+    return build_structured_fallback(available_data)
 ```
 
 ### JSON Parsing with Substring Extraction

diff --git a/docs/pipeline-determinism.html b/docs/pipeline-determinism.html
@@ -782,10 +782,10 @@ <h2>Determinism Opportunities</h2>
       </tr>
       <tr>
         <td>PR body generation</td>
-        <td><span class="step-tag tag-llm" style="display:inline">llm</span></td>
+        <td><span class="step-tag tag-code" style="display:inline">code</span></td>
         <td class="arrow">--></td>
         <td><span class="step-tag tag-code" style="display:inline">code</span></td>
-        <td>Structured template from commits + diff stats + issue body. Fallback already exists in <code>_build_fallback_body()</code>; promote it to primary.</td>
+        <td>Deterministic template via <code>_build_pr_body()</code>: commits + diff stats + issue body excerpt (500 char limit). Done.</td>
       </tr>
       <tr>
         <td>Commit message (step 7)</td>

diff --git a/sova/core/steps/create_pr.py b/sova/core/steps/create_pr.py
@@ -1,4 +1,4 @@
-"""Step 9: Create PR -- generate a rich description and open a pull request."""
+"""Step 9: Create PR -- generate a structured description and open a pull request."""
 
 from __future__ import annotations
 
@@ -9,61 +9,22 @@
 from sova.core.context import ExecutionContext
 from sova.core.steps.base import BaseStep, GateCheckResult, StepResult
 from sova.git import operations as git_ops
-from sova.llm.client import invoke
+from sova.utils.formatting import truncate
 from sova.utils.logging import get_logger
 from sova.utils.shell import run
 
 log = get_logger(component="step.create_pr")
 
 _PLACEHOLDER = "(none)"
 
+_ISSUE_BODY_EXCERPT_LIMIT = 500
+
 _CONVENTIONAL_RE = re.compile(
     r"^(feat|fix|refactor|test|docs|chore|ci)"
     r"(?:\([^)]*\))?"
     r":\s*",
 )
 
-_PR_BODY_PROMPT_BASE = """\
-Generate a pull request description for the changes below. Output ONLY the \
-markdown body (no fences, no commentary). Use this structure:
-
-## Summary
-1-3 bullet points: WHAT changed and WHY.
-
-## Changes
-Brief description of each logical change grouped by area.
-
-## Review guidance
-What should a reviewer focus on? Any trade-offs or shortcuts?
-
-## Test plan
-How were these changes verified?
-"""
-
-_PR_BODY_ISSUE_SECTION = """
-Closes #{issue_number}
-
----
-Issue #{issue_number}: {issue_title}
-
-{issue_body}
-
-"""
-
-_PR_BODY_NO_ISSUE_SECTION = """
----
-Task: {issue_title}
-
-"""
-
-_PR_BODY_CONTEXT = """\
-Commits on this branch:
-{commit_log}
-
-Files changed:
-{diff_stat}
-"""
-
 
 def _build_pr_title(task_title: str, issue_number: str | None) -> str:
     """Build a PR title from the task title, avoiding double conventional prefixes.
@@ -169,41 +130,12 @@ async def _generate_pr_body(self, ctx: ExecutionContext, task_title: str) -> str
             ),
         )
 
-        issue_body = ctx.task.body if ctx.task else ""
         commit_log = log_result.stdout.strip() if log_result.success else "(unavailable)"
         diff_stat = diff_result.stdout.strip() if diff_result.success else "(unavailable)"
-
-        if ctx.has_issue:
-            middle = _PR_BODY_ISSUE_SECTION.format(
-                issue_number=ctx.issue_number,
-                issue_title=task_title,
-                issue_body=issue_body or "(no description)",
-            )
-        else:
-            middle = _PR_BODY_NO_ISSUE_SECTION.format(issue_title=task_title)
-
-        prompt = (
-            _PR_BODY_PROMPT_BASE
-            + middle
-            + _PR_BODY_CONTEXT.format(
-                commit_log=commit_log,
-                diff_stat=diff_stat,
-            )
-        )
-
-        try:
-            result = await invoke(prompt, model="sonnet", cwd=ctx.working_dir, timeout=120)
-            ctx.add_cost(result.cost_usd)
-            body = result.text
-            if ctx.has_issue and f"#{ctx.issue_number}" not in body:
-                body += f"\n\nCloses #{ctx.issue_number}"
-            return body
-        except RuntimeError:
-            log.warning("step.create_pr.body_generation_failed", fallback="structured")
-            return self._build_fallback_body(ctx, task_title, commit_log, diff_stat)
+        return self._build_pr_body(ctx, task_title, commit_log, diff_stat)
 
     @staticmethod
-    def _build_fallback_body(ctx: ExecutionContext, task_title: str, commit_log: str, diff_stat: str) -> str:
+    def _build_pr_body(ctx: ExecutionContext, task_title: str, commit_log: str, diff_stat: str) -> str:
         lines = [
             "## Summary",
             "",
@@ -213,6 +145,13 @@ def _build_fallback_body(ctx: ExecutionContext, task_title: str, commit_log: str
         if ctx.has_issue:
             lines.append(f"Closes #{ctx.issue_number}")
             lines.append("")
+
+        issue_body = (ctx.task.body if ctx.task else "") or ""
+        stripped_body = issue_body.strip()
+        if stripped_body:
+            excerpt = truncate(stripped_body, max_length=_ISSUE_BODY_EXCERPT_LIMIT)
+            lines.extend(["## Context", "", excerpt, ""])
+
         lines.extend(
             [
                 "## Commits",