Loop ticks should use evolution, judges, and memory - not bypass them #8

## Problem

`phantom_loop` ticks bypass the entire intelligence stack - no evolved config, no memory recall, no post-session evolution, no cross-model judges. Each tick is a naive LLM call with no benefit from anything Phantom has learned.

The loop is a context-window management strategy, not a reason to skip the intelligence layer. Interactive sessions get the full stack; loops - the primary mechanism for long-running autonomous work - get none of it.

## Proposed fix (3 phases)

### Phase 1: Inject evolved config + memory into tick prompts

Each tick gets persona, domain knowledge, error recovery strategies, and recalled memories. This adds no LLM calls; the only new cost is one local embedding query per tick for memory recall.

- Extend `RunnerDeps` with optional `memoryContextBuilder`, `evolvedConfig`, `roleTemplate`
- Wire them from `src/index.ts` (same instances already in the router scope)
- Extend `buildTickPrompt()` to inject evolved config sections before the goal and memory context before the state file
- Call `contextBuilder.build(loop.goal)` per tick for memory recall

Modified: `src/loop/runner.ts`, `src/loop/prompt.ts`, `src/index.ts`
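
To make the intended section ordering concrete, here is a minimal sketch of an extended `buildTickPrompt()`. The input shape, the field names (`persona`, `domainKnowledge`, `errorRecovery`, `memoryContext`), and the section titles are assumptions for illustration, not the existing signatures in `src/loop/prompt.ts`. It only shows evolved config rendered before the goal and memory context rendered before the state file, with every optional section skipped when absent.

```typescript
// Hypothetical shapes: the real evolved config and tick prompt inputs
// in Phantom may look different.
interface EvolvedConfigSections {
  persona?: string;
  domainKnowledge?: string;
  errorRecovery?: string;
}

interface TickPromptInput {
  goal: string;
  stateFile: string;
  evolved?: EvolvedConfigSections; // optional, mirrors the optional RunnerDeps fields
  memoryContext?: string; // assumed to be the text returned by the memory context builder
}

// Render a titled section, or nothing when the body is missing or blank.
function section(title: string, body?: string): string[] {
  const text = body?.trim();
  return text ? [`## ${title}\n${text}`] : [];
}

// Evolved config goes before the goal; memory context goes before the state file.
function buildTickPrompt(input: TickPromptInput): string {
  return [
    ...section("Persona", input.evolved?.persona),
    ...section("Domain knowledge", input.evolved?.domainKnowledge),
    ...section("Error recovery", input.evolved?.errorRecovery),
    ...section("Goal", input.goal),
    ...section("Recalled memories", input.memoryContext),
    ...section("State file", input.stateFile),
  ].join("\n\n");
}
```

Because every injected section is optional, a runner without the new dependencies wired in would produce a prompt containing only the goal and state file, so existing callers should not break.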

### Phase 2: Post-loop evolution and memory consolidation

After a loop finishes, synthesize a `SessionSummary` from the accumulated tick transcripts and feed it through `afterSession()` and `consolidateSessionWithLLM()`. This lets Phantom learn from autonomous work, not just interactive conversations.

- Accumulate tick prompt/response pairs in-memory during the run
- On finalize, build a `SessionSummary`, mapping loop status to outcome: `done` -> `success`, `stopped` -> `abandoned`, `budget_exceeded` / `failed` -> `failure`
- Call `evolution.afterSession(summary)` - the pipeline is completely channel-agnostic, no changes needed
- Call `consolidateSessionWithLLM()` to store the run as a vector-backed episode

Modified: `src/loop/runner.ts`, `src/index.ts`
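
The status-to-outcome mapping and transcript accumulation could be sketched as follows. `LoopSessionSummary` is a hypothetical stand-in for the real `SessionSummary` type, whose actual fields are not shown in this issue, and `buildLoopSummary` is an illustrative helper name. The calls to `evolution.afterSession()` and `consolidateSessionWithLLM()` are left out because their signatures are not given here.

```typescript
type LoopStatus = "done" | "stopped" | "budget_exceeded" | "failed";
type SessionOutcome = "success" | "abandoned" | "failure";

// One prompt/response pair, accumulated in memory during the run.
interface TickRecord {
  prompt: string;
  response: string;
}

// Hypothetical subset of SessionSummary; the real type may carry more fields.
interface LoopSessionSummary {
  goal: string;
  outcome: SessionOutcome;
  tickCount: number;
  transcript: string;
}

// Mapping taken from the proposal: done -> success, stopped -> abandoned,
// budget_exceeded / failed -> failure.
const OUTCOME_BY_STATUS: Record<LoopStatus, SessionOutcome> = {
  done: "success",
  stopped: "abandoned",
  budget_exceeded: "failure",
  failed: "failure",
};

// Flatten the accumulated ticks into a single summary on finalize.
function buildLoopSummary(
  goal: string,
  status: LoopStatus,
  ticks: readonly TickRecord[],
): LoopSessionSummary {
  const transcript = ticks
    .map((t, i) => `[tick ${i + 1}] prompt:\n${t.prompt}\n[tick ${i + 1}] response:\n${t.response}`)
    .join("\n\n");
  return { goal, outcome: OUTCOME_BY_STATUS[status], tickCount: ticks.length, transcript };
}
```

Typing the mapping as `Record<LoopStatus, SessionOutcome>` means that adding a new loop status without deciding its outcome becomes a compile error rather than a silent `undefined`.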

### Phase 3: Mid-loop critique checkpoints

For long loops (10+ ticks), run a Sonnet 4.6 judge every N ticks (N = `checkpoint_interval`) to detect drift before the budget burns out. This is the only phase that adds a new LLM call.

- New `src/loop/critique.ts` module: reads state file + tick history, asks quality judge if the loop is making progress or stuck
- Configurable `checkpoint_interval` (default 5 ticks, disabled for short loops)
- Critique injected into next tick prompt as "Reviewer feedback" section

Modified: `src/loop/runner.ts`, `src/loop/prompt.ts`
New: `src/loop/critique.ts`
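
The checkpoint cadence could be as simple as the sketch below. It assumes "long loop" is decided from the loop's configured maximum tick count, which this issue does not specify. `CritiqueSchedule`, `shouldRunCritique`, and `reviewerFeedbackSection` are hypothetical names, and the judge call itself is omitted.

```typescript
// Hypothetical schedule config; only checkpoint_interval (default 5) and the
// 10-tick threshold come from the proposal.
interface CritiqueSchedule {
  checkpointInterval: number; // run the judge every N completed ticks
  minTicksForCritique: number; // loops shorter than this never get a critique
}

const DEFAULT_SCHEDULE: CritiqueSchedule = {
  checkpointInterval: 5,
  minTicksForCritique: 10,
};

// Decide whether a critique should run after `completedTicks` ticks.
// Assumption: loop length is judged by the configured maximum tick count.
function shouldRunCritique(
  completedTicks: number,
  maxTicks: number,
  schedule: CritiqueSchedule = DEFAULT_SCHEDULE,
): boolean {
  if (maxTicks < schedule.minTicksForCritique) return false; // short loop: disabled
  if (schedule.checkpointInterval <= 0) return false; // non-positive interval: disabled
  return completedTicks > 0 && completedTicks % schedule.checkpointInterval === 0;
}

// Render the critique for injection into the next tick prompt.
function reviewerFeedbackSection(critique: string | null): string {
  const text = critique?.trim();
  return text ? `## Reviewer feedback\n${text}` : "";
}
```

Keeping the cadence check pure (no I/O, no LLM call) makes it easy to unit test separately from the judge, and it keeps `src/loop/runner.ts` changes small enough to stay under the 300-line cap.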

## Key notes

- All existing evolution/judge/memory code is channel-agnostic and reusable without modification
- Primary integration point is `buildTickPrompt()` in `src/loop/prompt.ts`
- 300-line file cap applies - Phase 3 critique logic lives in a separate module
- Builds on #5 / #7 (loop Slack feedback pipeline)
