Loop ticks should use evolution, judges, and memory - not bypass them #8
Description
Problem
phantom_loop ticks bypass the entire intelligence stack - no evolved config, no memory recall, no post-session evolution, no cross-model judges. Each tick is a naive LLM call with no benefit from anything Phantom has learned.
The loop is a context-window management strategy, not a reason to skip the intelligence layer. Interactive sessions get the full stack; loops - the primary mechanism for long-running autonomous work - get none of it.
Proposed fix (3 phases)
Phase 1: Inject evolved config + memory into tick prompts
Each tick gets persona, domain knowledge, error recovery strategies, and recalled memories. Zero extra LLM calls (one local embedding query per tick for memory recall).
- Extend `RunnerDeps` with optional `memoryContextBuilder`, `evolvedConfig`, `roleTemplate`
- Wire them from `src/index.ts` (same instances already in the router scope)
- Extend `buildTickPrompt()` to inject evolved config sections before the goal and memory context before the state file
- Call `contextBuilder.build(loop.goal)` per tick for memory recall
Modified: src/loop/runner.ts, src/loop/prompt.ts, src/index.ts
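A minimal sketch of what the Phase 1 wiring could look like. The shapes of `EvolvedConfig` and `MemoryContextBuilder`, and the exact `buildTickPrompt()` signature, are assumptions inferred from the issue text, not the actual code:

```typescript
// Hypothetical types -- the real RunnerDeps lives in src/loop/runner.ts.
interface EvolvedConfig {
  persona: string;
  domainKnowledge: string;
  errorRecovery: string;
}

interface MemoryContextBuilder {
  // Local embedding recall; no LLM call, per the issue.
  build(query: string): string;
}

interface RunnerDeps {
  // ...existing deps...
  memoryContextBuilder?: MemoryContextBuilder;
  evolvedConfig?: EvolvedConfig;
  roleTemplate?: string;
}

function buildTickPrompt(goal: string, stateFile: string, deps: RunnerDeps): string {
  const sections: string[] = [];
  if (deps.evolvedConfig) {
    // Evolved config sections are injected before the goal.
    sections.push(
      deps.evolvedConfig.persona,
      deps.evolvedConfig.domainKnowledge,
      deps.evolvedConfig.errorRecovery,
    );
  }
  sections.push(`Goal: ${goal}`);
  if (deps.memoryContextBuilder) {
    // Memory context is injected before the state file.
    sections.push(deps.memoryContextBuilder.build(goal));
  }
  sections.push(stateFile);
  return sections.filter(Boolean).join("\n\n");
}
```

When no optional deps are provided, the prompt degrades to the current goal + state-file layout, so the change stays backward-compatible.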
Phase 2: Post-loop evolution and memory consolidation
After a loop finishes, synthesize a SessionSummary from accumulated tick transcripts and feed it through afterSession() and consolidateSessionWithLLM(). Phantom learns from autonomous work, not just interactive conversations.
- Accumulate tick prompt/response pairs in-memory during the run
- On finalize, build a `SessionSummary` (loop status maps to outcome: done -> success, stopped -> abandoned, budget_exceeded/failed -> failure)
- Call `evolution.afterSession(summary)` - the pipeline is completely channel-agnostic, no changes needed
- Call `consolidateSessionWithLLM()` to store the run as a vector-backed episode
Modified: src/loop/runner.ts, src/index.ts
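The status-to-outcome mapping above can be sketched as a small pure function. The `LoopStatus` / `SessionOutcome` unions and the `buildSummary` helper are hypothetical names; only the mapping itself comes from the issue:

```typescript
type LoopStatus = "done" | "stopped" | "budget_exceeded" | "failed";
type SessionOutcome = "success" | "abandoned" | "failure";

// done -> success, stopped -> abandoned, budget_exceeded/failed -> failure
function toOutcome(status: LoopStatus): SessionOutcome {
  switch (status) {
    case "done":
      return "success";
    case "stopped":
      return "abandoned";
    case "budget_exceeded":
    case "failed":
      return "failure";
  }
}

interface TickTranscript {
  prompt: string;
  response: string;
}

interface SessionSummary {
  outcome: SessionOutcome;
  transcript: TickTranscript[];
}

// Built on finalize from the tick pairs accumulated in memory during the run.
function buildSummary(status: LoopStatus, ticks: TickTranscript[]): SessionSummary {
  return { outcome: toOutcome(status), transcript: ticks };
}
```

An exhaustive `switch` over the status union lets the compiler flag any future loop status that lacks an outcome mapping.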
Phase 3: Mid-loop critique checkpoints
For long loops (10+ ticks), run a Sonnet 4.6 judge every N ticks to detect drift before the budget burns out. This is the only phase that adds a new LLM call.
- New `src/loop/critique.ts` module: reads the state file + tick history, asks the quality judge whether the loop is making progress or stuck
- Configurable `checkpoint_interval` (default 5 ticks, disabled for short loops)
- Critique injected into the next tick prompt as a "Reviewer feedback" section
Modified: src/loop/runner.ts, src/loop/prompt.ts
New: src/loop/critique.ts
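The checkpoint scheduling and feedback injection can be sketched as two small helpers. The default interval of 5 and the 10-tick "long loop" threshold come from the issue text; the function names and the exact feedback-section format are assumptions:

```typescript
// Run a critique checkpoint every N ticks, but only for long loops.
function shouldRunCritique(
  tick: number,
  tickBudget: number,
  checkpointInterval = 5,
): boolean {
  const LONG_LOOP_THRESHOLD = 10; // critique disabled for shorter loops
  if (tickBudget < LONG_LOOP_THRESHOLD) return false;
  return tick > 0 && tick % checkpointInterval === 0;
}

// Append the judge's critique to the next tick prompt as a reviewer section.
function withReviewerFeedback(tickPrompt: string, critique: string | null): string {
  if (!critique) return tickPrompt;
  return `${tickPrompt}\n\n## Reviewer feedback\n${critique}`;
}
```

Keeping both helpers pure makes the critique module easy to unit-test in isolation, which also helps it stay under the 300-line file cap mentioned in the key notes.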
Key notes
- All existing evolution/judge/memory code is channel-agnostic and reusable without modification
- Primary integration point is `buildTickPrompt()` in `src/loop/prompt.ts` - the 300-line file cap applies
- Phase 3 critique logic lives in a separate module
- Builds on #5 (Loop progress feedback in Slack is silent after start message) and #7 (fix(loop): restore Slack feedback for phantom_loop runs) - the loop Slack feedback pipeline