feat(loop): integrate evolution, memory, and mid-loop critique #9
Open
electronicBlacksmith wants to merge 5 commits into main
Closes #5.

The feedback pipeline in LoopRunner already existed but was gated on loop.channelId, which was always null: the agent never plumbed channel_id/conversation_id into the in-process MCP tool call, so that context only lived in the router.

- AsyncLocalStorage<SlackContext> captures the Slack channel/thread/trigger-message for the current turn so phantom_loop can auto-fill them when the agent omits them. Explicit tool args still win.
- Reaction ladder on the operator's original message: hourglass -> cycle -> terminal (check/stop/warning/x). Restart-safe via an iteration === 1 check, with no in-memory flag.
- Inline unicode progress bar in the edited status message.
- New trigger_message_ts column on loops, appended as migration ghostwright#11.
- Extracted LoopNotifier into src/loop/notifications.ts; runner.ts was already at the 300-line cap.

34 new tests, 938 pass / 0 fail.
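The auto-fill behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the field names on SlackContext and the helpers runTurn and resolveLoopTarget are hypothetical, and only AsyncLocalStorage itself (from Node's node:async_hooks) is a real API.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical shape: the PR names SlackContext but does not show its fields.
interface SlackContext {
  channelId: string;
  threadTs?: string;
  triggerMessageTs?: string;
}

const slackContext = new AsyncLocalStorage<SlackContext>();

// Router side (assumed): run the whole agent turn inside the context so any
// tool call made during the turn can read it without explicit plumbing.
function runTurn<T>(ctx: SlackContext, turn: () => T): T {
  return slackContext.run(ctx, turn);
}

// Tool side (assumed): explicit arguments win, the ambient context fills gaps.
function resolveLoopTarget(args: Partial<SlackContext>): Partial<SlackContext> {
  const ambient = slackContext.getStore();
  return {
    channelId: args.channelId ?? ambient?.channelId,
    threadTs: args.threadTs ?? ambient?.threadTs,
    triggerMessageTs: args.triggerMessageTs ?? ambient?.triggerMessageTs,
  };
}
```

Outside of a turn, getStore() returns undefined, so the resolved fields stay undefined and the caller can fall back to the old "no channel" behaviour.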
…tion

Two defects surfaced during the first Slack end-to-end test of the loop feedback fix:

1. Stop button disappeared after the first tick. Slack's chat.update replaces the message wholesale and strips any blocks the caller does not include. postStartNotice attached the button but postTickUpdate called updateMessage without blocks, so the button was wiped on the first progress edit. Extract buildStatusBlocks() and re-send it on every tick edit. The final notice still omits blocks intentionally so the button disappears when the loop is no longer interruptible.
2. No end-of-loop summary. The agent curates the state.md body every tick (Goal, Progress, Next Action, Notes), but that content never reached the operator. Post it as a threaded reply when the loop finalizes. No extra agent cost: we surface content the agent already wrote. Frontmatter is stripped, the body is truncated at 3500 chars, and the post is silently skipped if the file is missing or empty.

+7 tests covering both regressions. 945 pass / 0 fail.
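The summary rules in item 2 (strip frontmatter, truncate at 3500 chars, skip when missing or empty) could look something like the sketch below. The function name buildSummary is hypothetical, the frontmatter is assumed to be a leading block delimited by --- lines, and plain slicing is assumed for truncation; the PR may do any of these differently.

```typescript
const SUMMARY_LIMIT = 3500;

// Returns the text to post as a threaded reply, or null when there is
// nothing worth posting (file missing, or body empty after stripping).
function buildSummary(stateMd: string | null, limit: number = SUMMARY_LIMIT): string | null {
  if (stateMd === null) return null;
  // Drop a leading frontmatter block delimited by --- lines (assumed format).
  const body = stateMd.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, "").trim();
  if (body.length === 0) return null;
  // Hard cut at the limit; a real implementation might add an ellipsis.
  return body.length > limit ? body.slice(0, limit) : body;
}
```

Returning null rather than an empty string lets the caller skip the Slack call entirely, which matches the "silently skipped" behaviour described above.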
…l message

1. Tick update race: postTickUpdate was fire-and-forget, so a stop on tick N+1 could race with tick N's Slack write. If the tick update's HTTP response arrived after postFinalNotice, it overwrote the final message and re-sent the Stop button blocks. Awaiting postTickUpdate serializes Slack writes so finalize always runs after the last tick update completes.
2. Final message now includes the progress bar at its halted position, visually consistent with tick updates. A stopped loop at 3/10 shows the bar frozen at 3/10 with "stopped" instead of a terse one-liner.
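For illustration, a progress bar that freezes at its halted position might be rendered as below. The function name renderProgress, the glyphs, the bar width, and the exact text layout are all assumptions; the PR only states that the bar is inline unicode and that a stopped loop shows its position with "stopped".

```typescript
// Renders e.g. "███░░░░░░░ 3/10 stopped". Glyphs and layout are assumptions.
function renderProgress(iteration: number, max: number, label: string = "", width: number = 10): string {
  const safeMax = Math.max(1, max);
  // Clamp so a bad iteration count can never overflow or underflow the bar.
  const done = Math.min(Math.max(iteration, 0), safeMax);
  const filled = Math.round((done / safeMax) * width);
  const bar = "█".repeat(filled) + "░".repeat(width - filled);
  return `${bar} ${done}/${safeMax}${label ? " " + label : ""}`;
}
```

Using the same renderer for tick updates (no label) and the final notice (label such as "stopped") is one simple way to keep the two visually consistent.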
…oop ticks

Loop ticks now use Phantom's full intelligence stack instead of running blind:

Phase 1 - Memory context injection: cached once at loop start from the goal, injected into every tick prompt via TickPromptOptions. Cleared on finalize, rebuilt on resume.

Phase 2 - Post-loop evolution and consolidation: bounded transcript accumulation (first tick + rolling 10 summaries + last tick), SessionData synthesis in finalize(), fire-and-forget evolution pipeline and LLM/heuristic memory consolidation with cost-cap guards matching the interactive path.

Phase 3 - Mid-loop critique checkpoints: optional checkpoint_interval param lets the agent request Sonnet 4.6 review every N ticks. Guard requires evolution enabled, LLM judges active, and cost cap not exceeded. Critique is awaited before next tick to avoid race conditions.

Closes #8
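The bounded transcript in Phase 2 (first tick + rolling 10 summaries + last tick) can be pictured with a sketch like this one. The BoundedTranscript type, the recordTick helper, and the 200-character summary length are hypothetical; the PR does not show how summaries are produced, so a simple prefix stands in for them here.

```typescript
interface BoundedTranscript {
  first: string | null; // full text of the first tick
  summaries: string[];  // rolling window of short per-tick summaries
  last: string | null;  // full text of the most recent tick
}

const ROLLING_LIMIT = 10;
const SUMMARY_CHARS = 200; // assumption: a summary is just a clipped prefix

function recordTick(t: BoundedTranscript, text: string): void {
  if (t.first === null) t.first = text;
  t.last = text;
  t.summaries.push(text.slice(0, SUMMARY_CHARS));
  // Evict the oldest summary so memory stays constant for long loops.
  if (t.summaries.length > ROLLING_LIMIT) t.summaries.shift();
}
```

Whatever the exact representation, the point of the bound is that transcript size no longer grows with the number of ticks, so the post-loop evolution and consolidation inputs stay a predictable size.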
- Decouple postLoopDeps so evolution and memory run independently (evolution works when memory is down and vice versa)
- Skip mid-loop critique on terminal ticks to avoid wasted Sonnet calls
- Track judge cost on failure paths via JudgeParseError carrying usage data
- Extract recordTranscript/clamp from runner.ts to post-loop.ts (292 < 300 lines)
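Taken together, the checkpoint guard (evolution enabled, judges active, cost cap not exceeded) and the terminal-tick skip amount to a single predicate. The sketch below is one way to write it; the CritiqueGuard shape and the shouldCritique name are hypothetical, and treating a missing or non-positive interval as "disabled" is an assumption.

```typescript
interface CritiqueGuard {
  iteration: number;           // 1-based tick that just finished
  checkpointInterval?: number; // undefined or < 1 disables checkpoints (assumed)
  isTerminal: boolean;         // the loop is finishing on this tick
  evolutionEnabled: boolean;
  judgesActive: boolean;
  spentUsd: number;
  costCapUsd: number;
}

function shouldCritique(g: CritiqueGuard): boolean {
  if (!g.checkpointInterval || g.checkpointInterval < 1) return false;
  // A critique on the last tick cannot influence any later tick.
  if (g.isTerminal) return false;
  if (!g.evolutionEnabled || !g.judgesActive) return false;
  if (g.spentUsd >= g.costCapUsd) return false;
  return g.iteration % g.checkpointInterval === 0;
}
```

Checking the cheap conditions before the modulo keeps the intent readable: every early return corresponds to one of the guards named in the commit messages.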
Summary
checkpoint_interval triggers Sonnet 4.6 review every N ticks. Guarded by judge availability and cost cap. Awaited before next tick to prevent race conditions.

New files: src/loop/critique.ts, src/loop/post-loop.ts, and 3 test files.

Test plan
Closes #8