diff --git a/CHANGELOG.md b/CHANGELOG.md index 2bea539..d1c03d6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,39 @@ changed, why it matters, and which user-visible guarantee got stronger. Use it a the fastest factual tour of Goal Mode's evolution from prompt discipline into a guard-enforced workflow. +## Unreleased + +### Fix: switching to Build no longer hijacks the session back to Goal (issue #428) + +After starting `/goal`, interrupting, switching to **Build**, and prompting it, +the Build response would finish and Goal Mode would *automatically switch the +main agent back to `goal`* and start running review subagents — exactly the +opposite of what an agent switch is supposed to do. + +**Root cause.** When you switch a session off the `goal` agent, `chat.params` / +`chat.message` correctly set `state.active = false`. But the idle handler's +"is this still a goal session?" check OR'd in two terms that ignored that +deactivation: + +- `Boolean(state.contract)` — the contract persists from the prior goal, so a + stale contract re-activated the session; and +- `goalSessionActiveForAgent("goal", state)` — which is *tautological*: it + returns `true` unconditionally when its first argument is the primary `"goal"` + agent, so once a session had ever been a goal it was treated as one forever. + +The result: on the Build turn's idle, the guard launched a programmatic review +cycle, and the reviewer batch (pinned to the goal primary) switched the user's +main agent back to Goal — the reported "automatically switching to Build". + +**Fix.** The idle path now requires `state.active` for a session to be eligible +for programmatic review or auto-continue. A stale contract (or any other +persisted goal state) can no longer re-activate a session you explicitly paused; +only switching back to the `goal` agent (or running `/goal`) re-activates it, and +reviews then resume with all of your contract, evidence, and review-cycle state +intact. As defense-in-depth, `launchReviewBatch` itself now refuses to run on a +`state.active === false` session, so a future regression in the eligibility check +can never switch the user's main agent back to Goal. + ## v0.7.0 ### Feature: YOLO mode + a fully customizable guard diff --git a/README.md b/README.md index 37d8bb2..20897d6 100644 --- a/README.md +++ b/README.md @@ -346,9 +346,15 @@ on this tool. of pausing. - **Goal Mode stopped after switching agents.** Switching the session off the `goal` agent — to Build or Plan, or through an action that cycles the agent — intentionally - pauses Goal Mode; the guard shows a toast and stops treating that turn as a goal. Your - contract, reviews, and evidence are preserved. Switch back to the `goal` agent, or run - `/goal`, to resume with all state intact. + pauses Goal Mode; the guard shows a toast and stops treating that turn as a goal. + While paused, **no** Goal activity fires on that session: the sidebar hides, the + completion claim is no longer rewritten, auto-continue sends no "keep going" prompt, + and **programmatic reviews are not launched** (so the guard will never switch your + main agent back to `goal` on its own — see + [issue #428](https://github.com/devinoldenburg/opencode-goal-mode/issues/428)). Your + contract, reviews, and evidence are preserved. Switch back to the `goal` agent, or + run `/goal`, to resume with all state intact and reviews will run again on the next + idle. - **A safe command was blocked.** Inspect how the analyzer reads it with `node benchmarks/external.mjs --json`, allow it for that project with `allowCommands`, and please diff --git a/plugins/goal-guard/guard.js b/plugins/goal-guard/guard.js index e9d08fa..7738714 100644 --- a/plugins/goal-guard/guard.js +++ b/plugins/goal-guard/guard.js @@ -287,10 +287,25 @@ export function createGuard(input = {}, options = {}, overrides = {}) { state.verificationSeen || (Array.isArray(state.evidence) && state.evidence.length > 0) || (Array.isArray(state.changedFiles) && state.changedFiles.length > 0); - const goalSession = - state.active || - Boolean(state.contract) || - goalSessionActiveForAgent("goal", state); + // The user can explicitly switch a goal session off the goal primary — to + // Build/Plan, or via an action that cycles the agent — and `chat.params` / + // `chat.message` faithfully set `state.active=false` when that happens. The + // guard MUST honor that deactivation here: a stale contract (or any other + // persisted goal state) MUST NOT re-activate the session, or the guard would + // launch programmatic reviews on the user's Build turn, and the reviewer + // batch (pinned to the goal primary in review-runner.js) would switch the + // main agent back to Goal against the user's explicit choice — the + // "automatically switching to Build" report in issue #428. + // + // The previous expression additionally OR'd in `Boolean(state.contract)` and + // `goalSessionActiveForAgent("goal", state)`, but the latter is TAUTOLOGICAL: + // `goalSessionActiveForAgent` returns `true` unconditionally when its first + // argument is the primary `"goal"` agent (see agents.js), so once a session + // had ever been a goal the idle path treated it as a goal forever, no matter + // how many times the user switched away. Only an ACTUALLY active goal session + // is eligible for review / auto-continue; switching back to the goal agent + // (or running /goal) re-activates it and reviews resume with all state intact. + const goalSession = state.active; const outstandingGates = missingGates(state, config).length; const anchoredContract = Boolean(state.contract) && !state.contract?.auto; const canProgrammaticReview = diff --git a/plugins/goal-guard/review-runner.js b/plugins/goal-guard/review-runner.js index 2ed66e2..a287c60 100644 --- a/plugins/goal-guard/review-runner.js +++ b/plugins/goal-guard/review-runner.js @@ -199,6 +199,20 @@ function freshRecordedVerdict(state, agent, beforeSeq) { /** Launch every reviewer in one batched subtask prompt (parallel on the host). */ async function launchReviewBatch(client, sessionID, agents, state, model, { sleep, idleDeferMs }) { if (!agents.length) return; + // Defense-in-depth for issue #428: the only legitimate caller is the idle path + // of an ACTIVE goal session, but a future regression in the caller's `goalSession` + // check must NEVER be able to launch a reviewer batch on a session the user has + // explicitly switched away from (Build/Plan/etc.). Doing so would pin the batch to + // the goal primary and silently switch the user's main agent back to Goal — the + // exact symptom in the bug report. Bail out (treated as SessionBusy so the outer + // retry loop defers instead of marking every reviewer failed) when the session is + // not actually active. `state` is the live store record, so a fresh agent switch + // is reflected here even if it happened after the caller's eligibility check. + if (state && state.active === false) { + const err = new Error("Goal Guard: session is not an active goal session — refusing to launch programmatic review"); + err.name = "SessionBusyError"; + throw err; + } if (idleDeferMs > 0) await sleep(idleDeferMs); await promptSubtaskWithRetry( client, @@ -208,6 +222,9 @@ async function launchReviewBatch(client, sessionID, agents, state, model, { slee // goal session (the same programmatic path as task/command subtasks), never on // a Build/Plan session. Reference PRIMARY_AGENT, not a bare literal, so a // rename of the primary agent id cannot desync this from the rest of the guard. + // (Safe now that the eligibility check above guarantees the session is active: + // when active, the session's current agent is already the goal primary or a + // goal worker — so this pin is a no-op for the user's main agent.) agent: PRIMARY_AGENT, ...(model ? { model } : {}), parts: agents.map((agent) => ({ diff --git a/tests/plugin.test.mjs b/tests/plugin.test.mjs index a38b9b7..f6200a4 100644 --- a/tests/plugin.test.mjs +++ b/tests/plugin.test.mjs @@ -513,6 +513,117 @@ test("a Build (non-goal) idle session is never auto-continued", async () => { assert.equal(sent.length, 0); }); +test("REGRESSION [issue #428]: a goal session switched to Build is NOT re-activated on idle (no reviews, no agent switch)", async () => { + // The reporter's flow: /goal → interrupt → switch to Build → prompt Build → after + // Build's response the guard switched the main agent back to Goal and ran reviews. + // Root cause: resolveIdleSession's `goalSession` check OR'd in a tautological + // goalSessionActiveForAgent("goal", state) (always true) plus a stale contract, so + // an explicitly deactivated session was still treated as a goal on idle. The fix is + // to require state.active; this test pins the regression. + const launched = []; + const prompts = []; + const client = { + app: { log: async () => {} }, + tui: { showToast: async () => {} }, + session: { + promptAsync: async ({ path, body }) => { + prompts.push({ id: path.id, agent: body?.agent, body }); + for (const part of body?.parts || []) { + if (part.type === "subtask") launched.push(part.agent); + } + }, + messages: async () => ({ data: [] }), + }, + }; + const guard = __test.createGuard( + { client }, + { abortGraceMs: 0, reviewPollMs: 2, reviewTimeoutMs: 100, reviewIdleDeferMs: 0 }, + { persistence: noopPersistence, clock: () => 1, syncIdle: true }, + ); + + // 1) Start a goal and anchor work (contract auto-seeded, edit recorded). + await guard.hooks["chat.params"]({ sessionID: "s", agent: "goal" }, {}); + await guard.hooks["chat.message"]({ sessionID: "s", agent: "goal" }, { parts: [{ type: "text", text: "implement the feature" }] }); + await guard.hooks["tool.execute.after"]({ tool: "edit", sessionID: "s", callID: "c", args: {} }, { output: "", title: "", metadata: {} }); + assert.equal(guard.store.stateFor("s").active, true); + assert.ok(guard.store.stateFor("s").contract, "contract anchored"); + + // 2) User switches the SAME session to Build. + await guard.hooks["chat.params"]({ sessionID: "s", agent: "build" }, {}); + assert.equal(guard.store.stateFor("s").active, false, "Build deactivates the goal session"); + assert.equal(guard.store.stateFor("s").currentAgent, "build"); + + // 3) User prompts Build (a genuine new user turn on the build agent). + await guard.hooks["chat.message"]({ sessionID: "s", agent: "build" }, { parts: [{ type: "text", text: "just check it compiles" }] }); + + // 4) Build's turn finishes → idle fires. + await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "s" } } }); + + // 5) The guard must NOT have launched reviewers or switched the agent to goal. + assert.equal(launched.length, 0, `BUG #428: reviewers must not launch on a Build session (got ${launched.length})`); + const switchedToGoal = prompts.some((p) => p.agent === "goal"); + assert.equal(switchedToGoal, false, "BUG #428: the guard must NOT switch the main agent back to goal"); + // Contract + work are preserved so the user can resume by switching back. + assert.ok(guard.store.stateFor("s").contract, "goal state is preserved across the Build excursion"); + assert.ok((guard.store.stateFor("s").lastEditSeq || 0) > 0, "edit history is preserved"); +}); + +test("REGRESSION [issue #428] companion: switching back to Goal re-activates and programmatic review resumes", async () => { + // The fix must NOT over-fire — switching back to the goal agent (or running /goal) + // re-activates the session and programmatic reviews run with all prior state intact. + const launched = []; + const prompts = []; + let currentAgent = null; + const client = { + app: { log: async () => {} }, + tui: { showToast: async () => {} }, + session: { + promptAsync: async ({ path, body }) => { + prompts.push({ id: path.id, agent: body?.agent, body }); + for (const part of body?.parts || []) { + if (part.type === "subtask") { + currentAgent = part.agent; + launched.push(part.agent); + } + } + }, + messages: async () => ({ + data: [ + { + info: { role: "assistant", agent: currentAgent }, + parts: launched.map((agent) => ({ + type: "tool", + tool: "task", + state: { status: "completed", input: { subagent_type: agent }, output: "Reviewed.\nVerdict: PASS" }, + })), + }, + ], + }), + }, + }; + const guard = __test.createGuard( + { client }, + { abortGraceMs: 0, reviewPollMs: 2, reviewTimeoutMs: 500, reviewIdleDeferMs: 0 }, + { persistence: noopPersistence, clock: () => 1, syncIdle: true }, + ); + + // Goal → Build (idle is a no-op for reviews) → back to Goal (idle runs reviews). + await guard.hooks["chat.params"]({ sessionID: "s", agent: "goal" }, {}); + await guard.hooks["chat.message"]({ sessionID: "s", agent: "goal" }, { parts: [{ type: "text", text: "implement the feature" }] }); + await guard.hooks["tool.execute.after"]({ tool: "edit", sessionID: "s", callID: "c", args: {} }, { output: "", title: "", metadata: {} }); + await guard.hooks["chat.params"]({ sessionID: "s", agent: "build" }, {}); + await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "s" } } }); + assert.equal(launched.length, 0, "Build idle: no reviews"); + + // Switch back to Goal and prompt it; idle now runs the programmatic review cycle. + await guard.hooks["chat.params"]({ sessionID: "s", agent: "goal" }, {}); + await guard.hooks["chat.message"]({ sessionID: "s", agent: "goal" }, { parts: [{ type: "text", text: "resume" }] }); + assert.equal(guard.store.stateFor("s").active, true, "switching back to Goal re-activates the session"); + await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "s" } } }); + + assert.ok(launched.length >= 5, `programmatic reviews resume after re-activation (got ${launched.length})`); +}); + test("auto-continue can be disabled via config", async () => { const sent = []; const guard = __test.createGuard( diff --git a/tests/programmatic-review.test.mjs b/tests/programmatic-review.test.mjs index 7fbaceb..1c36cfc 100644 --- a/tests/programmatic-review.test.mjs +++ b/tests/programmatic-review.test.mjs @@ -236,3 +236,24 @@ test("patch/str_replace edits count as work for programmatic review", async () = await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "g" } } }); assert.ok(launched.length >= 5, `patch tool must mark work and trigger review (got ${launched.length})`); }); + +test("REGRESSION [issue #428]: a deactivated session never launches a reviewer batch (defense-in-depth in launchReviewBatch)", async () => { + // The idle path now blocks this upstream, but launchReviewBatch is the very call + // that pins the batch to the goal primary and would switch the user's main agent. + // A future regression in the eligibility check must NOT be able to reach it for a + // session the user has switched to Build — so launchReviewBatch refuses outright + // when state.active is false, surfacing as a SessionBusy deferral (not a hard + // failure that marks every reviewer failed). + const { guard, launched } = makeReviewingGuard("PASS"); + await startGoalWithWork(guard.hooks); + // User switches the active goal session to Build. + await guard.hooks["chat.params"]({ sessionID: "g", agent: "build" }, {}); + assert.equal(guard.store.stateFor("g").active, false); + await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "g" } } }); + assert.equal(launched.length, 0, "no reviewer subtask may launch on a deactivated (Build) session"); + // Re-activate and confirm reviews then run normally — the deferral did not wedge. + await guard.hooks["chat.params"]({ sessionID: "g", agent: "goal", model: MODEL }, {}); + await guard.hooks["chat.message"]({ sessionID: "g", agent: "goal", model: MODEL }, { parts: [{ type: "text", text: "resume" }] }); + await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "g" } } }); + assert.ok(launched.length >= 5, `reviews run normally after re-activation (got ${launched.length})`); +});