Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
33 changes: 33 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,39 @@ changed, why it matters, and which user-visible guarantee got stronger. Use it a
the fastest factual tour of Goal Mode's evolution from prompt discipline into a
guard-enforced workflow.

## Unreleased

### Fix: switching to Build no longer hijacks the session back to Goal (issue #428)

After starting `/goal`, interrupting, switching to **Build**, and prompting it,
the Build response would finish and Goal Mode would *automatically switch the
main agent back to `goal`* and start running review subagents — exactly the
opposite of what an agent switch is supposed to do.

**Root cause.** When you switch a session off the `goal` agent, `chat.params` /
`chat.message` correctly set `state.active = false`. But the idle handler's
"is this still a goal session?" check OR'd in two terms that ignored that
deactivation:

- `Boolean(state.contract)` — the contract persists from the prior goal, so a
stale contract re-activated the session; and
- `goalSessionActiveForAgent("goal", state)` — which is *tautological*: it
returns `true` unconditionally when its first argument is the primary `"goal"`
agent, so once a session had ever been a goal it was treated as one forever.

The result: on the Build turn's idle, the guard launched a programmatic review
cycle, and the reviewer batch (pinned to the goal primary) switched the user's
main agent back to Goal — the reported "automatically switching to Build".

**Fix.** The idle path now requires `state.active` for a session to be eligible
for programmatic review or auto-continue. A stale contract (or any other
persisted goal state) can no longer re-activate a session you explicitly paused;
only switching back to the `goal` agent (or running `/goal`) re-activates it, and
reviews then resume with all of your contract, evidence, and review-cycle state
intact. As defense-in-depth, `launchReviewBatch` itself now refuses to run on a
`state.active === false` session, so a future regression in the eligibility check
can never switch the user's main agent back to Goal.

## v0.7.0

### Feature: YOLO mode + a fully customizable guard
Expand Down
12 changes: 9 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -346,9 +346,15 @@ on this tool.
of pausing.
- **Goal Mode stopped after switching agents.** Switching the session off the `goal`
agent — to Build or Plan, or through an action that cycles the agent — intentionally
pauses Goal Mode; the guard shows a toast and stops treating that turn as a goal. Your
contract, reviews, and evidence are preserved. Switch back to the `goal` agent, or run
`/goal`, to resume with all state intact.
pauses Goal Mode; the guard shows a toast and stops treating that turn as a goal.
While paused, **no** Goal activity fires on that session: the sidebar hides, the
completion claim is no longer rewritten, auto-continue sends no "keep going" prompt,
and **programmatic reviews are not launched** (so the guard will never switch your
main agent back to `goal` on its own — see
[issue #428](https://github.com/devinoldenburg/opencode-goal-mode/issues/428)). Your
contract, reviews, and evidence are preserved. Switch back to the `goal` agent, or
run `/goal`, to resume with all state intact and reviews will run again on the next
idle.
- **A safe command was blocked.** Inspect how the analyzer reads it with
`node benchmarks/external.mjs --json`, allow it for that project with
`allowCommands`, and please
Expand Down
23 changes: 19 additions & 4 deletions plugins/goal-guard/guard.js
Original file line number Diff line number Diff line change
Expand Up @@ -287,10 +287,25 @@ export function createGuard(input = {}, options = {}, overrides = {}) {
state.verificationSeen ||
(Array.isArray(state.evidence) && state.evidence.length > 0) ||
(Array.isArray(state.changedFiles) && state.changedFiles.length > 0);
const goalSession =
state.active ||
Boolean(state.contract) ||
goalSessionActiveForAgent("goal", state);
// The user can explicitly switch a goal session off the goal primary — to
// Build/Plan, or via an action that cycles the agent — and `chat.params` /
// `chat.message` faithfully set `state.active=false` when that happens. The
// guard MUST honor that deactivation here: a stale contract (or any other
// persisted goal state) MUST NOT re-activate the session, or the guard would
// launch programmatic reviews on the user's Build turn, and the reviewer
// batch (pinned to the goal primary in review-runner.js) would switch the
// main agent back to Goal against the user's explicit choice — the
// "automatically switching to Build" report in issue #428.
//
// The previous expression additionally OR'd in `Boolean(state.contract)` and
// `goalSessionActiveForAgent("goal", state)`, but the latter is TAUTOLOGICAL:
// `goalSessionActiveForAgent` returns `true` unconditionally when its first
// argument is the primary `"goal"` agent (see agents.js), so once a session
// had ever been a goal the idle path treated it as a goal forever, no matter
// how many times the user switched away. Only an ACTUALLY active goal session
// is eligible for review / auto-continue; switching back to the goal agent
// (or running /goal) re-activates it and reviews resume with all state intact.
const goalSession = state.active;
const outstandingGates = missingGates(state, config).length;
const anchoredContract = Boolean(state.contract) && !state.contract?.auto;
const canProgrammaticReview =
Expand Down
17 changes: 17 additions & 0 deletions plugins/goal-guard/review-runner.js
Original file line number Diff line number Diff line change
Expand Up @@ -199,6 +199,20 @@ function freshRecordedVerdict(state, agent, beforeSeq) {
/** Launch every reviewer in one batched subtask prompt (parallel on the host). */
async function launchReviewBatch(client, sessionID, agents, state, model, { sleep, idleDeferMs }) {
if (!agents.length) return;
// Defense-in-depth for issue #428: the only legitimate caller is the idle path
// of an ACTIVE goal session, but a future regression in the caller's `goalSession`
// check must NEVER be able to launch a reviewer batch on a session the user has
// explicitly switched away from (Build/Plan/etc.). Doing so would pin the batch to
// the goal primary and silently switch the user's main agent back to Goal — the
// exact symptom in the bug report. Bail out (treated as SessionBusy so the outer
// retry loop defers instead of marking every reviewer failed) when the session is
// not actually active. `state` is the live store record, so a fresh agent switch
// is reflected here even if it happened after the caller's eligibility check.
if (state && state.active === false) {
const err = new Error("Goal Guard: session is not an active goal session — refusing to launch programmatic review");
err.name = "SessionBusyError";
throw err;
}
if (idleDeferMs > 0) await sleep(idleDeferMs);
await promptSubtaskWithRetry(
client,
Expand All @@ -208,6 +222,9 @@ async function launchReviewBatch(client, sessionID, agents, state, model, { slee
// goal session (the same programmatic path as task/command subtasks), never on
// a Build/Plan session. Reference PRIMARY_AGENT, not a bare literal, so a
// rename of the primary agent id cannot desync this from the rest of the guard.
// (Safe now that the eligibility check above guarantees the session is active:
// when active, the session's current agent is already the goal primary or a
// goal worker — so this pin is a no-op for the user's main agent.)
agent: PRIMARY_AGENT,
...(model ? { model } : {}),
parts: agents.map((agent) => ({
Expand Down
111 changes: 111 additions & 0 deletions tests/plugin.test.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -513,6 +513,117 @@ test("a Build (non-goal) idle session is never auto-continued", async () => {
assert.equal(sent.length, 0);
});

test("REGRESSION [issue #428]: a goal session switched to Build is NOT re-activated on idle (no reviews, no agent switch)", async () => {
// The reporter's flow: /goal → interrupt → switch to Build → prompt Build → after
// Build's response the guard switched the main agent back to Goal and ran reviews.
// Root cause: resolveIdleSession's `goalSession` check OR'd in a tautological
// goalSessionActiveForAgent("goal", state) (always true) plus a stale contract, so
// an explicitly deactivated session was still treated as a goal on idle. The fix is
// to require state.active; this test pins the regression.
const launched = [];
const prompts = [];
const client = {
app: { log: async () => {} },
tui: { showToast: async () => {} },
session: {
promptAsync: async ({ path, body }) => {
prompts.push({ id: path.id, agent: body?.agent, body });
for (const part of body?.parts || []) {
if (part.type === "subtask") launched.push(part.agent);
}
},
messages: async () => ({ data: [] }),
},
};
const guard = __test.createGuard(
{ client },
{ abortGraceMs: 0, reviewPollMs: 2, reviewTimeoutMs: 100, reviewIdleDeferMs: 0 },
{ persistence: noopPersistence, clock: () => 1, syncIdle: true },
);

// 1) Start a goal and anchor work (contract auto-seeded, edit recorded).
await guard.hooks["chat.params"]({ sessionID: "s", agent: "goal" }, {});
await guard.hooks["chat.message"]({ sessionID: "s", agent: "goal" }, { parts: [{ type: "text", text: "implement the feature" }] });
await guard.hooks["tool.execute.after"]({ tool: "edit", sessionID: "s", callID: "c", args: {} }, { output: "", title: "", metadata: {} });
assert.equal(guard.store.stateFor("s").active, true);
assert.ok(guard.store.stateFor("s").contract, "contract anchored");

// 2) User switches the SAME session to Build.
await guard.hooks["chat.params"]({ sessionID: "s", agent: "build" }, {});
assert.equal(guard.store.stateFor("s").active, false, "Build deactivates the goal session");
assert.equal(guard.store.stateFor("s").currentAgent, "build");

// 3) User prompts Build (a genuine new user turn on the build agent).
await guard.hooks["chat.message"]({ sessionID: "s", agent: "build" }, { parts: [{ type: "text", text: "just check it compiles" }] });

// 4) Build's turn finishes → idle fires.
await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "s" } } });

// 5) The guard must NOT have launched reviewers or switched the agent to goal.
assert.equal(launched.length, 0, `BUG #428: reviewers must not launch on a Build session (got ${launched.length})`);
const switchedToGoal = prompts.some((p) => p.agent === "goal");
assert.equal(switchedToGoal, false, "BUG #428: the guard must NOT switch the main agent back to goal");
// Contract + work are preserved so the user can resume by switching back.
assert.ok(guard.store.stateFor("s").contract, "goal state is preserved across the Build excursion");
assert.ok((guard.store.stateFor("s").lastEditSeq || 0) > 0, "edit history is preserved");
});

test("REGRESSION [issue #428] companion: switching back to Goal re-activates and programmatic review resumes", async () => {
// The fix must NOT over-fire — switching back to the goal agent (or running /goal)
// re-activates the session and programmatic reviews run with all prior state intact.
const launched = [];
const prompts = [];
let currentAgent = null;
const client = {
app: { log: async () => {} },
tui: { showToast: async () => {} },
session: {
promptAsync: async ({ path, body }) => {
prompts.push({ id: path.id, agent: body?.agent, body });
for (const part of body?.parts || []) {
if (part.type === "subtask") {
currentAgent = part.agent;
launched.push(part.agent);
}
}
},
messages: async () => ({
data: [
{
info: { role: "assistant", agent: currentAgent },
parts: launched.map((agent) => ({
type: "tool",
tool: "task",
state: { status: "completed", input: { subagent_type: agent }, output: "Reviewed.\nVerdict: PASS" },
})),
},
],
}),
},
};
const guard = __test.createGuard(
{ client },
{ abortGraceMs: 0, reviewPollMs: 2, reviewTimeoutMs: 500, reviewIdleDeferMs: 0 },
{ persistence: noopPersistence, clock: () => 1, syncIdle: true },
);

// Goal → Build (idle is a no-op for reviews) → back to Goal (idle runs reviews).
await guard.hooks["chat.params"]({ sessionID: "s", agent: "goal" }, {});
await guard.hooks["chat.message"]({ sessionID: "s", agent: "goal" }, { parts: [{ type: "text", text: "implement the feature" }] });
await guard.hooks["tool.execute.after"]({ tool: "edit", sessionID: "s", callID: "c", args: {} }, { output: "", title: "", metadata: {} });
await guard.hooks["chat.params"]({ sessionID: "s", agent: "build" }, {});
await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "s" } } });
assert.equal(launched.length, 0, "Build idle: no reviews");

// Switch back to Goal and prompt it; idle now runs the programmatic review cycle.
await guard.hooks["chat.params"]({ sessionID: "s", agent: "goal" }, {});
await guard.hooks["chat.message"]({ sessionID: "s", agent: "goal" }, { parts: [{ type: "text", text: "resume" }] });
assert.equal(guard.store.stateFor("s").active, true, "switching back to Goal re-activates the session");
await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "s" } } });

assert.ok(launched.length >= 5, `programmatic reviews resume after re-activation (got ${launched.length})`);
});

test("auto-continue can be disabled via config", async () => {
const sent = [];
const guard = __test.createGuard(
Expand Down
21 changes: 21 additions & 0 deletions tests/programmatic-review.test.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -236,3 +236,24 @@ test("patch/str_replace edits count as work for programmatic review", async () =
await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "g" } } });
assert.ok(launched.length >= 5, `patch tool must mark work and trigger review (got ${launched.length})`);
});

test("REGRESSION [issue #428]: a deactivated session never launches a reviewer batch (defense-in-depth in launchReviewBatch)", async () => {
// The idle path now blocks this upstream, but launchReviewBatch is the very call
// that pins the batch to the goal primary and would switch the user's main agent.
// A future regression in the eligibility check must NOT be able to reach it for a
// session the user has switched to Build — so launchReviewBatch refuses outright
// when state.active is false, surfacing as a SessionBusy deferral (not a hard
// failure that marks every reviewer failed).
const { guard, launched } = makeReviewingGuard("PASS");
await startGoalWithWork(guard.hooks);
// User switches the active goal session to Build.
await guard.hooks["chat.params"]({ sessionID: "g", agent: "build" }, {});
assert.equal(guard.store.stateFor("g").active, false);
await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "g" } } });
assert.equal(launched.length, 0, "no reviewer subtask may launch on a deactivated (Build) session");
// Re-activate and confirm reviews then run normally — the deferral did not wedge.
await guard.hooks["chat.params"]({ sessionID: "g", agent: "goal", model: MODEL }, {});
await guard.hooks["chat.message"]({ sessionID: "g", agent: "goal", model: MODEL }, { parts: [{ type: "text", text: "resume" }] });
await guard.hooks.event({ event: { type: "session.idle", properties: { sessionID: "g" } } });
assert.ok(launched.length >= 5, `reviews run normally after re-activation (got ${launched.length})`);
});
Loading