Stale agent_snapshot on resume blocks publish-time MCP changes from reaching active sessions #89

## Background

Sessions store a frozen `agent_snapshot` at create-time so the agent's behavior
stays consistent across the session even if the user edits the agent config
mid-conversation. SessionDO reads `agent_snapshot` (`apps/agent/src/runtime/session-do.ts:399-402`)
preferentially over a live DB lookup.
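
A minimal sketch of that snapshot-first read, for readers who have not opened `session-do.ts`. The type names, `resolveAgent`, and `loadAgentFromDb` are assumptions made for illustration; the actual code at the cited lines may be shaped differently.

```typescript
// Hypothetical shapes for illustration only; not the real SessionDO types.
interface AgentConfig {
  id: string;
  mcp_servers: { name: string; url: string }[];
  tools: { type: string; mcp_server_name?: string }[];
}

interface SessionRecord {
  agent_id: string;
  agent_snapshot?: AgentConfig; // frozen at create-time
}

// Prefer the frozen snapshot; fall back to the live config only when the
// session has no snapshot stored.
function resolveAgent(
  session: SessionRecord,
  loadAgentFromDb: (id: string) => AgentConfig,
): AgentConfig {
  return session.agent_snapshot ?? loadAgentFromDb(session.agent_id);
}
```

With this shape, anything added to the live agent config after create-time is invisible to a session that already holds a snapshot.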

For Slack publish (and Linear/GitHub publish), the integrations gateway
augments the snapshot with the integration's MCP server URL + an `mcp_toolset`
declaration via `injectMcpServersIntoSnapshot` (`apps/main/src/routes/internal.ts`).
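
Roughly, the augmentation could look like the sketch below. This is not the body of `injectMcpServersIntoSnapshot`; the function name `injectMcpServerSketch`, the `mcp_server_name` field, and the snapshot shape are assumptions chosen to match the field names used in this issue.

```typescript
// Hypothetical shapes; the real snapshot schema may differ.
interface McpServer { name: string; url: string }
interface Tool { type: string; mcp_server_name?: string }
interface AgentSnapshot { mcp_servers: McpServer[]; tools: Tool[] }

// Returns a new snapshot with the integration's MCP server and a matching
// mcp_toolset declaration added. Entries that already exist are not duplicated.
function injectMcpServerSketch(snapshot: AgentSnapshot, server: McpServer): AgentSnapshot {
  const hasServer = snapshot.mcp_servers.some((s) => s.name === server.name);
  const hasToolset = snapshot.tools.some(
    (t) => t.type === "mcp_toolset" && t.mcp_server_name === server.name,
  );
  return {
    ...snapshot,
    mcp_servers: hasServer ? snapshot.mcp_servers : [...snapshot.mcp_servers, server],
    tools: hasToolset
      ? snapshot.tools
      : [...snapshot.tools, { type: "mcp_toolset", mcp_server_name: server.name }],
  };
}
```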

## Bug

The augmentation runs only on `sessions.create`. When a webhook event arrives
for an existing scope-bound session (`per_channel` for Slack, `per_issue` for
GitHub/Linear), the dispatch path calls `sessions.resume(...)`, which only
appends the event and never touches the snapshot.
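
The create/resume asymmetry can be modeled with a toy in-memory store. `SessionStore`, `dispatch`, and the scope-key format are made up for this sketch and do not mirror the real dispatch code; only the behavior (augment on create, append-only on resume) follows the description above.

```typescript
// Toy model; all names here are hypothetical.
interface Snapshot { mcp_servers: string[] }
interface Session { id: string; agent_snapshot: Snapshot; events: string[] }

class SessionStore {
  private byScope = new Map<string, Session>();
  private nextId = 1;

  // requiredServers stands in for whatever the publish wiring currently injects.
  dispatch(scopeKey: string, event: string, requiredServers: string[]): Session {
    const existing = this.byScope.get(scopeKey);
    if (existing) {
      // Resume path: the event is appended, agent_snapshot is left as-is.
      existing.events.push(event);
      return existing;
    }
    // Create path: the only place the snapshot is augmented.
    const created: Session = {
      id: `s${this.nextId++}`,
      agent_snapshot: { mcp_servers: [...requiredServers] },
      events: [event],
    };
    this.byScope.set(scopeKey, created);
    return created;
  }
}

const store = new SessionStore();
const first = store.dispatch("slack:C", "mention-1", []);         // created before the fix
const second = store.dispatch("slack:C", "mention-2", ["slack"]); // arrives after the fix
console.log(second === first, second.agent_snapshot.mcp_servers.length); // true 0
```

The second dispatch returns the same session with its original, empty server list, which is the stale state described in this issue.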

Consequence: if the publish wiring or the agent's MCP server set changes
after a session is created, **active sessions never see the change**. We hit
this on 2026-05-19 when the missing `mcp_toolset` injection was added: existing
Slack-bound sessions kept telling users to run curl commands until their
`slack_thread_sessions` row was manually flipped to `inactive`.

## Options for fix

1. **Detect drift on resume.** Before `resume`, compare the session's
   `agent_snapshot.mcp_servers` against the publication's currently-required
   set; if drifted, mark the scope row inactive and create a fresh session.
   This loses the running thread context, but no session stays stuck on a
   stale snapshot.

2. **Mid-session snapshot patch.** Add a `sessions.augmentSnapshot()` op that
   writes additive changes (new mcp_toolset, new mcp_servers) without
   replacing the whole snapshot. Preserves thread context. Risk: harder to
   reason about (now the snapshot mutates between turns).

3. **Pre-flight at publish time.** Refuse `oma slack bind` (and equivalents)
   when the agent's `tools[]` doesn't already declare an `mcp_toolset` for the
   integration's server. Removes the silent-divergence class of bugs entirely
   at the cost of more upfront UX. Composes well with option 1.

I lean (3) + (1).
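
A rough sketch of how (3) + (1) might fit together. Everything here is an assumption for discussion: `findDrift`, `decideOnResume`, `assertBindable`, the `mcp_server_name` field, and the choice to compare servers by name and URL. The caller would still need to mark the scope row inactive and create the new session when the decision is `recreate`.

```typescript
// Hypothetical shapes; field names follow this issue, the rest is assumed.
interface McpServer { name: string; url: string }
interface Tool { type: string; mcp_server_name?: string }
interface AgentSnapshot { mcp_servers: McpServer[]; tools: Tool[] }

// Option 1: names of required servers for which the snapshot lacks either
// the server entry (matched by name and url) or the mcp_toolset declaration.
function findDrift(snapshot: AgentSnapshot, required: McpServer[]): string[] {
  return required
    .filter((req) => {
      const hasServer = snapshot.mcp_servers.some(
        (s) => s.name === req.name && s.url === req.url,
      );
      const hasToolset = snapshot.tools.some(
        (t) => t.type === "mcp_toolset" && t.mcp_server_name === req.name,
      );
      return !(hasServer && hasToolset);
    })
    .map((req) => req.name);
}

type ResumeDecision =
  | { action: "resume" }
  | { action: "recreate"; missing: string[] };

// Called before sessions.resume: resume only when nothing has drifted.
function decideOnResume(snapshot: AgentSnapshot, required: McpServer[]): ResumeDecision {
  const missing = findDrift(snapshot, required);
  return missing.length === 0 ? { action: "resume" } : { action: "recreate", missing };
}

// Option 3: pre-flight at bind time. Throws when the agent's tools[] does not
// declare an mcp_toolset for the integration's server.
function assertBindable(agentTools: Tool[], integrationServerName: string): void {
  const declared = agentTools.some(
    (t) => t.type === "mcp_toolset" && t.mcp_server_name === integrationServerName,
  );
  if (!declared) {
    throw new Error(`agent does not declare an mcp_toolset for "${integrationServerName}"`);
  }
}
```

The pre-flight keeps new bindings from starting out divergent, while the resume-time check catches sessions created before a wiring change.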

## Repro

1. Publish agent A to Slack channel C — first @mention creates session S
   with snapshot S0 (lacking mcp_toolset for slack at the time, before
   today's fix).
2. Deploy a fix that adds the toolset at create-time.
3. @mention A in C again — S resumes. S0 still drives behavior.

## Out of scope

- The frozen-snapshot design itself is intentional for in-conversation agent
  edits. This issue is specifically about publish-side wiring drift.
