Skip to content

Stale agent_snapshot on resume blocks publish-time MCP changes from reaching active sessions #89

@hrhrng

Description

@hrhrng

Background

Sessions store a frozen agent_snapshot at create-time so the agent's behavior
stays consistent across the session even if the user edits the agent config
mid-conversation. SessionDO reads agent_snapshot (apps/agent/src/runtime/session-do.ts:399-402)
preferentially over a live DB lookup.

For Slack publish (and Linear/GitHub publish), the integrations gateway
augments the snapshot with the integration's MCP server URL + an mcp_toolset
declaration via injectMcpServersIntoSnapshot (apps/main/src/routes/internal.ts).

Bug

The augmentation runs ONLY on sessions.create. When a webhook event arrives
for an existing scope-bound session (per_channel for Slack, per_issue for
GitHub/Linear), the dispatch path calls sessions.resume(...) which just
appends the event — no snapshot mutation.

Consequence: if the publish wiring or the agent's MCP server set changes
between creates, active sessions never see the change. We hit this
2026-05-19 when the missing mcp_toolset injection was added — existing
slack-bound sessions kept telling users to run curl commands until their
slack_thread_sessions row was manually flipped to inactive.

Options for fix

  1. Detect drift on resume. Before resume, compare the session's
    agent_snapshot.mcp_servers against the publication's currently-required
    set; if drifted, mark the scope row inactive and create a fresh session.
    Loses the running thread context but no model gets stuck.

  2. Mid-session snapshot patch. Add a sessions.augmentSnapshot() op that
    writes additive changes (new mcp_toolset, new mcp_servers) without
    replacing the whole snapshot. Preserves thread context. Risk: harder to
    reason about (now the snapshot mutates between turns).

  3. Pre-flight at publish time. Refuse oma slack bind (and equivalents)
    when the agent's tools[] doesn't already declare an mcp_toolset for the
    integration's server. Removes the silent-divergence class of bugs entirely
    at the cost of more upfront UX. Composes well with docs: add CONTRIBUTING.md #1.

I lean (3) + (1).

Repro

  1. Publish agent A to Slack channel C — first @mention creates session S
    with snapshot S0 (lacking mcp_toolset for slack at the time, before
    today's fix).
  2. Deploy a fix that adds the toolset at create-time.
  3. @mention A in C again — S resumes. S0 still drives behavior.

Out of scope

  • The frozen-snapshot design itself for in-conversation agent edits is fine
    by design. This issue is specifically about publish-side wiring drift.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions