fix(plugin): synchronous shutdown hook for OTel span drain (revert PR #74) #75

Merged: Alezander9 merged 2 commits into main from alex/v4-sync-shutdown-hook on May 17, 2026

@Alezander9 Alezander9 commented May 17, 2026

Summary

Replace PR #74's bus-based await session.idle / server.instance.disposed mechanism (shipped in v0.1.7) with a synchronous shutdown hook invoked directly from src/index.ts's top-level finally. PR #74 turned out to be a no-op in headless bcode run mode — the only mode V4 cloud uses — because the plugin's bus subscriber fiber gets interrupted by Effect scope teardown before it can process those events.

A/B that triggered the rewrite

Health-check on V4 staging traces in BrowserCode-CLOUD Laminar, identical structure pre- and post-v0.1.7:

| Trace | Version | session.llm | ai.streamText | Last LLM landed | Turn parent landed |
| --- | --- | --- | --- | --- | --- |
| a4d10b79 (7.83s) | v0.1.6 | 2 / 2 | 2 / 2 | | ❌ (orphan 00...7029...) |
| 034b1b13 (532s, 52 turns) | v0.1.6 | 52 / 52 | 52 / 52 | | ❌ (orphan 00...adbf...) |
| 241ebf73 (10.9s) | v0.1.7 | 4 / 4 | 4 / 4 | | ❌ (orphan 00...095f...) |

PR #74 did not change the trace shape in any measurable way. The leaf LLM spans were always landing thanks to PR #50's existing forceFlush(3000ms) race. What's been missing all along is the bcode-laminar "turn" parent span that chat.message creates and that's supposed to be ended in session.idle / session.deleted / server.instance.disposed.
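
To make "orphan" concrete: a parent is orphaned when received spans point at a parent ID that was never exported itself. The sketch below is illustrative only. The `SpanRecord` shape, the IDs, and the flat llm-under-turn hierarchy are assumptions, not Laminar's data model.

```typescript
// Hypothetical minimal span record; real exporters carry many more fields.
interface SpanRecord {
  spanId: string
  parentSpanId?: string // undefined for a root span
  name: string
}

// Returns parent IDs that received spans reference but that were never
// received themselves.
function findOrphanParents(received: SpanRecord[]): string[] {
  const receivedIds = new Set(received.map((s) => s.spanId))
  const orphans = new Set<string>()
  received.forEach((span) => {
    if (span.parentSpanId && !receivedIds.has(span.parentSpanId)) {
      orphans.add(span.parentSpanId)
    }
  })
  return Array.from(orphans)
}

// Two leaf LLM spans landed, but the turn span they point at was never
// ended, so it was never exported.
const receivedSpans: SpanRecord[] = [
  { spanId: "llm-1", parentSpanId: "turn-1", name: "session.llm" },
  { spanId: "llm-2", parentSpanId: "turn-1", name: "session.llm" },
]

const orphanParents = findOrphanParents(receivedSpans)
console.log(orphanParents) // the one missing parent, "turn-1"
```

A trace is healthy under this check when `findOrphanParents` returns an empty array.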

Why the bus path is unreliable in headless mode

  1. bcode run is non-interactive — calls client.session.prompt() once, exits.
  2. session.idle is published by session/processor.ts:758 when the prompt completes. But after the prompt returns to run.ts, the top-level Effect resolves and scopes start closing.
  3. The plugin layer depends on the Bus layer; the plugin's subscribeAll fiber (forked via Effect.forkScoped) is interrupted at scope close. The fiber may be interrupted before it dequeues session.idle.
  4. server.instance.disposed is published by the Bus's own InstanceState finalizer at bus/index.ts:60-64 — and then PubSub.shutdown(wildcard) is called immediately in the same Effect.gen. The plugin's subscriber may or may not be alive at that point, and even if it is, the in-flight await sdk.shutdown() may be cut short by its own fiber being interrupted.
  5. session.deleted is never published in bcode run mode at all — the headless command doesn't delete the session.

Whatever PR #74's await would have helped with, the events themselves don't reach the handler reliably. Confirmed empirically: identical trace shape v0.1.6 vs v0.1.7.
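
The race can be modelled without Effect at all. The toy below is a sketch under loud assumptions: `ToyBus`, `interruptSubscriber`, and `runSubscriber` are hypothetical names that stand in for the bus, scope teardown, and the subscriber fiber's next scheduling turn. It shows only that a published event is lost when the subscriber is interrupted before it dequeues.

```typescript
type BusEvent = { type: string }

// Toy model only: a queue plus a subscriber that can be interrupted.
class ToyBus {
  private queue: BusEvent[] = []
  private interrupted = false
  readonly handled: string[] = []

  publish(event: BusEvent): void {
    this.queue.push(event)
  }

  // Stands in for scope teardown interrupting the subscriber fiber.
  interruptSubscriber(): void {
    this.interrupted = true
  }

  // Stands in for the subscriber getting its next scheduling turn.
  runSubscriber(): void {
    if (this.interrupted) return
    for (const event of this.queue.splice(0)) this.handled.push(event.type)
  }
}

// Headless: the event is published, but teardown wins the race.
const headless = new ToyBus()
headless.publish({ type: "session.idle" })
headless.interruptSubscriber()
headless.runSubscriber()

// Long-running: the subscriber runs before any teardown.
const longRunning = new ToyBus()
longRunning.publish({ type: "session.idle" })
longRunning.runSubscriber()

console.log(headless.handled.length, longRunning.handled.length) // 0 1
```

Awaiting inside the handler cannot help in the headless case, because the handler is never entered.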

The fix

Three small surgical changes plus one tiny plugin-side implementation:

1. Add shutdown?: () => void to the plugin SDK (packages/plugin/src/index.ts)

/**
 * Synchronous shutdown hook invoked once per process before
 * `process.exit()`, after the event loop has finished its last task and
 * before the host's OTel span exporter drain. Use this to end any
 * still-open OTel spans your plugin created — async work is not honored
 * here, but ending a span (`span.end()`) is synchronous and the host's
 * `forceFlush` runs right after this hook.
 */
shutdown?: () => void

Additive, no consumer breakage.
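
One consequence of the `() => void` signature is worth spelling out: TypeScript accepts an async function there, but a host that calls the hook without awaiting will not see anything that happens after the first `await`. A minimal sketch, assuming a simplified `Hooks` type with only this field and a hypothetical `runShutdown` host loop:

```typescript
// Simplified: only the field under discussion.
interface Hooks {
  shutdown?: () => void
}

const log: string[] = []

const syncPlugin: Hooks = {
  shutdown: () => {
    log.push("sync: span ended")
  },
}

const asyncPlugin: Hooks = {
  // Type-checks, because any return type is assignable to `void` here,
  // but the host below never awaits the returned promise.
  shutdown: async () => {
    await Promise.resolve()
    log.push("async: span ended")
  },
}

// Hypothetical host loop: call each hook, then snapshot what has happened
// by the time the synchronous pass is over.
function runShutdown(plugins: Hooks[]): string[] {
  for (const plugin of plugins) plugin.shutdown?.()
  return log.slice()
}

const atExit = runShutdown([syncPlugin, asyncPlugin])
console.log(atExit) // only the sync plugin's entry is present
```

Anything a plugin must finish before the flush therefore has to happen before its first `await`.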

2. Expose a module-level shutdown-hook registry (packages/opencode/src/plugin/index.ts)

// Synchronous shutdown hooks invoked from src/index.ts's top-level finally
// before forceFlush. Plugins register here when loaded; runs once per
// process before process.exit(). Module-level intentionally — needs to be
// reachable outside the Effect runtime.
export const pluginShutdownHooks = new Set<() => void>()

After plugins finish loading, the layer registers each plugin's optional shutdown into the set:

for (const hook of hooks) {
  if (hook.shutdown) pluginShutdownHooks.add(hook.shutdown)
}

Also reverts PR #74's Effect.promise(async () => { ... await ... }) dispatch loop back to upstream's Effect.sync + void hook["event"]?.(...) — same surface as anomalyco/opencode again.
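
As a self-contained illustration of the registration step (not the actual layer code), the sketch below shows two properties of the `Set`: plugins without a `shutdown` are skipped, and the same function reference registered twice is stored once. `registerShutdownHooks` and `endSpans` are hypothetical names.

```typescript
type ShutdownHook = () => void

// Simplified: only the field under discussion.
interface Hooks {
  shutdown?: ShutdownHook
}

const pluginShutdownHooks = new Set<ShutdownHook>()

// Hypothetical helper wrapping the registration loop.
function registerShutdownHooks(hooks: Hooks[]): void {
  for (const hook of hooks) {
    if (hook.shutdown) pluginShutdownHooks.add(hook.shutdown)
  }
}

const endSpans: ShutdownHook = () => {}

// One plugin with a hook, one without, and one reusing the same function.
registerShutdownHooks([{ shutdown: endSpans }, {}, { shutdown: endSpans }])

const registeredCount = pluginShutdownHooks.size
console.log(registeredCount) // 1
```

Deduplication is by reference, so two distinct closures with identical bodies would both be registered.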

3. Invoke the hooks from the top-level finally (packages/opencode/src/index.ts)

} finally {
  const { pluginShutdownHooks } = await import("./plugin")
  for (const hook of pluginShutdownHooks) {
    try { hook() }
    catch (err) { Log.Default.error("plugin shutdown hook failed", { error: err }) }
  }
  // existing forceFlush(3000ms) race stays — drains the just-ended spans
  ...
  process.exit()
}

The await import("./plugin") is dynamic to avoid an import cycle (this file already imports ./plugin indirectly via commands; the dynamic form makes the dependency explicit at the use site).
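
The ordering in the `finally` can be sketched in isolation. This is a minimal model under stated assumptions, not the host's code: `runShutdownHooks`, `flushWithTimeout`, and `shutdownSequence` are hypothetical helpers, and `flush` stands in for whatever the tracer provider exposes.

```typescript
// Simplified stand-in for the module-level registry.
const pluginShutdownHooks = new Set<() => void>()
const hookErrors: unknown[] = []
const endedSpans: string[] = []

// Run every hook synchronously; one throwing hook must not block the rest.
function runShutdownHooks(hooks: Set<() => void>, onError: (err: unknown) => void): void {
  hooks.forEach((hook) => {
    try {
      hook()
    } catch (err) {
      onError(err)
    }
  })
}

type FlushOutcome = "flushed" | "failed" | "timeout"

// Race a flush against a timer so a stuck exporter cannot hang the exit path.
function flushWithTimeout(flush: () => Promise<void>, ms: number): Promise<FlushOutcome> {
  return new Promise<FlushOutcome>((resolve) => {
    const timer = setTimeout(() => resolve("timeout"), ms)
    flush().then(
      () => { clearTimeout(timer); resolve("flushed") },
      () => { clearTimeout(timer); resolve("failed") },
    )
  })
}

// Hypothetical sequence: synchronous hooks first, then the bounded flush.
async function shutdownSequence(flush: () => Promise<void>): Promise<FlushOutcome> {
  runShutdownHooks(pluginShutdownHooks, (err) => hookErrors.push(err))
  return flushWithTimeout(flush, 3000)
}

// A failing hook registered first does not stop the second one from running.
pluginShutdownHooks.add(() => { throw new Error("bad plugin") })
pluginShutdownHooks.add(() => { endedSpans.push("turn-1") })
runShutdownHooks(pluginShutdownHooks, (err) => hookErrors.push(err))
```

In a real exit path, `shutdownSequence` would be awaited just before `process.exit()`; the example stops short of exiting so it can be run as is.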

4. Implement shutdown in bcode-laminar (packages/bcode-laminar/src/plugin.ts)

shutdown: () => {
  for (const [sessionId, span] of Object.entries(sessionCurrentTurnSpan)) {
    span.end()
    delete sessionCurrentTurnSpan[sessionId]
  }
},

The existing session.idle / session.deleted / server.instance.disposed event handlers stay in place as defense in depth — they're no-ops once the shutdown hook has cleared the map, but they remain useful in long-running TUI mode where the events do reliably fire.
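
The "no-op once the map is cleared" behaviour can be checked with fake spans. In this sketch `SpanLike`, `makeSpan`, and `endTurnSpan` are hypothetical stand-ins; only `sessionCurrentTurnSpan` and the shape of `shutdown` come from the snippet above.

```typescript
// Minimal stand-in for an OTel span: only end() matters here.
interface SpanLike {
  end(): void
}

const ended: string[] = []
const makeSpan = (id: string): SpanLike => ({
  end: () => {
    ended.push(id)
  },
})

const sessionCurrentTurnSpan: Record<string, SpanLike> = {
  "session-a": makeSpan("turn-a"),
  "session-b": makeSpan("turn-b"),
}

// What an event handler such as session.idle would do for one session.
function endTurnSpan(sessionId: string): void {
  const span = sessionCurrentTurnSpan[sessionId]
  if (!span) return // already ended by shutdown(), or never opened
  span.end()
  delete sessionCurrentTurnSpan[sessionId]
}

// Synchronous shutdown: end every open turn span and clear the map.
function shutdown(): void {
  for (const sessionId of Object.keys(sessionCurrentTurnSpan)) endTurnSpan(sessionId)
}

shutdown()
endTurnSpan("session-a") // a late session.idle finds nothing to end

console.log(ended) // [ 'turn-a', 'turn-b' ]
```

Because each span is removed from the map as it is ended, no span can be ended twice regardless of which path reaches it first.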

Why this works where PR #74 didn't

| Step | What runs | When |
| --- | --- | --- |
| 1 | Plugin synchronous shutdown() calls span.end() on every open turn span | Inside the running finally block, event loop still alive |
| 2 | provider.forceFlush() with 3 s timeout | Same finally, async-awaited, batch processor drains the just-ended turn spans |
| 3 | process.exit() | After both above complete |

No race with Effect scope closure, no bus pubsub, no fiber lifecycle, no reliance on session.idle / session.deleted / server.instance.disposed firing. Just a direct function call from inside the finally block.

Verification plan

  1. Land this PR.
  2. Cut v0.1.8-rc1 pre-release tag from main.
  3. Bump cloud v4-worker/Dockerfile to --version 0.1.8-rc1 (separate cloud PR, staging-only).
  4. Re-run a long-task smoke against staging.
  5. Verify in BrowserCode-CLOUD Laminar that:
    • The "turn" parent span now appears as received (not inferred-from-children).
    • total_cost and total_tokens on the trace aggregate are non-zero (Laminar rolls up from received parents).
  6. If green, cut v0.1.8 proper; if not, iterate.

Yellow-zone accounting

Net delta vs PR #74 is roughly even — we trade the unreliable bus-await yellow edit for a load-bearing sync-shutdown yellow edit. The new code paths are smaller and simpler. Documented in memory/browsercode/EXCEPTIONS.md under "Phase F (cont.) — synchronous plugin shutdown hook".

Upstream-able

Yes — this is a generic capability every OTel-based opencode plugin needs. After we've validated it in V4 staging, this is a strong candidate to upstream to anomalyco/opencode. The Hooks.shutdown field is additive; the pluginShutdownHooks registry is the kind of escape hatch upstream is likely to accept once we describe the race it solves.

Files

 packages/bcode-laminar/src/plugin.ts  | 12 ++++++++++
 packages/opencode/src/index.ts        | 22 +++++++++++++-----
 packages/opencode/src/plugin/index.ts | 42 ++++++++++++++++-------------------
 packages/plugin/src/index.ts          |  9 ++++++++
 4 files changed, 56 insertions(+), 29 deletions(-)

bun run typecheck clean (all 7 packages green).


Summary by cubic

Adds a synchronous plugin shutdown hook to reliably end and export OTel “turn” spans in headless bcode run. Replaces the bus-await path from #74 and hardens plugin dispatch and shutdown safety.

  • Bug Fixes
    • Added shutdown?: () => void to Hooks in packages/plugin, and a module-level pluginShutdownHooks registry in packages/opencode/src/plugin/index.ts.
    • Invoke all registered shutdown hooks from packages/opencode/src/index.ts top-level finally, then run tracer forceFlush() before exit; wrapped the dynamic await import("./plugin") in try/catch.
    • Implemented shutdown in packages/bcode-laminar/src/plugin.ts to span.end() any open turn spans.
    • Reverted event handling to sync fire-and-forget and restored per-hook error isolation so one bad plugin can't kill the subscriber fiber.
    • Deregister shutdown hooks via Effect.addFinalizer to avoid stale closures in multi-instance TUI mode.

Written for commit 4951144.

Revert PR #74's bus-await mechanism (v0.1.7) and replace with a synchronous shutdown hook invoked from src/index.ts's top-level finally before forceFlush. PR #74 was a no-op in headless 'bcode run' mode because the plugin's bus subscriber fiber gets interrupted by Effect scope teardown before it can process session.idle / server.instance.disposed events — confirmed by A/B trace shape comparison between v0.1.6 and v0.1.7 (identical, turn parent span missing in both). The new path is a direct function call from inside the running finally, so the event loop is alive, no scope race, no bus dependency.

@cubic-dev-ai cubic-dev-ai Bot left a comment

3 issues found across 4 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/opencode/src/index.ts">

<violation number="1" location="packages/opencode/src/index.ts:258">
P1: Guard the dynamic plugin import in `finally`; an import failure can abort shutdown cleanup and skip `forceFlush`/`process.exit()`.</violation>
</file>

<file name="packages/opencode/src/plugin/index.ts">

<violation number="1" location="packages/opencode/src/plugin/index.ts:258">
P1: A throwing plugin event handler can now terminate the bus subscription fiber because per-hook error isolation was removed.</violation>

<violation number="2" location="packages/opencode/src/plugin/index.ts:271">
P2: Shutdown hooks are added to a global Set but never deregistered on instance disposal, which can retain stale closures and invoke disposed-instance hooks at exit.</violation>
</file>

Three fixes:

1. (src/index.ts) Wrap the dynamic 'await import(./plugin)' in the top-level finally with try/catch so a module-load failure cannot strand the process before forceFlush + process.exit().

2. (plugin/index.ts) Re-add per-hook error isolation on the bus event dispatch loop. Reverting PR #74's await also accidentally removed this. Catches sync throws and observes async rejections via .catch(log.error) so one bad plugin can't terminate the subscription fiber for the rest of the process.

3. (plugin/index.ts) Deregister this layer's shutdown hooks from the module-level Set via Effect.addFinalizer so multi-instance TUI mode doesn't accumulate stale closures across project reopens.
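
Fixes 2 and 3 can be sketched in plain TypeScript. This is an assumption-laden model rather than the merged code: `dispatch`, `logError`, and `registerLayerHooks` are hypothetical names, and the returned disposer stands in for the finalizer that the real layer registers through Effect.addFinalizer.

```typescript
type BusEvent = { type: string }
type EventHandler = (event: BusEvent) => void | Promise<void>

const errors: string[] = []
const logError = (err: unknown): void => {
  errors.push(err instanceof Error ? err.message : String(err))
}

// Fix 2: fire-and-forget dispatch with per-handler isolation.
function dispatch(handlers: EventHandler[], event: BusEvent): number {
  let invoked = 0
  for (const handler of handlers) {
    try {
      const result = handler(event)
      // Observe async rejections without awaiting them.
      void Promise.resolve(result).catch(logError)
    } catch (err) {
      logError(err) // synchronous throw from this handler only
    }
    invoked++
  }
  return invoked
}

// Fix 3: register hooks and hand back a disposer that removes them again.
const pluginShutdownHooks = new Set<() => void>()
function registerLayerHooks(hooks: Array<() => void>): () => void {
  hooks.forEach((hook) => { pluginShutdownHooks.add(hook) })
  return () => hooks.forEach((hook) => { pluginShutdownHooks.delete(hook) })
}

const seen: string[] = []
const invokedCount = dispatch(
  [
    () => { throw new Error("boom") },
    (event) => { seen.push(event.type) },
  ],
  { type: "session.idle" },
)

const dispose = registerLayerHooks([() => {}])
const sizeWhileLive = pluginShutdownHooks.size
dispose()
const sizeAfterDispose = pluginShutdownHooks.size
```

The throwing handler is logged and the second handler still runs; after `dispose()` the registry holds no hooks from the disposed instance.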
@Alezander9 Alezander9 merged commit 09a7178 into main May 17, 2026
3 checks passed