🤖 feat: emit OpenTelemetry traces/spans for agent activity #3483
Open
dcieslak19973 wants to merge 2 commits into
Add an opt-in OTLP tracing path so mux agent turns can be observed in any OpenTelemetry backend, similar to codex-cli and opencode. New TracingService (built directly on the upstream OTEL SDK; no vendored code) registers a global tracer provider gated by standard OTEL env vars. StreamManager wraps each turn in a mux.stream span and enables the AI SDK's experimental_telemetry so ai.streamText/ai.toolCall spans (gen_ai.* attributes) nest beneath it. Prompts redacted by default (MUX_OTEL_RECORD_IO to opt in). Off unless an OTLP endpoint is configured; disabled by MUX_DISABLE_TELEMETRY / OTEL_SDK_DISABLED. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
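The gating and redaction rules in this commit message can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code in TracingService: the helper names (resolveTracingConfig, toTelemetrySettings) are hypothetical, the case-insensitive match on OTEL_SDK_DISABLED is an assumption, and the returned object assumes the AI SDK's experimental_telemetry option accepts isEnabled, recordInputs, and recordOutputs. Only the environment variable names come from the PR.

```typescript
// Hypothetical sketch; only the env var names are taken from the PR text.
type Env = Record<string, string | undefined>;

interface TracingConfig {
  enabled: boolean; // whether spans should be exported at all
  recordIo: boolean; // whether prompts/outputs may be recorded on spans
}

// Any one of these standard OTEL variables opts in to tracing.
const OPT_IN_VARS = [
  "OTEL_EXPORTER_OTLP_ENDPOINT",
  "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT",
  "OTEL_SERVICE_NAME",
];

// Unset and whitespace-only values are both treated as "not set".
function isSet(value: string | undefined): boolean {
  return value !== undefined && value.trim() !== "";
}

function resolveTracingConfig(env: Env): TracingConfig {
  // Opt-outs always win over opt-in variables.
  // Assumption: OTEL_SDK_DISABLED is matched case-insensitively.
  const sdkDisabled = (env["OTEL_SDK_DISABLED"] ?? "").trim().toLowerCase();
  const optedOut = env["MUX_DISABLE_TELEMETRY"] === "1" || sdkDisabled === "true";
  const optedIn = OPT_IN_VARS.some((name) => isSet(env[name]));
  return {
    enabled: !optedOut && optedIn,
    recordIo: env["MUX_OTEL_RECORD_IO"] === "1",
  };
}

// Builds an object in the assumed shape of the AI SDK's
// experimental_telemetry option; inputs/outputs stay redacted by default.
function toTelemetrySettings(config: TracingConfig) {
  return {
    isEnabled: config.enabled,
    recordInputs: config.recordIo,
    recordOutputs: config.recordIo,
  };
}
```

With an empty environment the sketch reports tracing as disabled, which matches the "off unless an OTLP endpoint is configured" behavior described above.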
Author
@codex review
To use Codex here, create a Codex account and connect to github.
Summary
Adds an opt-in OpenTelemetry tracing path so mux agent activity can be observed in any OTLP-compatible backend (Jaeger, Grafana Tempo, SigNoz, Honeycomb, ...), in the same spirit as codex-cli and opencode. Each agent turn becomes one trace rooted at a mux.stream span, with the AI SDK's built-in telemetry contributing the nested LLM/tool spans (ai.streamText, ai.streamText.doStream, ai.toolCall) carrying standard gen_ai.* attributes.
Background
codex-cli (service.name=codex_cli_rs) and opencode both ship OTEL exporters that emit per-session/per-request spans for observability. mux already has anonymous product telemetry (PostHog) but no distributed tracing for inspecting agent turns. This adds that, using the standard OpenTelemetry SDK rather than re-implementing those projects' code.
Implementation
- TracingService (src/node/services/tracingService.ts): registers a global NodeTracerProvider with an OTLP/HTTP exporter. Opt-in and OFF by default: enabled only when a standard OTEL env var is set (OTEL_EXPORTER_OTLP_ENDPOINT / OTEL_EXPORTER_OTLP_TRACES_ENDPOINT / OTEL_SERVICE_NAME), and never when MUX_DISABLE_TELEMETRY=1 or OTEL_SDK_DISABLED=true. Startup-safe (all setup wrapped in try/catch → no-op fallback). Wired through ServiceContainer (init + shutdown) and mux server via createCoreServices.
- streamManager.ts: each turn opens a mux.stream span carrying mux.workspace.id, mux.workspace.name, mux.agent.mode, mux.thinking_level, gen_ai.request.model. streamText() is invoked inside that span's context (and retried requests reuse it), so the AI SDK's spans nest beneath it. The span is closed in the stream's guaranteed-cleanup path (and the abort-before-start bail path) to avoid leaks.
- Prompts are redacted by default (… log_user_prompt=false); opt in with MUX_OTEL_RECORD_IO=1.
- docs/reference/tracing.mdx (+ nav/redirect) and a cross-link from the existing telemetry page.
Licensing
This is an original implementation built directly on the upstream OpenTelemetry SDK and the AI SDK's experimental_telemetry hook; no code is vendored from codex-cli or opencode, and the span/attribute names targeted are open OpenTelemetry semantic conventions. The new dependencies (@opentelemetry/*) are all Apache-2.0, compatible with mux's AGPL-3.0.
Validation
- Tests in tracingService.test.ts cover the env gating matrix (opt-in, opt-outs, blank values) and the disabled-instance no-op contracts, plus an enabled round-trip asserting our span is the active context (the mechanism that lets AI SDK spans nest).
- Confirmed a single hoisted @opentelemetry/api@1.9.0 instance (shared with ai), so context propagation/nesting holds.
- lint, typecheck (both tsconfigs), check_eager_imports, prettier, and the streamManager suite all pass.
Risks
Touches the streaming hot path in streamManager.ts, but all tracing calls are guarded behind this.tracingService? and become no-ops when tracing is disabled (the default), so behavior is unchanged unless an OTEL endpoint is configured.
Generated with mux • Model: claude-opus-4-8
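The nesting this PR relies on (streamText() running inside the mux.stream span's active context, with the span closed in a guaranteed-cleanup path) can be illustrated without the OTEL SDK. The sketch below is a stand-in, not the PR's code: it uses Node's AsyncLocalStorage, which is assumed to be what the Node OTEL context manager builds on. The SpanRecord type and the withSpan and runTurn helpers are hypothetical names, and the calls are synchronous for brevity.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical, minimal stand-in for a span; not the OpenTelemetry Span type.
interface SpanRecord {
  name: string;
  parent?: string; // name of the span that was active when this one started
  ended: boolean;
}

// Holds the "active span" for the current call chain.
const activeSpan = new AsyncLocalStorage<SpanRecord>();
// Collects spans in the order they end, like a simple in-memory exporter.
const finished: SpanRecord[] = [];

// Runs fn with a new span as the active context. The finally block ends the
// span even if fn throws, mirroring a guaranteed-cleanup path.
function withSpan<T>(name: string, fn: () => T): T {
  const span: SpanRecord = {
    name,
    parent: activeSpan.getStore()?.name,
    ended: false,
  };
  try {
    return activeSpan.run(span, fn);
  } finally {
    span.ended = true;
    finished.push(span);
  }
}

// Stand-in for one agent turn: the inner spans play the role of the spans
// the AI SDK would create while the turn's span is active.
function runTurn(): string {
  return withSpan("mux.stream", () =>
    withSpan("ai.streamText", () => {
      withSpan("ai.toolCall", () => undefined);
      return "done";
    }),
  );
}
```

Because each inner span reads the active span when it starts, the caller never has to pass a parent explicitly. That is the property the "our span is the active context" round-trip test in this PR is described as asserting.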