fix: streaming timeout, Anthropic-native fallback, and routing on the provider-architecture branch#84

Merged
samueltuyizere merged 10 commits into
routatic:mainfrom
hungcuong9125:fix/anthropic-streaming-fallback-and-routing
Jun 20, 2026

@hungcuong9125

Contributor

Summary

This PR integrates the streaming/fallback/timeout stabilization work from the original PR #80 onto the new provider-architecture base (Provider Abstraction + Unified Request Model + Routing Policy Engine, PR #82), and adds two fixes surfaced during the post-merge review.

The original PR #80 was opened against the pre-#82 architecture and is now obsolete; it has been closed. This PR replaces it, re-applied cleanly on top of the current main (4fe96a7).

Why a new PR instead of rebasing #80

PR #80 was authored against the pre-architecture-rewrite base. After the provider architecture (#82) was merged, the streaming/fallback code paths moved from the handler/transformer layer into the new provider/router layer (scenarios.go, fallback.go refactor, request transformer split, Anthropic/Zen/Responses/Gemini routing policy engine). A clean re-integration on the current base is more readable than a rebase, and lets the new architecture own the work from the start.

What's included

From PR #80 (re-applied on the new architecture)

  • streaming_timeout_ms for both opencode_go and opencode_zen, with helpers RequestTimeout(model) and StreamingTimeout(model) on the client.
  • HTTP client no longer relies on a shorter global http.Client.Timeout; per-request contexts carry the timeout. The proxy never kills a stream that is actively producing bytes, and the server-level WriteTimeout is set to 0. Each upstream read uses a per-Read deadline via http.ResponseController.SetReadDeadline that is renewed on every successful byte.
  • handleStreaming derives each attempt from the client request context and binds body reads via the new ctxio.NewCtxReadCloser so streaming_timeout_ms aborts mid-stream (returns ErrStreamReadCanceled).
  • ExecuteWithFallback short-circuits on parent ctx.Err() and on context.DeadlineExceeded, and does not record client cancellation as a circuit-breaker failure.
  • Raw Anthropic streaming pauses SSE keepalive writes during passthrough to prevent :keepalive injection into event frames (the 'client disconnected during anthropic stream error="context canceled"' / 'Unexpected EOF' failure mode seen on Claude Code).
  • responseWriter serializes concurrent writes (heartbeat + stream body copy) with a mutex, and exposes a flushWriter wrapper for raw passthrough so events are not buffered in net/http's bufio.Writer.
  • transformTools hardening: skip empty/whitespace names, normalize null/{}/missing schemas, validate type == "object", validate properties is an object, guard schemaObj nil for whitespace null (panic fix from the review of PR #80, "fix: stabilize Anthropic-native streaming, timeout handling, and fallback cancellation").
  • Streaming Scenario Routing (enable_streaming_scenario_routing) documented in CONFIGURATION.md and README.md.
  • Reload messaging in atomic.go reports timeout changes as effective immediately for both timeout_ms and streaming_timeout_ms on both Go and Zen.
  • walkthrough.md documents the integration and timeout fix for future maintainers.

New on this branch (post-#80 review findings)

  • Timeout config overlap fix in loader.go::applyDefaults: when stream_timeout_ms is unset, the loader now inherits streaming_timeout_ms before falling back to timeout_ms. Without this, a user-configured 600s streaming_timeout_ms was silently downgraded to 300s by the idle watchdog (StreamIdleTimeout reads StreamTimeoutMs only). New regression test: TestDefaults_StreamingTimeoutFallback.
  • Anthropic-native routing fix for Go provider: IsAnthropicModel and isAnthropicNativeGo now route minimax-m2.5, minimax-m2.7, minimax-m3, and qwen-plus through /v1/messages directly. These models reject OpenAI-format streaming with tools (400 invalid params, function name or parameters is empty); the previous code only routed qwen3.7-max through the Anthropic-native path on the Go side, so all other Minimax/Qwen variants fell back to the broken Chat Completions branch.
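
The inheritance order in the timeout fix can be sketched as follows. This is a minimal illustration, not the code in loader.go: providerConfig is a hypothetical trimmed-down struct, and the 300000 ms default for timeout_ms is an assumption chosen to match the 300s figure mentioned above.

```go
package main

import "fmt"

// providerConfig is a hypothetical, reduced view of a provider's timeout settings.
type providerConfig struct {
	TimeoutMs          int // timeout_ms
	StreamingTimeoutMs int // streaming_timeout_ms
	StreamTimeoutMs    int // stream_timeout_ms, read by the idle watchdog
}

// defaultTimeoutMs is an assumed default, not taken from the PR.
const defaultTimeoutMs = 300_000

// applyDefaults fills unset values. An unset stream_timeout_ms inherits
// streaming_timeout_ms first and only then falls back to timeout_ms.
func applyDefaults(c *providerConfig) {
	if c.TimeoutMs <= 0 {
		c.TimeoutMs = defaultTimeoutMs
	}
	if c.StreamTimeoutMs <= 0 {
		if c.StreamingTimeoutMs > 0 {
			c.StreamTimeoutMs = c.StreamingTimeoutMs
		} else {
			c.StreamTimeoutMs = c.TimeoutMs
		}
	}
}

func main() {
	cfg := providerConfig{StreamingTimeoutMs: 600_000}
	applyDefaults(&cfg)
	// The idle watchdog value now follows the user's streaming_timeout_ms.
	fmt.Println("stream_timeout_ms:", cfg.StreamTimeoutMs)
}
```

An explicitly configured stream_timeout_ms still wins in this sketch, so the change only affects configurations that left it unset.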

Commits (5)

dfa4c35 fix: integrate PR #80 streaming/fallback fixes with timeout config fix
328646c fix: integrate streaming context reader into new provider architecture and update streaming error handlers
70ebb18 Merge branch 'fix/streaming-fallback-anthropic-raw' into fix/streaming-fallback-new
7dd3ce4 fix: restore schemaObj nil check to prevent panic on whitespace null schema
feb71f0 fix: address PR review — bind streaming read to attempt ctx, guard nil schema

Tests

  • TestRequestTimeout_* and TestStreamingTimeout_* for both Go and Zen providers, including default fallback, configured override, and small-value edge cases.
  • TestExecuteWithFallback_* for cancelled parent context, parent deadline exceeded, per-model timeout, circuit-breaker accounting under cancellation.
  • TestHandleStreaming_* for configurable timeout, client-context cancellation, mid-stream disconnect, per-model timeout fallback, Anthropic raw no-keepalive injection, concurrent response-writer behavior.
  • TestHandleNonStreaming_ParentContextCanceled_No502 and TestHandleNonStreaming_ParentDeadlineExceeded_No502 to prevent the false-502 'all models failed' regression.
  • TestTransformTools_* covering empty names, whitespace names, null/{} schemas, missing type, missing properties, malformed JSON, valid schema preservation, and the whitespace-null panic guard.
  • TestDefaults_StreamingTimeoutFallback for the loader fix.
  • TestNewCtxReader_* and TestNewCtxReadCloser_* for the new ctxio helpers.

All packages pass under go test -count=1 ./....

Validation

go build ./...
go test -count=1 ./...

Manual checks recommended:

  • long Claude Code streaming session with tools=29+, long_context → minimax-m3
  • confirm no 2-minute cutoff
  • confirm no 'Unexpected EOF' and no 'client disconnected during anthropic stream error="context canceled"' for Anthropic-native raw streams
  • confirm normal client disconnect does not produce a false 'all models failed'
  • confirm streaming_timeout_ms: 600000 in config.json actually results in a 600s idle gap (verify via log: 'stream idle timeout reached' fires after ~600s of upstream silence, not 300s)

Files

22 files changed, +2369 / -112:

  • CONFIGURATION.md (+46)
  • README.md (+1)
  • walkthrough.md (new, +46)
  • configs/config.example.json (+6/-2)
  • internal/client/opencode.go (refactored timeout helpers)
  • internal/client/opencode_test.go (+157)
  • internal/config/atomic.go (timeout reload messaging)
  • internal/config/config.go (StreamingTimeoutMs field)
  • internal/config/loader.go (timeout fallback to streaming_timeout_ms)
  • internal/config/loader_test.go (+42)
  • internal/handlers/messages.go (responseWriter mutex, flushWriter, ctxio binding, parent-ctx short-circuit)
  • internal/handlers/messages_test.go (+1106)
  • internal/handlers/streaming.go (+2)
  • internal/provider/opencode_go.go (isAnthropicNativeGo expanded)
  • internal/router/fallback.go (parent-ctx/parent-deadline short-circuit, no circuit-breaker on cancel)
  • internal/router/fallback_test.go (new, +267)
  • internal/transformer/ctxio.go (new, +76)
  • internal/transformer/ctxio_test.go (new, +101)
  • internal/transformer/request.go (transformTools hardening)
  • internal/transformer/request_test.go (+205)
  • internal/transformer/stream.go (heartbeat-paused flag for raw Anthropic)
  • internal/transformer/stream_test.go (+49)

hungcuong9125 and others added 9 commits June 20, 2026 15:58
…l schema

Two warnings from kilo-code-bot:

1. StreamingTimeoutMs only guards request startup; once GetStreamingBody
   returns, the body read was tied to the request context (no timeout),
   so a mid-stream stall could sit forever. Pass the per-model attempt
   context into ProxyStream/ProxyResponsesStream/ProxyGeminiStream and
   the raw Anthropic io.Copy, and wrap the upstream body with a tiny
   ctxio.NewCtxReadCloser so the body Read also respects the deadline.

2. transformTools panicked on valid JSON that unmarshals to a nil map
   (e.g. " null " with decorative whitespace). Treat that case the same
   as a successful parse of "{}" — fall back to the default schema.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…onfig fix

Apply the full feature set, test coverage, and documentation from PR routatic#80
(fix: stabilize Anthropic-native streaming, timeout handling, and
fallback cancellation) onto the new provider-architecture branch
(4fe96a7). Also address issues raised in the post-merge review:

* Add streaming_timeout_ms to OpenCodeGoConfig and OpenCodeZenConfig,
  with HTTP client relying on per-request context timeouts.
* Add RequestTimeout and StreamingTimeout helpers on OpenCodeClient,
  and StreamIdleTimeout for per-byte idle-gap enforcement.
* Expose heartbeat-paused flag during raw Anthropic streaming to
  prevent keepalive injection into SSE frames.
* Bind streaming body reads to the per-attempt ctx via ctxio
  (NewCtxReader / NewCtxReadCloser) so streaming_timeout_ms aborts
  mid-stream, with ErrStreamReadCanceled surfaced.
* Stop fallback chain early on parent ctx cancellation or deadline
  exceeded, and do not record client-cancel as a circuit-breaker
  failure.
* Harden transformTools: skip empty/whitespace names, normalize
  null/empty schemas, validate type==object and properties is an
  object, guard schemaObj nil for whitespace null.
* ApplyDefaults now falls back to StreamingTimeoutMs before
  TimeoutMs when StreamTimeoutMs is unset, so the user's
  streaming_timeout_ms is honored by the idle watchdog.
* Expand IsAnthropicModel and isAnthropicNativeGo to include
  minimax-m2.5/2.7/3 and qwen-plus on the Go provider — these
  models reject OpenAI-format streaming with 400 and must use the
  Anthropic-native /v1/messages branch.
* Document Streaming Scenario Routing in CONFIGURATION.md and
  README.md; add streaming_timeout_ms to config.example.json.
* Reload messaging in atomic.go now reports timeout changes as
  effective immediately.
* Add walkthrough.md documenting the integration and timeout fix.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
- Explicitly discard Write return value in concurrent test to satisfy Go vet
- Remove redundant type assertion on NewCtxReader return value

@samueltuyizere samueltuyizere left a comment

Collaborator

LGTM!

@samueltuyizere samueltuyizere merged commit ded7efd into routatic:main Jun 20, 2026
3 checks passed