fix: thinking-block continuity on non-Anthropic backends #24
Open
BenSheridanEdwards wants to merge 2 commits into
DeepSeek's anthropic-compat endpoint 400s with:

  The `content[].thinking` in the thinking mode must be passed back to
  the API.

…when the request body has `thinking: { type: "enabled", ... }` at
the top level but the messages don't carry thinking content blocks.

Background: foreign-backend thinking blocks are invalid against
Anthropic's signing key, so the proxy strips them from messages on
isModelCall. But it left the top-level `thinking` config in place,
creating the contradictory state DeepSeek rejects.

Fix: drop both `thinking` and `context_management` for isModelCall
routes (mirrors what the image-fallback path on PR aattaran#21 already does on
forceAnthropicForImage). Backends like DeepSeek don't honor Anthropic's
extended-thinking config anyway, so dropping it costs nothing and
fixes the 400.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
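
For readers skimming the commit, here is a minimal sketch of the config drop it describes. It is not the proxy's actual code: the `RequestBody` type, the `dropAnthropicOnlyConfig` name, and the model string are hypothetical, and it assumes `isModelCall` is true exactly when the request is routed to a non-Anthropic backend.

```typescript
// Hypothetical request shape: only the fields discussed in this commit are modelled.
type ThinkingConfig = { type: "enabled" | "disabled"; budget_tokens?: number };

type RequestBody = {
  model: string;
  messages: unknown[];
  thinking?: ThinkingConfig;
  context_management?: unknown;
};

// Returns a copy of the body without the Anthropic-only top-level config
// when the call goes to a non-Anthropic backend (assumed meaning of isModelCall).
// The input object is never mutated.
function dropAnthropicOnlyConfig(body: RequestBody, isModelCall: boolean): RequestBody {
  if (!isModelCall) return body;
  const forwarded: RequestBody = { ...body };
  delete forwarded.thinking;
  delete forwarded.context_management;
  return forwarded;
}

// Example request as a client might send it (placeholder values).
const incoming: RequestBody = {
  model: "deepseek-chat",
  messages: [{ role: "user", content: "hi" }],
  thinking: { type: "enabled", budget_tokens: 1024 },
  context_management: {},
};

// Only `model` and `messages` survive on the non-Anthropic route.
const forwardedBody = dropAnthropicOnlyConfig(incoming, true);
```

Copying before deleting keeps the original body available in case the same request later needs to be replayed against Anthropic with its config intact.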
…inuity

Previous attempt dropped only the top-level `thinking` config; the 400 still fires because DeepSeek's check is on `content[].thinking` inside messages: it expects its own prior thinking blocks to be passed back verbatim for conversation continuity.

The original strip was added to clean up foreign-backend blocks on backend switches (commit 70518b6), but it also removes DeepSeek's own blocks in pure-DeepSeek sessions, breaking continuity.

For now: leave thinking blocks in place on isModelCall so DeepSeek can see its own history. We continue to drop the top-level thinking config since non-Anthropic backends don't honor Anthropic's extended-thinking spec consistently.

Backend-switch case (DeepSeek session → Anthropic) is still handled by the Anthropic-side strip (`hadNonAnthropicSession ? stripAllThinkingBlocks : stripUnsignedThinkingBlocks`), which shouldn't regress.

If a future user reports a foreign-block 400 going INTO DeepSeek (e.g. switching mid-session from openrouter to deepseek), we'll need a finer-grained strip that distinguishes block origin.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
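
To make the Anthropic-side strip mentioned in this commit concrete, here is a hedged sketch of how the two strategies could differ. The function names `stripAllThinkingBlocks` and `stripUnsignedThinkingBlocks` and the `hadNonAnthropicSession` flag come from the commit message, but the bodies below are assumptions, as are the `ContentBlock` and `Message` types and the `prepareForAnthropic` wrapper. In particular, "unsigned" is assumed to mean a thinking block with no `signature` field.

```typescript
// Simplified message model (assumption): content is always an array of blocks.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; thinking: string; signature?: string };

type Message = { role: "user" | "assistant"; content: ContentBlock[] };

// Removes every thinking block, signed or not. Intended for sessions that
// touched a non-Anthropic backend, whose signatures Anthropic cannot verify.
function stripAllThinkingBlocks(messages: Message[]): Message[] {
  return messages.map((m) => ({
    ...m,
    content: m.content.filter((b) => b.type !== "thinking"),
  }));
}

// Removes only thinking blocks that carry no signature; signed blocks stay.
function stripUnsignedThinkingBlocks(messages: Message[]): Message[] {
  return messages.map((m) => ({
    ...m,
    content: m.content.filter((b) => b.type !== "thinking" || Boolean(b.signature)),
  }));
}

// Hypothetical wrapper around the ternary quoted in the commit message.
function prepareForAnthropic(messages: Message[], hadNonAnthropicSession: boolean): Message[] {
  const strip = hadNonAnthropicSession ? stripAllThinkingBlocks : stripUnsignedThinkingBlocks;
  return strip(messages);
}
```

Both helpers return new arrays rather than editing messages in place, so the DeepSeek-bound history can keep its thinking blocks while the Anthropic-bound copy is cleaned.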
DeepSeek's anthropic-compat endpoint returns 400 mid-conversation:

  The `content[].thinking` in the thinking mode must be passed back to the API.

…once Claude Code emits a thinking block in any assistant turn. The proxy was both stripping all thinking blocks from messages AND leaving the top-level `thinking: { type: "enabled", ... }` field in the request body, creating a contradictory state DeepSeek rejects.

Two fixes in this PR:

1. Drop top-level `thinking` and `context_management` on non-Anthropic routes. Backends like DeepSeek don't honor Anthropic's extended-thinking spec consistently, and a stale config field is a noisier error than no config at all.
2. Stop stripping thinking blocks on `isModelCall`. The original strip from commit 70518b6 was added to clean up foreign-backend blocks on backend switches, but it's too broad for pure-DeepSeek sessions, where DeepSeek's own prior thinking blocks are valid and required for continuity. Removing the strip unblocks the conversation.

Backend-switch case (e.g. DeepSeek session → Anthropic) is still handled by the Anthropic-side strip (`hadNonAnthropicSession ? stripAllThinkingBlocks : stripUnsignedThinkingBlocks`), which doesn't change in this PR.

Test plan
`deepclaude` (default mode, DeepSeek): multiple turns including tool use, no 400s.

Notes

This bug also affects `feat/cost-statusline` (#23) and any branch that descends from main while keeping the original strip. Both fixes are also present on #23 by virtue of being committed there first; when one PR merges, the other's rebase will recognise the duplicate patches and drop them.